seermedical / seer-py

Python SDK for the Seer data platform
MIT License
27 stars 10 forks source link

Migrate across to resources schema #137

Open norrishd opened 3 years ago

norrishd commented 3 years ago

I don't feel like there's any rush on this, but it's something that would certainly be valuable to do.

ronrest commented 3 years ago

I think this will be more than just a patch. I think this will lead to breaking changes in the structure of the outputs returned.

I think this could probably lead to a major version update. But, i think this provides a good opportunity for us to rethink a lot of the structure of the API and also the outputs. Especially since we are still technically in pre-alpha stage.

Some suggestions

API

OUTPUTS

Could I suggest we rethink the column names, they tend to be a little ugly. Some suggestions:

norrishd commented 3 years ago

Yeah, we would for sure have to go to lengths to reshape the outputs to the current structure. I think this would be possible/manageable (especially if we migrated a few functions at a time), but I'm definitely open to the possibility of deciding that it's worth migrating to the new schema but not trying to preserve the exact current structure.

100% agree with you on doing away with limit. I'm still unsure if we should even be exposing it, or optimising and setting it ourselves within each function. Or maybe it could be a general param on whatever the resource schema equivalent of get_paginated_response() ends up being.

I'm on the fence with those suggested changes to the outputs - for anyone who is familiar with the schema, the current approach (using .s) makes it reliable to match a query up with the resulting dataframe, and means we don't need to have custom renaming in each function. It kind of boils down to this philosophical dilemma I feel is inherent to seer-py of "is this primarily a Python wrapper to a GQL endpoint, or a REST API on top of that endpoint?". If the answer is the former, IMO there are too many query functions defined as is/not enough organisation to them. If the answer is the latter, I think it doesn't currently abstract away enough, and is missing some obvious functionality, e.g. a simple download_study() method.

dkeden commented 3 years ago

If we have no intention of using seer-py with the exception of the execute_query method, then sure, there's no rush in migrating to the new resource schema. Otherwise, I would argue that it's something to consider sooner rather than later?

I would agree that this would be coupled with a restructure.

On another note, I think it's also worth considering who these changes would be for, e.g. external or internal users, etc., because the response would differ for each of these suggestions. A good point made by @norrishd re. Python wrapper vs REST API. Perhaps all for a larger discussion?