Open norrishd opened 3 years ago
I think this will be more than just a patch. I think this will lead to breaking changes in the structure of the outputs returned.
I think this could probably lead to a major version update. But, i think this provides a good opportunity for us to rethink a lot of the structure of the API and also the outputs. Especially since we are still technically in pre-alpha
stage.
Some suggestions
API
limit
argument in paginated queries. It is confusing, and could be confused with max_items
.
page_limit
? Something that makes it clear that it is the limit per page, and not the total limit of items retrieved. OUTPUTS
Could I suggest we rethink the column names, they tend to be a little ugly. Some suggestions:
.
when using json_normalize()
for nested values. patient.user.fullName
, replacing it with patient_fullname
Yeah, we would for sure have to go to lengths to reshape the outputs to the current structure. I think this would be possible/manageable (especially if we migrated a few functions at a time), but I'm definitely open to the possibility of deciding that it's worth migrating to the new schema but not trying to preserve the exact current structure.
100% agree with you on doing away with limit
. I'm still unsure if we should even be exposing it, or optimising and setting it ourselves within each function. Or maybe it could be a general param on whatever the resource schema equivalent of get_paginated_response()
ends up being.
I'm on the fence with those suggested changes to the outputs - for anyone who is familiar with the schema, the current approach (using .
s) makes it reliable to match a query up with the resulting dataframe, and means we don't need to have custom renaming in each function. It kind of boils down to this philosophical dilemma I feel is inherent to seer-py
of "is this primarily a Python wrapper to a GQL endpoint, or a REST API on top of that endpoint?". If the answer is the former, IMO there are too many query functions defined as is/not enough organisation to them. If the answer is the latter, I think it doesn't currently abstract away enough, and is missing some obvious functionality, e.g. a simple download_study()
method.
If we have no intention of using seer-py with the exception of the execute_query
method, then sure, there's no rush in migrating to the new resource schema. Otherwise, I would argue that it's something to consider sooner rather than later?
I would agree that this would be coupled with a restructure.
On another note, I think it's also worth considering who these changes would be for, e.g. external or internal users, etc., because the response would differ for each of these suggestions. A good point made by @norrishd re. Python wrapper vs REST API. Perhaps all for a larger discussion?
I don't feel like there's any rush on this, but it's something that would certainly be valuable to do.