Closed hyi closed 2 years ago
This feature works fine except for the ICEES DILI instance. Specifically, when users incorrectly enter a year when there isn't a year variable, the service returns a response, albeit a weird response, instead of an error message. We could correct this by providing a default year of 2021, which is the year of the data pull.
@karafecho I think if a user provides a year and it's a year we don't have data for, returning an empty response is correct. Are you wanting to throw an error whenever they provide a year we don't have? If that's the case, instead of changing how we return values, I think we should add validation at the beginning of each route function call. @hyi
I agree that we should return an empty response when a user provides a year for which we don't have data. However, apart from the asthma datasets, none of the others contain a year field. So, the API has an optional year parameter, but any year a user enters will return an empty response.
I guess I'm thinking more long-term here, but I kind of think we should treat all datasets the same and include a year, whether it is the study period, the year of the data pull, etc.
That said, I should probably point out that the COVID data pulls are quarterly. Each pull is deployed as a separate API instance, so my comment above about adding a year parameter holds, but long-term, we should probably consider these things. For instance, it would be preferred to have all three quarterly pulls of COVID data deployed as a single API instance, with datasets selected by month/year. No action required here, however; just a consideration.
@karafecho Then the PR that addresses this issue should suffice for now since it returns an empty response if a user provides a year we don't have data for, or the data for the year don't have any associations between feature columns. This will make this implementation consistent with existing implementations for other endpoints such as get cohorts. If we feel throwing a validation error would be more appropriate in the future, we would need to change all endpoint implementation that return an empty response so all endpoint returns are consistent. @maximusunc what do you think?
I think we should be consistent with implementations AND with datasets. In the DILI case, this would mean adding a year variable (2021, the year the data were pulled).
So just to clarify, a user can give an optional year parameter, but if they provide anything, we return an empty response. So you suggest we just default it to 2021?
I think there are two ways to address the issue: (1) add a year variable to the DILI dataset (year 2021) or (2) default to 2021 in the API. With (2), this would mean (I think) that if a user enters any year other than 2021, then an empty response is returned.
I think the fact of the year variable is an optional parameter should indicate to the user to try without inputting any year to see what response they will get. And if they don't know the data, it is hard for them to input a valid year that matches the data. The empty response indicating empty query returned also should indicate to the user that the input year does not match the data. I don't think we should add any default in the API level. If the DILI data is indeed pulled in for year 2021, we should have year 2021 in the database. I guess the DILI csv file don't have 2021 year for the data, hence empty year in the database. How critical is this? If needed, I can manually update the sqlite DILI database to add 2021 year to all patient rows. But I think we should treat this as the last resort to do it only when it is critical.
@hyi +100 👍
If you, @hyi and @maximusunc , think that we should not deal with the year parameter for the DILI dataset, then we can consider this issue closed. However, I think this issue and #217 need to be considered moving forward.
@karafecho I am reopening this issue since I believe this PR https://github.com/NCATS-Tangerine/icees-api/pull/218 addresses it. Once @maximusunc approves the PR, I will update asthma and DILI development instances for you to test. You can close this issue after you get a chance to test it and verify it works including empty value handling.
I'm a little confused here. #218 doesn't seem to have anything to do with the optional year parameter. If the user provides a year we don't have data for, we should return an empty 200 response. My understanding is that the DILI data doesn't have any years in it, so anytime a user provides a year, we should return an empty response. In my opinion, this issue should be closed because we've added an optional year parameter to all feature endpoints.
@maximusunc Yes, you are correct that the PR #218 addresses this issue, but since the PR has not been merged and deployed for @karafecho to test, I think it'd be good to keep this issue open for a little while until @karafecho get a chance to test it and then close it if everything works.
Tested asthma and DILI dev instances, which seem to be working perfectly. Issue can be closed.
The optional year filtering should be added to the following five endpoints according to @karafecho: