Closed welchr closed 3 years ago
@abought Recent commits also rename catalog_version -> version, for consistency.
Missing: date_inserted would be difficult to include now, since it isn't known for all data sources; it only started being added with the GWAS catalog. It could possibly be added to everything in the future if it were useful. This is deployed on the dev server as well.
nods Backfilling would certainly be painful (though it could probably be done by hand from a list of dataset release timestamps... if you ever decide that I've sassed you one too many times and need to be assigned this task to teach me a lesson)
Other options would include making the field nullable, or a "start the better process going forward" plan: on some services we've gotten by with synthetic data, like using the current timestamp as the default value during DB migration (newer datasets would then receive newer timestamps in the future). Insofar as date inserted != date released anyway, one more fudge factor isn't the worst thing ever. 😛
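For illustration, the "current timestamp as the default during migration" idea might look roughly like the sketch below. It uses an in-memory SQLite database purely so that it runs anywhere; the table name `dataset` and its columns are hypothetical stand-ins, not the actual LocusZoom schema.

```python
import sqlite3
from datetime import datetime, timezone


def utc_now():
    # ISO 8601 UTC timestamp with fixed precision, so the strings sort chronologically
    return datetime.now(timezone.utc).isoformat(timespec="microseconds")


conn = sqlite3.connect(":memory:")

# Hypothetical pre-migration table: no insertion date was ever recorded
conn.execute("CREATE TABLE dataset (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO dataset (name) VALUES (?)", [("old_a",), ("old_b",)])

# Migration step 1: add the column as nullable, so existing rows stay valid
conn.execute("ALTER TABLE dataset ADD COLUMN date_inserted TEXT")

# Migration step 2: backfill existing rows with a synthetic value (the migration time)
migration_time = utc_now()
conn.execute(
    "UPDATE dataset SET date_inserted = ? WHERE date_inserted IS NULL",
    (migration_time,),
)

# Datasets loaded after the migration record their real insertion time
conn.execute(
    "INSERT INTO dataset (name, date_inserted) VALUES (?, ?)",
    ("new_c", utc_now()),
)

rows = conn.execute("SELECT name, date_inserted FROM dataset ORDER BY id").fetchall()
```

The older rows all share one synthetic timestamp, which is the "fudge factor" mentioned above; only rows loaded after the migration carry a meaningful value.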
By no means is this field mandatory- certainly for LocusZoom purposes, no human will ever see this field directly! It's a little bit of polish that wouldn't affect a release in my eyes, at all. If we do plan to add the field in the future, my only target would be to use consistent meta nomenclature across all endpoints. (it just makes the API nicer to use)
@abought How does deploying to production tomorrow (2/24) @ 8 PM EST sound? Friday evening or Sunday afternoon would also work well for me.
Tell me more about how days of the week are different. It sounds like a fascinating concept.
Any of those deploy times sound good! AFAIK, the migration should be fairly small and quick. Let me know if there are prep or standby activities that I can do to help it go smoothly.
Let's go with 2/24 @ 8 PM then. Should be pretty quick... 🤞
Synopsis
Gene, recombination rate, and GWAS catalog endpoints can be queried without supplying an id (or a source, in the case of genes). LD support will have to come in a future update to LDServer.
Instead, you can supply a build query parameter, and the API server will select the recommended dataset for that given build.
Currently this is done with a DB view that tracks the recommended id per build and “dataset”:
The view is automatically updated as new datasets are loaded.
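As a rough sketch of how such a view could behave, the example below uses SQLite from Python. The table and view names (`dataset_meta`, `recommended`) are hypothetical, and "recommended" is assumed here to mean the highest id loaded for each dataset and build; the real rule lives in the locuszoom-db repository and may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical metadata table: one row per loaded dataset
    CREATE TABLE dataset_meta (
        id      INTEGER PRIMARY KEY,
        dataset TEXT NOT NULL,   -- e.g. 'gene', 'recomb', 'gwascatalog'
        build   TEXT NOT NULL    -- e.g. 'GRCh37', 'GRCh38'
    );

    -- Assumption: the recommended dataset is the most recently loaded (highest) id
    CREATE VIEW recommended AS
    SELECT dataset, build, MAX(id) AS id
    FROM dataset_meta
    GROUP BY dataset, build;
    """
)


def recommended_id(conn, dataset, build):
    """Return the recommended id for a dataset type and build, or None if none is loaded."""
    row = conn.execute(
        "SELECT id FROM recommended WHERE dataset = ? AND build = ?",
        (dataset, build),
    ).fetchone()
    return row[0] if row else None


conn.executemany(
    "INSERT INTO dataset_meta (id, dataset, build) VALUES (?, ?, ?)",
    [(1, "gene", "GRCh37"), (2, "gene", "GRCh38"), (3, "recomb", "GRCh37")],
)
before = recommended_id(conn, "gene", "GRCh37")

# Loading a newer dataset changes the view's answer with no extra bookkeeping
conn.execute("INSERT INTO dataset_meta (id, dataset, build) VALUES (4, 'gene', 'GRCh37')")
after = recommended_id(conn, "gene", "GRCh37")
```

Because a plain view is evaluated at query time, nothing needs to be refreshed when a new dataset is loaded.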
The only notable difference in response for each query is a meta section:
This way, the client will know which dataset was selected by the API server and have information about it readily available, without having to execute a separate metadata query.
The meta section holds a list because it is now always returned, not just in the “use recommended ID” case. In the original use cases, the user could supply a filter like ‘id in 23,24,25’, so a single query could return data from multiple datasets.
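To make the list shape concrete, here is a minimal sketch of how a response might attach metadata for every dataset that contributed rows. The key names (`data`, `meta`, `datasets`) and the example field values are assumptions for illustration only; the real layout is described in the API docs.

```python
def build_response(rows, datasets_by_id):
    """Attach metadata for each dataset id that appears in the returned rows."""
    used_ids = sorted({row["id"] for row in rows})
    return {
        "data": rows,
        # Always a list, even when only one dataset (e.g. the recommended one) was used
        "meta": {"datasets": [datasets_by_id[i] for i in used_ids]},
    }


# Hypothetical dataset metadata, keyed by dataset id
datasets_by_id = {
    23: {"id": 23, "build": "GRCh37", "version": "v1"},
    24: {"id": 24, "build": "GRCh37", "version": "v2"},
    25: {"id": 25, "build": "GRCh38", "version": "v2"},
}

# A filter such as 'id in 23,24,25' may match rows from only some of those datasets
rows = [
    {"id": 23, "position": 1000},
    {"id": 25, "position": 2000},
    {"id": 23, "position": 3000},
]

response = build_response(rows, datasets_by_id)
single = build_response([{"id": 24, "position": 500}], datasets_by_id)
```

Returning a list in both cases means a client can handle the recommended-dataset query and the multi-id filter with the same code path.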
Example queries
Only deployed on dev/staging currently.
Recombination
Genes
GWAS Catalog
Docs
Updated here: https://github.com/statgen/locuszoom-api/blob/feature/recommended-ids/docs/api.md
Deployment notes
Per Ryan, "DB changes are tracked in another repository": https://github.com/statgen/locuszoom-db/tree/feature/recommended-ids