ucsd-ccbb / VAPr

VAPr: A Python package for NoSQL variant data storage, annotation and prioritization
MIT License

Allow previously queried and stored variants to be shared across projects #19

Closed kmfisch closed 6 years ago

kmfisch commented 6 years ago

Reviewer recommendation: Configure MongoDB to allow previously queried, annotated, and stored variants to be shared between projects, so that new projects containing the same variants do not trigger new ANNOVAR/myvariant.info queries.

AmandaBirmingham commented 6 years ago

I am not convinced this would be good functionality. I understand the goal (to cut down on unnecessary queries), but the assumption that repeated queries to myvariant.info for the same variant would in fact be unnecessary seems to be based on an incomplete understanding of myvariant.info's behavior.

Myvariant.info is updated live, on an ongoing basis. This means that if you query myvariant.info for variant X today, you may very well get a different set of annotations than you got when you queried it for variant X last week. It would be a terrible idea to overwrite the database records holding the results you got last week (as that would render last week's analysis unreproducible), and it would also be a terrible idea to just use old annotations that were current as of last week while thinking you were using the most current ones. It would be possible to add some sort of switch that lets a user explicitly request old local annotations, if they exist, rather than fetching fresh ones, but that would require significant changes to the code base and, in my opinion, would mostly just introduce a way for naive users to get confused and misuse the functionality.
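For illustration only, the kind of opt-in switch described above might look roughly like the following sketch. This is not VAPr code. The function `get_annotation`, the `reuse_local` flag, the `retrieved_at` field, and the plain dict standing in for a MongoDB collection are all hypothetical, and `fetch_remote` is a placeholder for whatever actually queries myvariant.info.

```python
from datetime import datetime, timezone
from typing import Callable, Dict


def get_annotation(
    variant_id: str,
    fetch_remote: Callable[[str], dict],
    local_store: Dict[str, dict],
    reuse_local: bool = False,
) -> dict:
    """Return an annotation record for variant_id (hypothetical helper).

    By default the remote source is always queried, so the caller gets
    current annotations. Only when reuse_local is explicitly True is a
    previously stored record returned instead.
    """
    if reuse_local and variant_id in local_store:
        # Explicit opt-in: possibly stale, but the timestamp says how stale.
        return local_store[variant_id]

    record = {
        "variant_id": variant_id,
        "annotations": fetch_remote(variant_id),
        "retrieved_at": datetime.now(timezone.utc).isoformat(),
    }
    # Never overwrite an existing record, so an earlier analysis that
    # relied on it stays reproducible.
    local_store.setdefault(variant_id, record)
    return record


# Stand-in for a live source whose answer changes between queries.
calls = []

def fake_fetch(variant_id: str) -> dict:
    calls.append(variant_id)
    return {"release": len(calls)}

store: Dict[str, dict] = {}
first = get_annotation("chr1:g.100A>T", fake_fetch, store)
second = get_annotation("chr1:g.100A>T", fake_fetch, store)
cached = get_annotation("chr1:g.100A>T", fake_fetch, store, reuse_local=True)
```

Even in this toy form, the caller has to know whether a result is fresh or reused and has to check `retrieved_at` before trusting a reused record, which is the source of confusion described above.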

I am willing to revisit this suggestion in the future if someone provides a specific use case in which this behavior would solve a problem they are having.