SuLab / GeneWikiCentral

GeneWiki Organization
MIT License
import clinvar #50

Open andrewsu opened 7 years ago

andrewsu commented 7 years ago

ClinVar is a public resource for information on human genetic variants. The FTP site is here: ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/ and an example variant record is here: https://www.ncbi.nlm.nih.gov/clinvar/variation/182956/

Example fields to import include clinical significance, review status, molecular consequence, variation type, etc.

We need to discuss what filtering, if any, would be done prior to import. For example, do we restrict the import to interpretations with a review status of more than 1 star?
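To make the star question concrete, here is a minimal sketch of such a filter. It is not an agreed import rule. It assumes a tab-delimited export with `ReviewStatus`, `ClinicalSignificance`, `Type` and `VariationID` columns. The status-to-star mapping is my reading of ClinVar's review status levels and should be checked against the current documentation before use. The sample rows and their IDs are made up, and `stars` and `filter_by_stars` are hypothetical helper names.

```python
import csv
import io

# Assumed mapping from ClinVar review status text to star rating.
# Verify against current ClinVar documentation; wording may have changed.
REVIEW_STARS = {
    "practice guideline": 4,
    "reviewed by expert panel": 3,
    "criteria provided, multiple submitters, no conflicts": 2,
    "criteria provided, conflicting interpretations": 1,
    "criteria provided, single submitter": 1,
    "no assertion criteria provided": 0,
    "no assertion provided": 0,
}


def stars(review_status):
    """Return the star rating for a review status; unknown text counts as 0."""
    return REVIEW_STARS.get(review_status.strip().lower(), 0)


def filter_by_stars(rows, min_stars=2):
    """Yield only rows whose review status has at least min_stars stars."""
    for row in rows:
        if stars(row["ReviewStatus"]) >= min_stars:
            yield row


# Made-up sample rows for illustration only (not real ClinVar records).
SAMPLE_TSV = (
    "VariationID\tType\tClinicalSignificance\tReviewStatus\n"
    "example-1\tsingle nucleotide variant\tPathogenic\t"
    "criteria provided, single submitter\n"
    "example-2\tDeletion\tBenign\treviewed by expert panel\n"
    "example-3\tsingle nucleotide variant\tUncertain significance\t"
    "no assertion criteria provided\n"
)

rows = list(csv.DictReader(io.StringIO(SAMPLE_TSV), delimiter="\t"))
# "> 1 star" means a minimum of 2 stars.
kept = list(filter_by_stars(rows, min_stars=2))
print([row["VariationID"] for row in kept])
```

Treating unknown status text as 0 stars keeps the filter conservative: a record is imported only when its status is recognized and meets the threshold.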

Are there any other technical or non-technical issues that we need to work through?

Also note that Invitae has built their company on an open-data ethos and has been a top 5 contributor to ClinVar. I don't see an explicit license, but getting data directly from them could be an alternate route if there are any ClinVar-specific issues (http://clinvitae.invitae.com/). Steve Lincoln at Invitae is a willing and interested contact.

andrewsu commented 7 years ago

cc @andrawaag @malachig @obigriffith -- thoughts welcome and appreciated...

andrawaag commented 6 years ago

We have a (very early) version of a ClinVar data model: https://www.wikidata.org/wiki/User:Variantbot/data_model#Clinvar

andrawaag commented 6 years ago

Is this still an issue? That is, should we give it some attention?

andrewsu commented 6 years ago

I still think this would be a valuable resource to load, but having said that it's not currently on the critical path for anything in my lab. So I added the 'help wanted' label (code for potential intern/rotation project). Let's keep this ticket open...

andrewsu commented 5 years ago

> We have a (very early) version on clinvar. https://www.wikidata.org/wiki/User:Variantbot/data_model#Clinvar

@andrawaag you deleted that clinvar data model section. Any reason why? Just want to determine if it's in the right state for an intern to pick this up...

andrawaag commented 5 years ago

That is a good question. At the time we didn't have the bandwidth to work on that model, and I think I removed it from the new description page because of that. It shouldn't be that difficult to resurrect that model.

andrawaag commented 5 years ago

I restructured the different models and distributed the three over designated tabs.