chop-dbhi / varify

Clinical DNA Sequencing Analysis and Data Warehouse
BSD 2-Clause "Simplified" License

Integrate SolveBio Clinvar data in variant details view #39

Closed mitalia closed 10 years ago

mitalia commented 10 years ago

We need to implement a beta SolveBio integration using ClinVar as the demo data set. When a variant details window loads, we should issue a call to SolveBio as follows:

The specific call should be solvebio.data.ClinVar.variant_summary.select(...)

The query should try the following identifiers in order until a record is found:

1) dbSNP ID (the rs number; note that we need to strip the "rs" before querying)
2) hgvs_c (note that on the Varify side this string likely needs to be constructed from the isoform and our hgvs_c notation)
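A minimal sketch of that fallback order, for illustration only. The `lookup` callable is a hypothetical stand-in for the SolveBio ClinVar call (its real arguments are not spelled out here), and the filter names `"rsid"` and `"hgvs_c"` are assumptions rather than confirmed field names.

```python
from typing import Callable, Mapping, Optional

# Hypothetical stand-in for the SolveBio ClinVar query: takes a filter field
# and a value, returns the first matching record or None.
Lookup = Callable[[str, str], Optional[Mapping[str, object]]]


def strip_rs_prefix(dbsnp_id: str) -> str:
    """Return the numeric part of a dbSNP ID, e.g. 'rs123' -> '123'."""
    value = dbsnp_id.strip()
    if value.lower().startswith("rs"):
        return value[2:]
    return value


def find_clinvar_record(
    lookup: Lookup,
    dbsnp_id: Optional[str],
    hgvs_c: Optional[str],
) -> Optional[Mapping[str, object]]:
    """Try the dbSNP ID first, then hgvs_c, stopping at the first hit."""
    if dbsnp_id:
        # "rsid" is an assumed filter name, not a confirmed ClinVar field.
        record = lookup("rsid", strip_rs_prefix(dbsnp_id))
        if record is not None:
            return record
    if hgvs_c:
        # The caller is expected to have built hgvs_c from the isoform already.
        return lookup("hgvs_c", hgvs_c)
    return None
```

Passing the lookup in as a parameter keeps the ordering logic testable without a SolveBio connection.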

If there is a "hit" we should display: clinicalsignificance, guidelines, lastevaluated, numbersubmitters, origin, reviewstatus, type, rcvaccession

The significance should be visually highlighted. The rcvaccession should be rendered as a url link such as https://www.ncbi.nlm.nih.gov/clinvar/RCV000047050/

That link is also a good example of how clinvar renders this data. Much of what they have is redundant with what is in Varify, so I'm basically taking the additional elements that we don't have.
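The display rules above (which fields to show, and the RCV accession rendered as a link) could be sketched roughly as follows. The record is assumed to be a plain dict keyed by the field names listed above; `build_display` and `rcvaccession_url` are hypothetical names used only for this example.

```python
from typing import Mapping, Optional

# Fields to show on a ClinVar hit, as listed in this issue.
DISPLAY_FIELDS = (
    "clinicalsignificance",
    "guidelines",
    "lastevaluated",
    "numbersubmitters",
    "origin",
    "reviewstatus",
    "type",
    "rcvaccession",
)

# URL pattern taken from the example link in this issue.
CLINVAR_URL = "https://www.ncbi.nlm.nih.gov/clinvar/{accession}/"


def clinvar_url(rcv_accession: str) -> str:
    """Build the ClinVar page URL for an RCV accession."""
    return CLINVAR_URL.format(accession=rcv_accession.strip())


def build_display(record: Mapping[str, object]) -> dict:
    """Keep only the fields to display and add a link for the accession."""
    display: dict = {field: record.get(field) for field in DISPLAY_FIELDS}
    accession: Optional[object] = display["rcvaccession"]
    # Hypothetical extra key so the template can render the accession as a link.
    display["rcvaccession_url"] = clinvar_url(str(accession)) if accession else None
    return display
```

Highlighting the significance is left to the template; this only prepares the data.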

mitalia commented 10 years ago

I should add that eventually it would be nice to have a gene-level summary that lets people see all the clinvar variants for a gene, but until we have a coherent way to build up what we display in the details view, I'm loath to clutter it too much.

mitalia commented 10 years ago

I'm moving this into our March cleanup release in the hopes we can demonstrate some progress on this. @davecap do you think we could roll out a clinvar prototype in the next few weeks to demo some integration with your stuff?

davecap commented 10 years ago

Yep definitely. We should have something for you to try in the next few days actually.

mitalia commented 10 years ago

I believe clinvar does not require genomic position (though if you guys were magically mapping what they do provide to genomic position reliably that would be sweet!). In my experience, the clinvar summary data is best queried by the HGVS mutation nomenclature. That's what we would use to find exact matches.

That said, I'd also like to provide a gene-level listing of other known clinvar variants for the gene, in the event our HGVS is slightly off (very likely given how hard it is to get HGVS right). So people can see other clinically relevant data. I'm picturing two implementation details:

1) Just displaying clinvar variants in the variant details page
2) Maintaining a local flag on our variant table that indicates which variants are in clinvar, for rapid query and filtration purposes (this is a longer-term item, after we first get them rendering properly in the UI)

penningtonj commented 10 years ago

Our analysts currently roll just about everything up to the gene level when determining association with phenotype. Given this, it would be good to start with 'other variants (and hopefully reported phenotypes) reported on this gene', then get more specific over time. We haven't nailed down our HGVS nomenclature yet, so linking based on that is fragile.

mitalia commented 10 years ago

@penningtonj with clinvar though, you really need to know whether or not you're looking at the exact same variant that has been called pathogenic. Providing all clinvar variants for the gene is OK, but not at the expense of having per-variant annotation which is why I want to have both out of the gate. It's not really any harder to do both. The queries are simple. Keep in mind that clinvar is likely to be full of "VUS" records which will make things a bit noisy when looking at the gene level.

I recognize our HGVS nomenclature isn't perfect, but given that it's 99% accurate for single nucleotide changes, I suspect a high percentage of clinvar variants will map cleanly with that approach.
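To illustrate doing both at once, here is a rough sketch that splits a gene's clinvar records into exact hgvs_c matches and everything else, pushing VUS records to the end of the gene-level list to cut the noise. It assumes each record is a dict with `hgvs_c` and `clinicalsignificance` keys and that VUS records carry the string "Uncertain significance"; both are assumptions about the data, and the function names are made up for the example.

```python
from typing import List, Mapping, Sequence, Tuple

Record = Mapping[str, object]


def is_vus(record: Record) -> bool:
    """True if the record looks like a variant of uncertain significance."""
    # Assumes VUS records are labelled "Uncertain significance".
    value = str(record.get("clinicalsignificance") or "")
    return value.strip().lower() == "uncertain significance"


def split_gene_records(
    records: Sequence[Record], hgvs_c: str
) -> Tuple[List[Record], List[Record]]:
    """Return (exact matches for this variant, other variants on the gene)."""
    exact = [r for r in records if r.get("hgvs_c") == hgvs_c]
    others = [r for r in records if r.get("hgvs_c") != hgvs_c]
    # Stable sort: non-VUS records keep their order and come first.
    others.sort(key=is_vus)
    return exact, others
```

The per-variant annotation comes from the first list; the second list feeds the gene-level view.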

davecap commented 10 years ago

We've got basic ClinVar integration into the app via VariantResource. The query filters on chromosomal location, HGVS_c values, as well as the unique gene symbol list. I may deploy a sandbox node on AWS with our integration if you think that might be useful.

For now, here are some screenshots of what it looks like:

Results screen

[screenshot: 2014-03-20, 11:27:00 AM]

Variant details with ClinVar

[screenshot: 2014-03-20, 11:43:51 AM]

Variant details without ClinVar

[screenshot: 2014-03-20, 11:44:00 AM]

mitalia commented 10 years ago

@davecap this is great stuff! I don't know that you need to deploy on an AWS sandbox. This is better than not having ClinVar, which is our current state. If you submit a pull request, we can take a look under the hood and sanity check the approach used and then if everything looks good, fold into the app.

As an aside, we really need to redesign that Variant details UI. David is probably crying when he looks at it.

davecap commented 10 years ago

I can definitely submit a pull request. I was just having some trouble with the static assets... do you typically run "make clean" before committing? What do you do with the minified assets?

naegelyd commented 10 years ago

The static assets are in a bit of flux right now as I am trying to get them off coffeescript and into pure JS. You will definitely want to rebase against master since the result details modal has changed a bit since the format shown in the screenshots. We commit the minified assets with the other changes. If you run make clean && make before committing you should be good to go. Sorry about the state of the static folder. Like I said, it is slowly undergoing a major shift and is currently in a bit of a mess.

davecap commented 10 years ago

No problem, thanks for letting me know!

davecap commented 10 years ago

Not quite ready for a pull request yet, still need to work out a couple of issues. Here's a commit with all the changes so far (up to date with your master branch): https://github.com/solvebio/varify/commit/2880cec0a1573f0759ef1a39b5f422a7c35f87f8

naegelyd commented 10 years ago

@davecap I'm guessing you already saw the Coffee to JS PR get merged but I figured I'd let you know. You should be able to rebase your branch on master now and port your changes over to JS.

naegelyd commented 10 years ago

Fixed in https://github.com/cbmi/varify/pull/193.