Closed sgagliano closed 6 years ago
I could see a use case for this. The API docs suggest that we do have rsID information in our database, though I can't seem to find it in the response payloads for any given SNP...
That said, our API server isn't the only way to get data into LZjs: you can use your own custom datasource if you have your own project using LZjs.
I think the basic tooltip link and maybe some visual convenience would be feasible to add by using existing features. Let me know how I could help?
We would need to implement the GWAS catalog API and database tables, which is a small amount of effort since the data is so small.
If I recall correctly, the reason we needed to mirror this was because the EBI GWAS catalog API did not support region queries, which we need for LZ.
Sarah notes they would like to have this for their paper, but it sounds like the submission date is still up in the air. This of course has to take a back seat to the portal priorities.
Hey Ryan,
I really would love to see these annotations enabled and I would be happy to bump them above some of the portal priorities. :)
I am not sure if we should implement an API direct to the GWAS catalog, since their version of the data often has a lot of junk mixed in. It may be easier for us to create our own hits table and that would let us (for example) annotate UK Biobank peaks or some other interesting results of our choosing.
Goncalo
In reply to Ryan Welch's comment of Mon, Feb 26, 2018, 2:10 PM (https://github.com/statgen/locuszoom/issues/124#issuecomment-368613015).
Spoke to Sarah a bit more, and it seems there are two approaches.
The "easy" version of this feature would be for our API to return rsID alongside each variant's other data (chrom/pos/ref/alt), or "null" if the variant had no rsID. This would answer the question "are any of these SNPs known in the catalog at all" and would be enough for us to mark "catalog variants" in the plot and put links in tooltips.
The more labor-intensive version would be to link the catalog to specific traits ("show me SNPs in this region that have been associated with Alzheimer's").
It's possible this could be done in two stages, depending on just how much a complexity jump is involved for the second version.
Can we wireframe this? We might be thinking about different things. This is what I imagine:
[image: Inline image 1]
In my mind, we would like to have a table of hits that can be queried by (e.g.) position and returns chr, position, label, and URI.
Then, each matching SNP in the region gets an arrow and the corresponding label. The labels could work just like the PheWAS labels.
Goncalo
In reply to Andy Boughton's comment of Mon, Feb 26, 2018, 2:45 PM (https://github.com/statgen/locuszoom/issues/124#issuecomment-368625744).
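To make the "table of hits" idea concrete, here is a minimal sketch of a region query that returns chr, position, label, and URI for every hit inside a window. It uses an in-memory list in place of a database table. The `Hit` and `HitsTable` names, the example rows, and the `example.org` URIs are all made up for illustration and do not reflect the actual API or schema.

```python
from bisect import bisect_left, bisect_right
from typing import NamedTuple


class Hit(NamedTuple):
    chrom: str
    pos: int
    label: str
    uri: str


class HitsTable:
    """In-memory stand-in for a hits table (hypothetical, not the real schema)."""

    def __init__(self, hits):
        # Group hits by chromosome and keep each group sorted by position.
        self._by_chrom = {}
        for hit in sorted(hits, key=lambda h: (h.chrom, h.pos)):
            self._by_chrom.setdefault(hit.chrom, []).append(hit)
        # Parallel lists of positions so that bisect can search them.
        self._positions = {
            chrom: [h.pos for h in rows] for chrom, rows in self._by_chrom.items()
        }

    def query_region(self, chrom, start, end):
        """Return all hits on `chrom` with start <= pos <= end."""
        positions = self._positions.get(chrom, [])
        lo = bisect_left(positions, start)
        hi = bisect_right(positions, end)
        return self._by_chrom.get(chrom, [])[lo:hi]


# Made-up rows, for illustration only.
EXAMPLE_HITS = [
    Hit("10", 150, "Trait A", "https://example.org/hit/1"),
    Hit("10", 500, "Trait B", "https://example.org/hit/2"),
    Hit("11", 150, "Trait C", "https://example.org/hit/3"),
]

table = HitsTable(EXAMPLE_HITS)
region_hits = table.query_region("10", 100, 200)
```

Each row returned for the plotted region could then be drawn as an arrow plus label on the matching SNP, with the URI used for the tooltip link.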
rsID does not determine whether the variant is in the EBI GWAS catalog, only whether the variant is known by dbSNP. On a side note: we do have rsIDs from dbSNP available via /annotation/snps/ and /annotation/snps/results/.
Completely agree that we do not want to have a passthrough API to their own, as their catalog data requires cleaning and tidying before use. It actually isn't even an option, since they don't support region queries.
Overall, it sounds like this feature just follows the typical cycle:
/annotation/gwas?
The GWAS Catalog is on build 38, so we need to keep that in mind when matching by chr:pos. Since Gonçalo mentioned returning the URI, I wanted to point out that EFO-mapped trait annotations are in v1.0.1 (but not v1.0).
Ryan, Andy- let me know how I can help with this.
That is a good point. Luckily we have data for most endpoints in both GRCh37 and 38, so we could start with 38 and map back to 37 later.
We should match on chr:pos_ref/alt I'm guessing, but I can't recall off the top of my head if EBI GWAS catalog provides REF/ALT alleles, or just effect/non-effect alleles... (updated comment above)
I believe the Catalog provides the risk/effect allele. So matching by ref & alt alleles won't be straightforward.
@sgagliano I think Goncalo wanted UKBB GWAS hits as well. Do you have those?
@abecasis (or anyone) - How likely is it that someone would want to see GWAS hits from previous catalogs? Or would they always just want the latest catalog? I need to know whether to support the same catalog over multiple revisions or just store the latest.
@abecasis Regarding UKBB hits, should we use HRC-imputed UKBB GWAS results? Presumably, not yet TOPMed-imputed?
My comments on this thread:
I think we should imagine a table that is flexible and simple enough to maintain.
I am not sure if we would need old GWAS catalog iterations, but we certainly would like multiple sources of results (UKBB vs GWASdb for example) and perhaps those multiple versions would be managed the same way.
I wonder if matching by build:chr:position suffices for our purpose, which is to show nearby relevant hits.
Ultimately, I think it would be nice to have a tag that says something like "UKBB: Height" or "GWASdb: Foot-length" and a tooltip that shows some more details including (if we can) a link to source.
Goncalo
In reply to sgagliano's comment of Thu, Mar 15, 2018, 1:05 PM (https://github.com/statgen/locuszoom/issues/124#issuecomment-373450989).
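As a rough illustration of matching by build:chr:position and producing tags such as "UKBB: Height", here is a minimal sketch. It ignores alleles entirely, which is the simplification being proposed. The field names (`build`, `chrom`, `pos`, `source`, `trait`) and the example rows are assumptions for illustration, not the real table layout.

```python
from collections import defaultdict


def position_key(build, chrom, pos):
    """Build a build:chr:position key; alleles are deliberately left out."""
    return f"{build}:{chrom}:{pos}"


def index_hits(hits):
    """Group hit records by their build:chr:position key."""
    index = defaultdict(list)
    for hit in hits:
        index[position_key(hit["build"], hit["chrom"], hit["pos"])].append(hit)
    return index


def tags_for_variant(index, build, chrom, pos):
    """Return 'Source: Trait' tags for every hit at the same position."""
    matches = index.get(position_key(build, chrom, pos), [])
    return [f'{hit["source"]}: {hit["trait"]}' for hit in matches]


# Made-up rows; sources and traits echo the examples in the comment above.
example_hits = [
    {"build": "GRCh37", "chrom": "1", "pos": 1000, "source": "UKBB", "trait": "Height"},
    {"build": "GRCh37", "chrom": "1", "pos": 1000, "source": "GWASdb", "trait": "Foot-length"},
    {"build": "GRCh37", "chrom": "2", "pos": 2000, "source": "UKBB", "trait": "Height"},
]

hit_index = index_hits(example_hits)
tags = tags_for_variant(hit_index, "GRCh37", "1", 1000)
```

Including the build in the key keeps GRCh37 and GRCh38 coordinates from being matched against each other, and storing the source on each row lets several result sets (or several versions of one) share the same table.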
This is mostly done for the EBI catalog now. Data is in the database and there is an API endpoint for it, documented below (but you will need to replace api with api_internal_dev for now, as it's not deployed to production yet).
http://portaldev.sph.umich.edu/docs/api/v1/#gwas-catalogs
Take a look and see if this fits your needs and let me know if any modifications are needed. If it looks fine, I'll deploy it. Please do not use the dev endpoint in production.
The UKBB hits are still being parsed: the data needed some extra steps that had already been done for the EBI catalog.
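For anyone testing before deployment, the api to api_internal_dev substitution mentioned above could be done with a small helper like the one below. This is only a sketch: the function name is made up, and the path in the example is a placeholder rather than the documented endpoint, so check the linked docs for the real route.

```python
from urllib.parse import urlsplit, urlunsplit


def to_dev_endpoint(url):
    """Swap the `api` path segment for `api_internal_dev`, leaving the rest intact."""
    parts = urlsplit(url)
    segments = [
        "api_internal_dev" if segment == "api" else segment
        for segment in parts.path.split("/")
    ]
    return urlunsplit(parts._replace(path="/".join(segments)))


# Placeholder path for illustration; not the documented endpoint.
dev_url = to_dev_endpoint("http://portaldev.sph.umich.edu/api/v1/some/endpoint/?x=1")
```

Matching whole path segments, rather than doing a plain string replace, avoids rewriting other parts of the URL that happen to contain "api".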
Thank you Ryan! @pjvandehaar - could you please have a look to see if this can be integrated into the LZ in PheWeb or if modifications are needed.
As mentioned to Sarah, I'm available for any questions on new visualization options in LZ.js as well. (depending on how you want to display this info) It never hurts to get features used in the wild and find ways to improve. :)
Forgot to mention last week - UKBB hits are available now too, for both GRCh37 and 38.
Thank you Ryan! Just to clarify, the UKBB "hits" are defined as variants that reached genome-wide significance (p<5E-8) in the HRC-imputed analysis for any of the 1400 phecodes?
Exactly. If there's a more stringent threshold to use given 1400 traits were tested, let me know and I can filter down further.
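A minimal sketch of the filtering described here: keep variants with p < 5E-8, with an optional stricter cutoff. The Bonferroni-style division by 1400 traits is just one possible choice and is an assumption on my part, not something settled in this thread. The `pvalue` field name and the example rows are also made up.

```python
GENOME_WIDE_P = 5e-8  # genome-wide significance threshold used for the UKBB hits


def significant_hits(results, threshold=GENOME_WIDE_P):
    """Keep only results whose p-value is strictly below the threshold."""
    return [row for row in results if row["pvalue"] < threshold]


def bonferroni_threshold(base=GENOME_WIDE_P, n_traits=1400):
    """One possible stricter cutoff: divide the base threshold by the trait count."""
    return base / n_traits


# Made-up p-values, for illustration only.
example_results = [
    {"variant": "v1", "pvalue": 1e-9},
    {"variant": "v2", "pvalue": 4e-8},
    {"variant": "v3", "pvalue": 1e-12},
    {"variant": "v4", "pvalue": 1e-6},
]

genome_wide = significant_hits(example_results)
stricter = significant_hits(example_results, bonferroni_threshold())
```

With 1400 traits the divided threshold is roughly 3.6E-11, so it would prune the hits table considerably. Whether that is too conservative for correlated phecodes is a separate question.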
Following up: @welchr , is this ticket still active based on the API work? If not, is it safe to close?
Is any additional support needed on my end to make this ticket a reality?
It's merged into master on the API/DB side. I think @pjvandehaar needs to try incorporating it into PheWeb, and then file additional issues if visualization methods other than tooltips are necessary for displaying the information.
Thanks! Because we're awaiting user feedback to verify, I'll keep this ticket open for now. (just going through to weed open issues on the tracker. Feel free to close once the DB work is accepted)
Attaching a screenshot from initial experiments, to use as a prop during a scheduled discussion with Sarah today.
This demonstrates two options:
Other display options, like tables, are also a possibility.
The associated PR was merged and is awaiting pheweb integration. Closing this ticket accordingly.
Have an option to highlight variants that appear in the GWAS Catalog in the LZ plot, and if possible also have a way (maybe a toggle box) to see which trait(s) that variant is associated with in the GWAS Catalog. Purpose of this feature: it would provide a quick visual way to identify whether the locus is known to be associated with a trait of interest, even if the reference variant itself is not known to be associated.