broadinstitute / gnomad-browser

Explore gnomAD datasets on the web
https://gnomad.broadinstitute.org
MIT License

Add flag to genes impacted by issue with GRCh38 reference #1352

Open ch-kr opened 11 months ago

ch-kr commented 11 months ago

False duplications in the GRCh38 reference have impacted variant calling in 3 medically relevant genes:

While previous studies concluded that variant calling performance is generally better on GRCh38 (refs. 34,35), our benchmark demonstrates that variant calls in some genes are less accurate on GRCh38 than GRCh37. Another group recently independently identified the importance of masking the additional copy of one gene (U2AF1/U2AF1L5) for cancer research (ref. 36). Our results identify that false duplications cause many of the discrepancies found recently between exome variant calls on GRCh37 and GRCh38 (ref. 37). ... During this process, we also identified and resolved variant calling errors due to several false duplications in these medically relevant genes in GRCh38 on chromosome 21. Overall, 11 genes are impacted by these false duplications, including three medically relevant genes from our list (CBS, KCNE1 and CRYAA). As a solution to this problem, we provide a GRCh38 reference that masks the erroneous copy of the duplicated genes.

From https://www.nature.com/articles/s41587-021-01158-1.

Could we add a flag or warning to the v4 gene pages for these three genes (CBS, KCNE1, CRYAA)? The warning should describe the issue, point users to this paper, and ask them to use v2 for these genes, since the GRCh37 reference does not have the same issue for them.

rileyhgrant commented 11 months ago

Per the browser meeting, we should also show this flag/message on GRCh38 variants located in these three genes.

ch-kr commented 9 months ago

We discussed this at the monthly browser meeting. The takeaway was that we want to add warning text both to the three gene pages and to each variant falling within these three genes.

This is the proposed text:

This gene is impacted by false duplications in the GRCh38 reference. For more details, please see this publication. We encourage our users to use frequency information from gnomAD v2.

^ where gnomAD v2 would ideally link to the gnomAD v2 page of the gene.
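
As a starting point for the mockup, here is a minimal sketch of the lookup that could decide when to show the warning on a gene page or a variant page. Everything named here (`GENES_WITH_GRCH38_FALSE_DUPLICATIONS`, `getFalseDuplicationWarning`, the `ReferenceGenome` type) is hypothetical and not taken from the browser codebase. The v2 gene page link is passed in by the caller, so the sketch assumes nothing about the browser's URL structure.

```typescript
// Hypothetical sketch: none of these names come from the gnomAD browser codebase.

type ReferenceGenome = 'GRCh37' | 'GRCh38'

// The three medically relevant genes named in this issue.
const GENES_WITH_GRCH38_FALSE_DUPLICATIONS: ReadonlySet<string> = new Set([
  'CBS',
  'KCNE1',
  'CRYAA',
])

// Publication cited in this issue.
const PUBLICATION_URL = 'https://www.nature.com/articles/s41587-021-01158-1'

interface FalseDuplicationWarning {
  geneSymbol: string
  message: string
  publicationUrl: string
  // Link target for "gnomAD v2"; supplied by the caller (assumption).
  v2GenePageUrl?: string
}

// Returns the warning for a gene on a GRCh38 dataset, or null if none applies.
function getFalseDuplicationWarning(
  geneSymbol: string,
  referenceGenome: ReferenceGenome,
  v2GenePageUrl?: string
): FalseDuplicationWarning | null {
  // GRCh37 is not affected for these genes, so only flag GRCh38 datasets.
  if (referenceGenome !== 'GRCh38') {
    return null
  }
  if (!GENES_WITH_GRCH38_FALSE_DUPLICATIONS.has(geneSymbol)) {
    return null
  }
  return {
    geneSymbol,
    message:
      'This gene is impacted by false duplications in the GRCh38 reference. ' +
      'For more details, please see this publication. ' +
      'We encourage our users to use frequency information from gnomAD v2.',
    publicationUrl: PUBLICATION_URL,
    v2GenePageUrl,
  }
}

// For a variant page: collect warnings for every gene the variant falls within.
function getVariantFalseDuplicationWarnings(
  geneSymbols: string[],
  referenceGenome: ReferenceGenome
): FalseDuplicationWarning[] {
  return geneSymbols
    .map((symbol) => getFalseDuplicationWarning(symbol, referenceGenome))
    .filter((warning): warning is FalseDuplicationWarning => warning !== null)
}
```

A gene page could call `getFalseDuplicationWarning` with its own symbol. A variant page could pass the symbols of all genes the variant overlaps to `getVariantFalseDuplicationWarnings`. Keeping the gene list in one place means both pages stay in sync if the list changes.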

Could someone create a mockup with this warning flag added and share it in #gnomad or #gnomad_browser to get feedback from the wider project?