broadinstitute / gnomad-browser

Explore gnomAD datasets on the web
https://gnomad.broadinstitute.org
MIT License

Display AN for non-UKB exome subset #1598

Open ch-kr opened 1 month ago

ch-kr commented 1 month ago

A user wrote in asking about the allele numbers (AN) for this variant. I realized from their question that the non-UKB ANs are not being displayed when that dataset is selected.

Full gnomAD v4.1 dataset:

image

Non-UKB subset of gnomAD v4.1:

image

The AN of the full v4.1 exomes at this site is 1458958, so more than 729k samples have a defined AN here. Checking our downloadable all-sites AN Hail Table (HT) confirms that this variant has a defined non-UKB AN:

import hail as hl

# Load the public all-sites allele number table for the v4.1 exomes
ht = hl.read_table('gs://gcp-public-data--gnomad/release/4.1/ht/exomes/gnomad.exomes.v4.1.allele_number_all_sites.ht')
# Restrict to the site in question
ht = hl.filter_intervals(ht, [hl.parse_locus_interval('chr3:10052405-10052406', reference_genome='GRCh38')])
# 167 is `{'group': 'adj', 'subset': 'non_ukb'}` per `ht.strata_meta`
ht = ht.annotate(adj=ht.AN[0], non_ukb=ht.AN[167])
ht.show()
image
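For anyone reproducing this check, here is a minimal sketch of two small helpers: one that looks up a stratum's position by its metadata instead of relying on the hard-coded index 167, and one that converts an AN into a sample count. The function names and the toy `strata_meta` list are hypothetical and only for illustration; the real list would have to come from the table itself, and the AN conversion assumes a diploid (autosomal) site.

```python
from typing import Dict, List, Optional


def find_stratum_index(strata_meta: List[Dict[str, str]], **wanted: str) -> Optional[int]:
    """Return the position of the first stratum whose metadata equals `wanted` exactly.

    Exact matching keeps {'group': 'adj'} from being confused with
    {'group': 'adj', 'subset': 'non_ukb'}.
    """
    for i, meta in enumerate(strata_meta):
        if dict(meta) == wanted:
            return i
    return None


def samples_from_an(an: int, ploidy: int = 2) -> int:
    """Number of samples contributing to an allele number, assuming fixed ploidy."""
    return an // ploidy


# Toy stand-in for the table's strata metadata (hypothetical, NOT the real ordering).
toy_strata_meta = [
    {"group": "adj"},
    {"group": "raw"},
    {"group": "adj", "subset": "non_ukb"},
]

non_ukb_idx = find_stratum_index(toy_strata_meta, group="adj", subset="non_ukb")
print(non_ukb_idx)               # 2 in this toy list
print(samples_from_an(1458958))  # 729479
```

With Hail, and assuming `strata_meta` is a global annotation holding an array of dictionaries (as the comment in the snippet above suggests), the list could presumably be obtained with `hl.eval(ht.strata_meta)` and passed to `find_stratum_index` in place of the toy list.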

Would it be possible to update the browser to display the non-UKB ANs when they exist?

ch-kr commented 1 month ago

Adding a comment to say that I think this display issue may only affect variants that are flagged or that fail variant QC. For variants that aren't filtered, the displayed AN does seem to adjust based on the subset selected (e.g., https://gnomad.broadinstitute.org/variant/11-747571-C-T?dataset=gnomad_r4):

image

image