populationgenomics / variant-curation-portal

Web application for curating loss of function variants
MIT License

Add hg38 locus and alleles fields from gnomAD liftover tables #43

Closed EddieLF closed 1 year ago

EddieLF commented 1 year ago

This change loads the gnomAD GRCh37 -> GRCh38 (hg38) liftover tables and joins them to the rows of the input v2 variant tables to add the hg38 fields. The two new fields, hg38_locus and hg38_alleles, could be combined into a single field, like the existing variant_id and liftover_variant_id fields.
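As a rough illustration of what combining the locus and alleles into one field could look like, here is a minimal plain-Python sketch. The helper name `format_variant_id`, the `chrom-pos-ref-alt` layout, and the stripping of a leading `chr` are all assumptions made for the example, not something taken from this PR, and the coordinates are made-up placeholders.

```python
def format_variant_id(contig: str, position: int, ref: str, alt: str) -> str:
    """Join locus and allele parts into one chrom-pos-ref-alt string.

    Hypothetical helper: the ID layout and the "chr" handling below are
    assumptions for illustration only.
    """
    # GRCh38 contigs are usually named "chr1", GRCh37 contigs "1".
    # Dropping the prefix keeps both builds in the same ID style (assumed).
    chrom = contig[3:] if contig.startswith("chr") else contig
    return f"{chrom}-{position}-{ref}-{alt}"


# Placeholder values, not real liftover coordinates.
hg38_variant_id = format_variant_id("chr1", 100, "A", "T")
print(hg38_variant_id)
```

A single string field like this would sit alongside variant_id in the same shape, rather than as a separate locus and alleles pair.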

I've chosen to define two intermediate tables, one for exomes and one for genomes, and then recombine them with Hail's or_else() function, which acts like a coalesce here: where one intermediate table has a null, the value from the other is used. There might be a simpler method, but I don't see another obvious way to pull the same fields from both the exome and genome tables.
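The recombination step can be sketched in plain Python, without Hail, to show the logic only. The dictionaries stand in for the two intermediate tables, and every name and value below is a hypothetical placeholder rather than code from this PR. In Hail itself, `hl.or_else(a, b)` or `hl.coalesce(a, b)` would play the role of the `or_else` helper.

```python
from typing import Dict, Optional

# Stand-ins for the two intermediate tables, keyed by variant ID.
# All IDs are made-up placeholders.
exome_liftover: Dict[str, str] = {"1-100-A-T": "1-150-A-T"}
genome_liftover: Dict[str, str] = {
    "1-100-A-T": "1-150-A-T",
    "2-200-G-C": "2-260-G-C",
}


def or_else(primary: Optional[str], fallback: Optional[str]) -> Optional[str]:
    """Return primary unless it is missing, in which case return fallback."""
    return primary if primary is not None else fallback


def liftover_variant_id(variant_id: str) -> Optional[str]:
    """Prefer the exome value; fall back to the genome value if it is null."""
    return or_else(exome_liftover.get(variant_id), genome_liftover.get(variant_id))


for vid in ("1-100-A-T", "2-200-G-C", "3-300-C-G"):
    print(vid, liftover_variant_id(vid))
```

A variant missing from both tables stays null after the coalesce, so downstream code still has to handle variants with no liftover.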

If this gets merged, the portal's expected input format will need to be updated to handle the new fields.

EddieLF commented 1 year ago

@daniaki fixed this up so that it works with either v2 or v3 gnomAD variants. Also consolidated the liftover locus and alleles fields into the single liftover_variant_id field, so the output should be immediately compatible with the portal. Cheers