varfish-org / mehari

VEP-like tool for sequence ontology and HGVS annotation of VCF files
MIT License

Ensure ManePlusClinical is properly recognized/annotated #301

Open holtgrewe opened 8 months ago

holtgrewe commented 8 months ago

Describe the bug: I have the impression that the MANE Plus Clinical transcripts are not properly recognized/built for mehari-data-tx. This should be reviewed for both GRCh37 and GRCh38.

We should probably write out some summary statistics on how many MANE Select / MANE Plus Clinical transcripts have been written out, for easier introspection.
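A minimal sketch of what such a summary could look like, in Python rather than in mehari's own code. It assumes a cdot-like layout in which each transcript has a `genome_builds` mapping whose entries carry a comma-separated `tag` string; that layout, the tag spellings, the function name `count_mane_tags`, and the example accessions are all assumptions for illustration, not the actual mehari or cdot API.

```python
# Tag names to report, in normalised form (lowercase, spaces instead of "_").
# The exact spellings used in the input data are an assumption.
MANE_TAGS = ("mane select", "mane plus clinical")


def count_mane_tags(transcripts, genome_build):
    """Count transcripts carrying each MANE tag for one genome build.

    `transcripts` is assumed to map transcript accession -> record, where
    record["genome_builds"][genome_build]["tag"] is a comma-separated string.
    """
    counts = {tag: 0 for tag in MANE_TAGS}
    for record in transcripts.values():
        build = record.get("genome_builds", {}).get(genome_build)
        if build is None:
            continue  # transcript is not placed on this genome build
        raw = build.get("tag") or ""
        # Normalise so "MANE_Plus_Clinical" and "MANE Plus Clinical" both match.
        tags = {t.strip().lower().replace("_", " ") for t in raw.split(",")}
        for tag in MANE_TAGS:
            if tag in tags:
                counts[tag] += 1
    return counts


# Made-up records, only to show the expected input shape.
example = {
    "NM_000001.1": {"genome_builds": {"GRCh38": {"tag": "MANE Select"}}},
    "NM_000002.1": {"genome_builds": {"GRCh38": {"tag": "MANE_Plus_Clinical"}}},
    "NM_000003.1": {"genome_builds": {"GRCh38": {}}},
    "NM_000004.1": {"genome_builds": {"GRCh37": {"tag": "MANE Select"}}},
}

for build_name in ("GRCh37", "GRCh38"):
    print(build_name, count_mane_tags(example, build_name))
```

A count of zero for MANE Plus Clinical on a given build would make a problem like the one described here visible right away.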

Additional context: N/A

holtgrewe commented 6 months ago

Plus Clinical is missing even in mehari builds for GRCh38.

holtgrewe commented 6 months ago

It turns out that the MANE Plus Clinical annotation is missing from the cdot-0.2.22.GCF_000001405.40_GRCh38.p14_genomic.110.gff.json.gz file in CDOT.

It is included in cdot-0.2.22.ensembl.Homo_sapiens.GRCh38.110.gff3.json.gz and cdot-0.2.22.refseq.grch38.json.gz.

cf. https://github.com/SACGF/cdot/issues/71

Let's monitor whether we get an answer from upstream.

holtgrewe commented 6 months ago

Re-assigning priority and size. Once this is fixed upstream, the fix here should be simple: create a new release based on an updated CDOT release.