populationgenomics / automated-interpretation-pipeline

Rare Disease variant prioritisation MVP
MIT License

Test data model #279

Closed MattWellie closed 1 year ago

MattWellie commented 1 year ago

Fixes

Proposed Changes

Intention:

"I want to create a variant which AIP should classify as Cat3 & Cat5, which will satisfy a Biallelic MOI"

- Create a variant with high SpliceAI scores
- Create one or more transcript consequences which satisfy Cat3 (missense+, lof:HC or ClinVar+) and have a gene ID associated with a biallelic MOI
- Create as many samples as you like, each with a corresponding genotype
- Write out as an MT (Hail MatrixTable)
- Write out as a VCF

Corner case:

All of this should be doable in ~20 lines of code, using model_go_brrrr.py as a template.

Voila! You have some appropriately formatted data for running through the whole process 🧙

NB: this doesn't currently generate the pedigree or the PanelApp data.
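As a rough illustration of the steps listed above, here is a minimal, standard-library-only sketch of what such a data model could look like. It is not the code in model_go_brrrr.py: every class, field, and function name here (`TranscriptConsequence`, `Variant`, `to_vcf`, the `SpliceAI` and `CSQ` INFO keys, the `ENSG_TEST` gene ID) is hypothetical and chosen only for the example. The MT output is left out because it needs Hail, and nothing here checks that the gene really has a biallelic MOI.

```python
from dataclasses import dataclass, field


@dataclass
class TranscriptConsequence:
    """One transcript-level annotation (field names are illustrative)."""
    gene_id: str
    consequence: str  # e.g. "missense_variant"
    lof: str = ""     # "HC" would mark a high-confidence LoF call


@dataclass
class Variant:
    """A single test variant plus one genotype per sample."""
    chrom: str
    pos: int
    ref: str
    alt: str
    splice_ai_delta: float = 0.0
    consequences: list = field(default_factory=list)
    genotypes: dict = field(default_factory=dict)  # sample ID -> GT string

    def info(self) -> str:
        # Pack the annotations into a simple, made-up INFO layout.
        csq = ",".join(
            f"{c.gene_id}|{c.consequence}|{c.lof}" for c in self.consequences
        )
        return f"SpliceAI={self.splice_ai_delta};CSQ={csq}"

    def vcf_line(self, samples) -> str:
        # Samples without a recorded genotype are written as missing.
        gts = [self.genotypes.get(s, "./.") for s in samples]
        return "\t".join([
            self.chrom, str(self.pos), ".", self.ref, self.alt,
            ".", "PASS", self.info(), "GT", *gts,
        ])


def to_vcf(variants, samples) -> str:
    """Render the variants as VCF text with a minimal header."""
    header = [
        "##fileformat=VCFv4.2",
        '##INFO=<ID=SpliceAI,Number=1,Type=Float,Description="Illustrative SpliceAI delta">',
        '##INFO=<ID=CSQ,Number=.,Type=String,Description="gene_id|consequence|lof">',
        '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">',
        "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL",
                   "FILTER", "INFO", "FORMAT", *samples]),
    ]
    return "\n".join(header + [v.vcf_line(samples) for v in variants]) + "\n"


# Usage: a homozygous proband with two heterozygous parents.
variant = Variant(
    chrom="chr1", pos=12345, ref="A", alt="G",
    splice_ai_delta=0.9,  # high SpliceAI score
    consequences=[TranscriptConsequence("ENSG_TEST", "missense_variant", "HC")],
    genotypes={"proband": "1/1", "mother": "0/1", "father": "0/1"},
)
vcf_text = to_vcf([variant], ["proband", "mother", "father"])
```

Keeping the model as plain dataclasses means a test can describe the intended variant in a few lines and leave the serialisation details to one place.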

Checklist

MattWellie commented 1 year ago

The three obvious(?) ways to generate a full MT from the annotation state:

MattWellie commented 1 year ago

https://hail.is/docs/0.2/hail.MatrixTable.html#hail.MatrixTable.from_parts may not actually exist in the codebase.

MattWellie commented 1 year ago

https://github.com/hail-is/hail/blob/9f7ede5e455af7111101b55923a2601361c434f5/hail/python/hail/table.py#L3295

This looks much more promising