populationgenomics / automated-interpretation-pipeline

Rare Disease variant prioritisation MVP
MIT License

Retrospective Reanalysis Framework #302

Closed MattWellie closed 9 months ago

MattWellie commented 11 months ago

Run AIP repeatedly in a way which simulates incremental analyses over time.

Start: 01-01-2020 Increments: 3 Months
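As a rough sketch of how the time points could be enumerated: the helper below steps forward in whole months from the start date. The function name `time_points` and the end date in the example are assumptions for illustration, since the issue does not state where the series stops.

```python
from datetime import date


def time_points(start: date, end: date, step_months: int = 3) -> list[date]:
    """Return dates from `start` up to and including `end`, `step_months` apart."""
    points = []
    current = start
    while current <= end:
        points.append(current)
        # Advance by whole months, keeping the day of month fixed.
        # This is safe for day 1; a start day above 28 could hit an invalid date.
        month_index = current.year * 12 + (current.month - 1) + step_months
        current = date(month_index // 12, month_index % 12 + 1, current.day)
    return points


# Example end date is hypothetical; the issue only fixes the start and the increment
points = time_points(date(2020, 1, 1), date(2021, 1, 1))
print([p.isoformat() for p in points])
```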

Static: Joint Call, Consequence Annotation, AIP version, Cohort Participants, HPO Metadata, MOI-per-gene (latest data)

Changing: Gene List (reflect the PanelApp state at each time point, see this), ClinVar data (reflect the latest ClinVar submissions up to each time point, see this)
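To illustrate the "up to each time point" idea for the ClinVar data, here is a minimal sketch that keeps only submissions dated on or before a cutoff. The `Submission` record, its field names, and the inclusive cutoff are all assumptions; real ClinVar submission data carries many more fields and this is not how AIP itself is known to do it.

```python
from datetime import date
from typing import NamedTuple


class Submission(NamedTuple):
    # Hypothetical minimal record, for illustration only
    variant_id: str
    submitter: str
    classification: str
    submitted: date


def submissions_as_of(submissions: list[Submission], cutoff: date) -> list[Submission]:
    """Keep only submissions dated on or before `cutoff` (inclusive by assumption)."""
    return [s for s in submissions if s.submitted <= cutoff]


subs = [
    Submission("var1", "LabA", "Pathogenic", date(2019, 6, 1)),
    Submission("var1", "LabB", "Benign", date(2020, 5, 20)),
    Submission("var2", "LabA", "Pathogenic", date(2021, 2, 3)),
]
snapshot = submissions_as_of(subs, date(2020, 4, 1))
print(snapshot)
```

The same cutoff-based snapshot would then be re-summarised per variant for each time point, so later submissions cannot influence an earlier simulated run.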

This will run only on the Acute-Care dataset.

Exome & Genome 'Types' run in parallel.

A Familial and a Singleton report will be generated for each Type and time point (4 reports per time point).
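The count of 4 reports per time point is just the product of the two sequencing Types and the two report modes. A small sketch of that run matrix follows; the lower-case labels and the file naming scheme are made up for illustration and are not taken from the pipeline.

```python
from itertools import product

SEQUENCING_TYPES = ("exome", "genome")
REPORT_MODES = ("familial", "singleton")


def reports_for(time_point: str) -> list[str]:
    """List one hypothetical report name per (Type, mode) pair for a time point."""
    return [
        f"{time_point}_{seq_type}_{mode}.html"
        for seq_type, mode in product(SEQUENCING_TYPES, REPORT_MODES)
    ]


names = reports_for("2020-01-01")
print(names)
```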

No hard filtering of previously seen variants; data will be presented in the report with dates, to allow interactive filtering.
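One way to attach dates rather than filter is to record, for each variant, the earliest run in which it was reported. The sketch below assumes results are available as a mapping from ISO run date to a set of variant identifiers; that structure and the name `first_seen_dates` are hypothetical.

```python
def first_seen_dates(results_by_date: dict[str, set[str]]) -> dict[str, str]:
    """Map each variant to the earliest run date in which it appeared."""
    first_seen: dict[str, str] = {}
    # ISO-formatted dates (YYYY-MM-DD) sort chronologically as plain strings
    for run_date in sorted(results_by_date):
        for variant in results_by_date[run_date]:
            # setdefault keeps the first (earliest) date and ignores later ones
            first_seen.setdefault(variant, run_date)
    return first_seen


runs = {
    "2020-04-01": {"1-100-A-T", "2-200-G-C"},
    "2020-01-01": {"1-100-A-T"},
    "2020-07-01": {"3-300-C-G", "2-200-G-C"},
}
seen = first_seen_dates(runs)
print(seen)
```

Every variant stays in every report; the date is simply an extra column a reader can filter on.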


N.B. these ideas were considered, but ultimately were not actioned.

MattWellie commented 11 months ago

For context relating to the ClinVar Gold Stars:

Stars   All Subs   Pathogenic only
Any     1.03M      186k
1       1.02M      178k
3       12570      8673
4       13         13

These results relate to the privately re-summarised data, excluding all submissions from MCRI/VCGS.
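For anyone wanting to reproduce a tally of this shape, a hedged sketch: drop the excluded submitters first, then count records at each star threshold. Several things here are guesses, namely that each row means "at least N stars", that each record already carries a `stars` value, and the submitter labels used for the exclusion. It is not the code that produced the numbers above.

```python
# Placeholder submitter labels; the exact names used in the data are not given here
EXCLUDED_SUBMITTERS = {"MCRI", "VCGS"}


def tally_by_min_stars(submissions, thresholds=(0, 1, 3, 4)):
    """Return {threshold: (all_count, pathogenic_count)} after excluding submitters."""
    kept = [s for s in submissions if s["submitter"] not in EXCLUDED_SUBMITTERS]
    tally = {}
    for threshold in thresholds:
        # Assumption: a row counts records with at least `threshold` stars
        at_least = [s for s in kept if s["stars"] >= threshold]
        pathogenic = [s for s in at_least if s["classification"] == "Pathogenic"]
        tally[threshold] = (len(at_least), len(pathogenic))
    return tally


# Toy records for illustration only
example = [
    {"submitter": "LabA", "stars": 1, "classification": "Pathogenic"},
    {"submitter": "LabB", "stars": 3, "classification": "Benign"},
    {"submitter": "VCGS", "stars": 1, "classification": "Pathogenic"},
    {"submitter": "LabC", "stars": 0, "classification": "Pathogenic"},
]
result = tally_by_min_stars(example)
print(result)
```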

MattWellie commented 9 months ago

Index file linking to the reports created based on this battle plan: https://test-web.populationgenomics.org.au/acute-care/retro_analysis/retro_index.html