SV pathogenicity predictors

Here are four SV pathogenicity predictors that we think would be useful:

AnnotSV - https://lbgi.fr/AnnotSV/ - Annotates SVs with their presence in DGV, DDD, etc. Its pathogenicity ranking is useful for making sure you didn't miss an obvious pathogenic SV: it ranks SVs by whether they overlap pathogenic SVs, haploinsufficient genes, or known OMIM genes, or whether they overlap significantly with a known benign SV.
CADD-SV - https://cadd-sv.bihealth.org/ - The model scores SVs using many annotations, including SNV scores, constraint, conservation, epigenetic/regulatory information, and gene overlap. Unlike the other predictors, it trains on chimp and human SVs.
TADA - https://github.com/jakob-he/TADA/ - Focuses on TAD-centric annotations to identify SVs that might be pathogenic through their effects on genome structure. However, there is little relevant training data, so its score is mostly driven by coding annotations.
StrVCTVRE (you have already implemented this one)

When we ran our own analysis on recently reported SVs, there was effectively no overlap between the pathogenicity predictions of these tools. Given how little is really known about pathogenic SVs, we think it's worth implementing all of these predictors and looking at SVs that are flagged by at least one tool (or at least two tools, etc.).
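
To make the "flagged by at least N tools" idea concrete, here is a minimal Python sketch. It is only an illustration, not how seqr does or should implement this. The function names, the per-SV score dictionaries, and especially the numeric cutoffs are all hypothetical placeholders; real cutoffs would need to come from each tool's own documentation, and the sketch assumes that a higher score means "more likely pathogenic" for every tool.

```python
from typing import Dict, List, Mapping, Optional

# Hypothetical placeholder cutoffs, NOT recommended values for any tool.
EXAMPLE_THRESHOLDS: Dict[str, float] = {
    "AnnotSV": 4.0,
    "CADD-SV": 10.0,
    "TADA": 0.5,
    "StrVCTVRE": 0.5,
}


def flagging_tools(
    scores: Mapping[str, Optional[float]],
    thresholds: Mapping[str, float],
) -> List[str]:
    """Return the sorted names of tools whose score meets its cutoff.

    A score of None means the tool produced no prediction for this SV.
    """
    return sorted(
        tool
        for tool, score in scores.items()
        if score is not None and tool in thresholds and score >= thresholds[tool]
    )


def filter_svs(
    sv_scores: Mapping[str, Mapping[str, Optional[float]]],
    thresholds: Mapping[str, float],
    min_tools: int = 1,
) -> Dict[str, List[str]]:
    """Keep SVs flagged by at least `min_tools` predictors."""
    kept: Dict[str, List[str]] = {}
    for sv_id, scores in sv_scores.items():
        tools = flagging_tools(scores, thresholds)
        if len(tools) >= min_tools:
            kept[sv_id] = tools
    return kept


# Made-up scores for three made-up SVs, purely to exercise the functions.
EXAMPLE_SVS = {
    "sv1": {"AnnotSV": 5.0, "CADD-SV": 2.0, "TADA": None, "StrVCTVRE": 0.1},
    "sv2": {"AnnotSV": 1.0, "CADD-SV": 15.0, "TADA": 0.9, "StrVCTVRE": 0.2},
    "sv3": {"AnnotSV": 1.0, "CADD-SV": 1.0, "TADA": 0.1, "StrVCTVRE": 0.1},
}

at_least_one = filter_svs(EXAMPLE_SVS, EXAMPLE_THRESHOLDS, min_tools=1)
at_least_two = filter_svs(EXAMPLE_SVS, EXAMPLE_THRESHOLDS, min_tools=2)
```

Returning the list of flagging tools, rather than just a count, would let a user see which predictor raised each SV, which seems useful given how rarely the tools agree.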
