broadinstitute / seqr-loading-pipelines

hail-based pipelines for annotating variant callsets and exporting them to elasticsearch
MIT License

Investigate support for running SpliceAi #786

Open lynnpais opened 2 months ago

lynnpais commented 2 months ago

The SpliceAI package is available and runs on TensorFlow.

bpblanken commented 2 months ago

Some early notes:

Was able to get the command line tool running:

pip install spliceai tensorflow
cat v03_pipeline/var/test/callsets/1kg_30variants.vcf | spliceai -R vep_data/hg19.fa -A grch37
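For reference, a minimal sketch of pulling the scores back out of the annotated VCF. It assumes the `SpliceAI=ALLELE|SYMBOL|DS_AG|DS_AL|DS_DG|DS_DL|DP_AG|DP_AL|DP_DG|DP_DL` INFO layout described in the SpliceAI README; the `SpliceAIScore` and `parse_spliceai_info` names are made up for illustration and are not part of the pipeline or the spliceai package.

```python
from typing import List, NamedTuple


class SpliceAIScore(NamedTuple):
    """One SpliceAI annotation (hypothetical container, not a spliceai type)."""
    allele: str
    symbol: str
    ds_ag: float  # delta score, acceptor gain
    ds_al: float  # delta score, acceptor loss
    ds_dg: float  # delta score, donor gain
    ds_dl: float  # delta score, donor loss
    dp_ag: int    # delta position, acceptor gain
    dp_al: int    # delta position, acceptor loss
    dp_dg: int    # delta position, donor gain
    dp_dl: int    # delta position, donor loss

    @property
    def max_delta_score(self) -> float:
        return max(self.ds_ag, self.ds_al, self.ds_dg, self.ds_dl)


def parse_spliceai_info(info: str) -> List[SpliceAIScore]:
    """Parse the SpliceAI entries out of a VCF INFO column string.

    Assumes the pipe-delimited, 10-field layout from the SpliceAI README.
    Entries that do not match it (wrong field count, non-numeric values)
    are skipped; that is a defensive choice, not documented behavior.
    """
    scores = []
    for field in info.split(";"):
        key, _, value = field.partition("=")
        if key != "SpliceAI":
            continue
        # Multiple annotations (e.g. several genes) are comma separated.
        for annotation in value.split(","):
            parts = annotation.split("|")
            if len(parts) != 10:
                continue
            try:
                delta_scores = [float(p) for p in parts[2:6]]
                delta_positions = [int(p) for p in parts[6:10]]
            except ValueError:
                continue
            scores.append(
                SpliceAIScore(parts[0], parts[1], *delta_scores, *delta_positions)
            )
    return scores
```

Whichever option we pick, something like this would sit in the "vcf parse" step, with `max_delta_score` being the value most likely worth carrying onto the hail table.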

We have a couple of options:

1) Try to hack spliceai into hail's VEP call, which has a hail table -> stdout -> command execution -> hail table setup. This is the least work but the most brittle.
2) Do something similar to what we've done with the clingen allele registry and manage the hail export, shell exec, vcf parse, and hail import ourselves.
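A rough sketch of the shell exec step in option 2, assuming we pipe VCF text through the CLI on stdin/stdout the same way the command above does. The hail export and import on either side (presumably `hl.export_vcf` and `hl.import_vcf`) are left out. `spliceai_command`, `run_vcf_tool`, and `PASSTHROUGH_COMMAND` are hypothetical names for illustration, not existing pipeline code.

```python
import subprocess
import sys
from typing import List

# Stand-in command that echoes stdin to stdout, so the plumbing can be
# exercised without spliceai or tensorflow installed.
PASSTHROUGH_COMMAND = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read())",
]


def spliceai_command(reference_fasta: str, annotation: str) -> List[str]:
    """Build the CLI invocation with the -R and -A flags used above."""
    return ["spliceai", "-R", reference_fasta, "-A", annotation]


def run_vcf_tool(vcf_text: str, command: List[str]) -> str:
    """Pipe VCF text through a command and return the annotated VCF text.

    check=True raises CalledProcessError on a non-zero exit code, so a
    failed run does not silently produce an empty table downstream.
    """
    result = subprocess.run(
        command,
        input=vcf_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Usage would look like `run_vcf_tool(vcf_text, spliceai_command("vep_data/hg19.fa", "grch37"))`. For real callsets we would likely want to stream through files rather than hold the whole VCF in memory.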

Regardless, we should read up in more depth on the bug fixes and changes BenW has made to the spliceai source on his fork, and have a convo with him about them.