the annotation pipeline now uses checkpoint files to modify the annotation.parquet file instead of creating new files for each rule, saving around 25% of disk space
the annotation pipeline is now more modular: deepSEA and absplice scores are now optional (default), so a minimal run can be performed using only vep and deepRipe annotations
a complete test run of the minimal annotation pipeline is performed in GitHub Actions
option to run vep through the online interface without a cache
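To illustrate the modular setup, here is a minimal sketch of how the optional annotations could be switched on top of the core ones. The function name, the flag names, and the assumption that both optional scores are off unless requested are hypothetical and not taken from the pipeline's actual config.

```python
# Hypothetical sketch: names and defaults are assumptions, not the pipeline's real config.
CORE_ANNOTATIONS = ["vep", "deepRipe"]


def selected_annotations(include_deepsea=False, include_absplice=False):
    """Return the annotation steps that would run for the given switches."""
    steps = list(CORE_ANNOTATIONS)  # copy so the module-level list is never mutated
    if include_deepsea:
        steps.append("deepSEA")
    if include_absplice:
        steps.append("absplice")
    return steps


if __name__ == "__main__":
    # Minimal run: only the core annotations
    print(selected_annotations())
    # Complete run: core plus both optional scores
    print(selected_annotations(include_deepsea=True, include_absplice=True))
```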
Testing
The minimal version of the pipeline is now tested automatically on GitHub Actions.
To test it locally, you can clone the repo and run the pipeline on example data.
Test scenarios
so far tested:
complete pipeline using all annotations locally on example data
pipeline without absplice locally on example data
pipeline without deepSEA locally on example data
pipeline without deepSEA and absplice (minimal) locally on example data
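The four scenarios above are the on/off combinations of the two optional annotations. A small sketch that enumerates them, assuming hypothetical boolean switches for deepSEA and absplice (the labels are illustrative, not names used by the pipeline):

```python
from itertools import product


def scenario_name(deepsea, absplice):
    """Label a test scenario by which optional annotations are enabled."""
    if deepsea and absplice:
        return "complete"
    if deepsea:
        return "without absplice"
    if absplice:
        return "without deepSEA"
    return "minimal"


# All combinations of the two (hypothetical) switches, in the order listed above
scenarios = [
    (deepsea, absplice, scenario_name(deepsea, absplice))
    for deepsea, absplice in product([True, False], repeat=2)
]

if __name__ == "__main__":
    for deepsea, absplice, name in scenarios:
        print(f"deepSEA={deepsea!s:5} absplice={absplice!s:5} -> {name}")
```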
What
The changes in this PR are manifold: most notably, the annotation pipeline now uses checkpoint files to modify the annotation.parquet file instead of creating new files for each rule, saving around 25% of disk space.