This branch edits the existing viral-ngs pipeline to handle multi-contig genomes such as that of the Lassa virus.
I ran this pipeline on the sample provided by Chris and reproduced the expected assembly, apart from a difference in scaffold naming: the local version names it >G1190-0, while the DNAnexus version names it G1190_scaffold-0.
Edit 8/17: Fixed scaffold naming convention
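
For anyone who wants to repeat the "same assembly modulo scaffold names" check, here is a minimal sketch in plain Python. It is not part of the pipeline: the helper names `read_fasta_sequences` and `same_assembly` are hypothetical, and the sequences and the second scaffold name below are toy placeholders, not data from the G1190 sample.

```python
def read_fasta_sequences(text):
    """Return the sequences in a FASTA string, in order, ignoring header names."""
    sequences = []
    current = None  # list of sequence lines for the record being read
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current is not None:
                sequences.append("".join(current))
            current = []
        elif current is not None:
            # Sequence lines are joined, so line wrapping does not matter.
            current.append(line.upper())
    if current is not None:
        sequences.append("".join(current))
    return sequences


def same_assembly(fasta_a, fasta_b):
    """True if both FASTA strings hold the same sequences in the same order."""
    return read_fasta_sequences(fasta_a) == read_fasta_sequences(fasta_b)


# Toy inputs (assumed for illustration): same sequences, different naming and wrapping.
local = ">G1190-0\nACGT\nTTGA\n>G1190-1\nGGCC\n"
dnanexus = ">G1190_scaffold-0\nACGTTTGA\n>G1190_scaffold-1\nGGCC\n"
print(same_assembly(local, dnanexus))
```

The comparison is order-sensitive on purpose, so a swapped segment order would be reported as a mismatch rather than silently accepted.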
Note: for the time being, the file lassa.fasta in assets is used as input for both the filter and scaffold steps. Eventually, we envision providing a FASTA file with a larger number of strains for the filter step.
@alphabdiallo cc @mlin
Example usage: lassa assembly run
TODO:
Edit build workflow to generate workflow with Lassa-specific input / add validation data