dnanexus-archive / viral-ngs

Multiple contig assembly on viral-ngs pipeline #2

Closed yifei-men closed 8 years ago

yifei-men commented 8 years ago

@alphabdiallo cc @mlin

This branch edits the existing viral-ngs pipeline to handle multi-contig genomes, such as that of the Lassa virus.

I tried this pipeline on the sample provided by Chris and was able to reproduce the expected assembly, modulo a difference in scaffold naming: the local version names the scaffold >G1190-0, while the DNAnexus version names it G1190_scaffold-0.
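For anyone who wants to repeat this kind of check, here is a minimal sketch of comparing two assemblies while ignoring header names. It is not part of the pipeline: the helper names `read_fasta` and `same_sequences` are hypothetical, the sequences are made up, and the `-1` headers for a second segment are an assumed extension of the naming shown above.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def same_sequences(handle_a, handle_b):
    """Return True if both inputs hold the same sequences in the same order,
    ignoring record names and letter case."""
    seqs_a = [seq.upper() for _, seq in read_fasta(handle_a)]
    seqs_b = [seq.upper() for _, seq in read_fasta(handle_b)]
    return seqs_a == seqs_b


# Toy records (made-up sequences) using the two naming styles.
# The "-1" headers are an assumption for illustration only.
local = StringIO(">G1190-0\nACGTAC\nGT\n>G1190-1\nTTGA\n")
dnanexus = StringIO(">G1190_scaffold-0\nACGTACGT\n>G1190_scaffold-1\nTTGA\n")
print(same_sequences(local, dnanexus))  # True: only the names and line wrapping differ
```

Comparing by position rather than by name keeps the check independent of the naming convention, at the cost of assuming both outputs list segments in the same order.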

Edit 8/17: Fixed scaffold naming convention

Note: for the time being, the file lassa.fasta in assets is used as input for both the filter and scaffold steps. Eventually, we envision providing a FASTA file with a larger number of strains for the filter step.

Example usage: lassa assembly run

TODO:

Edit build workflow to generate workflow with Lassa-specific input / add validation data

yifei-men commented 8 years ago

Now working on editing the workflow build and adding validation for Lassa.

yifei-men commented 8 years ago

Re-running the Travis tests. Yesterday afternoon's runs hit test errors caused by slow platform responses.