chanzuckerberg / shasta

[MOVED] Moved to paoloshasta/shasta. De novo assembly from Oxford Nanopore reads

Limited automation of Align parameters #188

Closed rlorigro closed 4 years ago

rlorigro commented 4 years ago

This adds a simple method for evaluating alignment drift, skip, trim, alignedFraction, and markerCount at run time, and for choosing cutoffs based on a percentile of each distribution. Testing on 3 basecallers and 1 ultralong dataset confirms that continuity and total assembled length are equal to or better than those obtained with manually selected parameters.
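As a rough illustration of percentile-based cutoff selection, here is a minimal Python sketch. It is not the code in this PR: the function names, the dictionary keys, the default percentile, and the nearest-rank percentile definition are all assumptions made for the example. It also assumes that drift, skip, and trim get upper cutoffs while alignedFraction and markerCount get lower cutoffs.

```python
def percentile(values, p):
    """Nearest-rank percentile: the smallest value such that at least
    p percent of the data is less than or equal to it (p is an int, 0-100)."""
    if not values:
        raise ValueError("cannot take a percentile of an empty list")
    ordered = sorted(values)
    # Integer ceiling of p * n / 100, clamped so that p = 0 maps to rank 1.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def choose_cutoffs(alignments, p=90):
    """Pick one cutoff per parameter from the observed distributions.

    `alignments` is a list of dicts with the keys drift, skip, trim,
    alignedFraction, and markerCount (hypothetical layout).
    Assumption: upper cutoffs keep the lowest p percent of drift, skip,
    and trim values; lower cutoffs keep the highest p percent of
    alignedFraction and markerCount values.
    """
    def column(key):
        return [a[key] for a in alignments]

    return {
        "maxDrift": percentile(column("drift"), p),
        "maxSkip": percentile(column("skip"), p),
        "maxTrim": percentile(column("trim"), p),
        "minAlignedFraction": percentile(column("alignedFraction"), 100 - p),
        "minAlignedMarkerCount": percentile(column("markerCount"), 100 - p),
    }
```

Calling `choose_cutoffs(alignments, p=90)` on the collected per-alignment statistics returns one cutoff per parameter. An interpolating percentile would serve equally well here; nearest-rank is used only because it always returns a value that actually occurs in the data.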

For future development, additional CSV output files are created when running with ReadGraph.creationMethod 2. These include the distributions of all 5 parameters that are currently automated, as well as a list of alignments with their individual values for these parameters.

maxTrim is currently still being evaluated, so perhaps it is best to wait on merging until those results are in.

Two new config files that enable automation are added:

  1. Nanopore-Jun2020-Automation.conf
  2. Nanopore-UL-Jun2020-Automation.conf

The major changes to these config files are: