Closed tomkinsc closed 8 years ago
@tomkinsc
Hey Chris,
I did an overhaul of our pipeline to the v1.10.1 release of upstream and moved our execution to use the easy-deploy script (although I think that particular deployment strategy may already be slightly outdated by now). It's in the misnamed v1.8.0 changes branch.
We noticed that some interim figures of merit that we QC on in our CI tests (subsampled_read_counts in trinity and alignment_base_count in the final analysis, for example) have drifted. The good thing, though, is that the final assembly of the Ebola CI sample stays the same.
Just wanted to check in and make sure that these drifts we're seeing are expected:
Nice!
Glad to see the metrics caught that change, but in this case it is expected. We overhauled the step that prepares input for de novo assembly to subsample reads in bam space. Beyond being faster, it made it possible to also reconsider how we were treating extra singleton reads after de-duplicating reads and trimming adapters and low-quality bases. Looks like mean coverage depth increases in v1.10.1, which is nice to see. The metrics returned by the pre-Trinity subsampling are a little different, and they're described here. The read count parameter given to Trinity now specifies the number of individual reads to use rather than pairs (though in reaching the threshold it includes paired reads first, and then fills in with singletons).
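To make the pairs-first, singletons-second behaviour concrete, here is a rough sketch of the selection logic. It is not the actual viral-ngs code: the function name, its arguments, and the choice to take reads in list order (rather than sampling randomly) are all assumptions made purely for illustration.

```python
def select_reads(pairs, singletons, max_reads):
    """Pick up to max_reads individual reads: whole pairs first, then singletons.

    Hypothetical helper for illustration only. `pairs` is a list of
    (mate1, mate2) tuples, `singletons` is a list of unpaired reads, and
    `max_reads` counts individual reads, not pairs.
    """
    # Each pair contributes two individual reads toward the threshold.
    n_pairs = min(len(pairs), max_reads // 2)
    chosen_pairs = pairs[:n_pairs]

    # Whatever budget is left over is filled with singleton reads.
    remaining = max_reads - 2 * n_pairs
    chosen_singletons = singletons[:remaining]
    return chosen_pairs, chosen_singletons


pairs = [("a/1", "a/2"), ("b/1", "b/2")]
singletons = ["s1", "s2", "s3"]

# Budget of 6 reads: both pairs (4 reads) plus 2 singletons.
chosen_pairs, chosen_singletons = select_reads(pairs, singletons, 6)
print(len(chosen_pairs) * 2 + len(chosen_singletons))
```

Under this sketch, a threshold that used to mean "N pairs" now covers roughly half as many pairs unless it is doubled, which is one way the subsampled read counts could shift between releases.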
Cool! Thanks Chris for the explanation!
I'll go ahead and edit our expected metrics then!
Closed via #39
It would be great if consensus-coverage plots were part of the assembly analysis output. For workflows using the viral-ngs v1.8.0 tarball, this should be a matter of adding a call to reports.py plot_coverage to the analysis applet to generate a coverage plot from the *.mapped.bam file. For plot_coverage, the plot format (pdf, png, svg, etc.) is inferred from the file extension of the plot output file (the second positional argument for the command), but it can be given explicitly via the --plotFormat parameter. I'd suggest generating ${name}.coverage_plot.pdf, with a few extra arguments to create a letter-page size plot:
reports.py plot_coverage ${name}.mapped.bam ${name}.coverage_plot.pdf --plotFormat pdf --plotWidth 1100 --plotHeight 850 --plotDPI 100
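For anyone wiring this into the applet from Python, here is a small sketch of how the suggested numbers fit together. It assumes --plotWidth and --plotHeight are pixel counts, so that 1100 x 850 at 100 DPI corresponds to an 11 x 8.5 inch (landscape letter) page. The helper names are made up for illustration; only the command and flags quoted above come from this issue.

```python
def plot_pixels(width_in, height_in, dpi):
    """Convert a page size in inches to pixel dimensions at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


def plot_coverage_cmd(name, width_in=11.0, height_in=8.5, dpi=100, fmt="pdf"):
    """Build the argument list for the plot_coverage call suggested above.

    Hypothetical helper: defaults give a landscape letter page at 100 DPI.
    """
    width_px, height_px = plot_pixels(width_in, height_in, dpi)
    return [
        "reports.py", "plot_coverage",
        f"{name}.mapped.bam",            # input: mapped reads
        f"{name}.coverage_plot.{fmt}",   # output: plot file
        "--plotFormat", fmt,
        "--plotWidth", str(width_px),
        "--plotHeight", str(height_px),
        "--plotDPI", str(dpi),
    ]


# Prints the same command line as suggested above, for a sample named "sample".
print(" ".join(plot_coverage_cmd("sample")))
```

The resulting list could be handed to subprocess.run(...) in the applet, assuming reports.py is on the PATH there.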