Closed. dhoogest closed 8 months ago
https://github.com/nhoffman/dada2-nf/commit/61c0772b60239100c342f2d4bca8f8506d51e4a6
I believe it was removed with the creation of the output/[R1, R2] dirs, which should contain the unmerged reads.
Yeah, so the 'unmerged' reads could be inferred as:
Yeah, or just the seqnames.
Cool, closing.
Happy to revive the get_unmerged.R script or add annotation somewhere. Whatever makes things easiest for the reporting process
I think we can add it on the 'reporting' side of the pipeline assuming we've got all of the necessary info in the outputs. At a glance, I think the seqnames might not do the trick, since each merged/R1/R2 sv list is independently enumerated (if I'm not mistaken).
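To illustrate the concern above, here is a minimal sketch of why independently enumerated names can't be joined across lists. The naming scheme (`sv-0001`, ranked by descending weight), the `enumerate_svs` helper, and the toy sequences are all hypothetical; the pipeline's actual naming may differ.

```python
def enumerate_svs(weights, prefix="sv"):
    """Name sequences by descending total weight: sv-0001, sv-0002, ...

    `weights` maps sequence -> total weight. Ties are broken by sequence
    so the naming is deterministic. (Hypothetical scheme, for illustration.)
    """
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return {f"{prefix}-{i:04d}": seq for i, (seq, _) in enumerate(ranked, start=1)}


# Toy data: each list is ranked on its own weights.
r1 = {"ACGT": 10, "TTGA": 7}
merged = {"TTGACC": 12, "ACGTAA": 3}

r1_names = enumerate_svs(r1)
merged_names = enumerate_svs(merged)

# The same name points at unrelated sequences in the two lists.
for name in sorted(r1_names):
    print(name, r1_names[name], merged_names[name])
```

In this toy case `sv-0001` is `ACGT` in the R1 list but `TTGACC` in the merged list, so matching on names alone would pair unrelated sequences.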
I think the seqtab.csv files have the original seqnames??
Ah gotcha, so like dada2-nf/dada/{sampleid}/{orientation}/seqtab*.csv? I was looking in dada2-nf/R1/sv_table.csv etc.
Not sure that'll do the trick either, all I see in the seqtab headers are:
sampleid, weight, seq
Might be easiest to lean on the bin/get_unmerged.R script within this pipeline after all...
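For reference, a rough sketch of what can be recovered from a seqtab file given only the sampleid, weight, and seq columns mentioned above. The `seqs_by_sample` helper, the example rows, and the assumption that weight is an integer count are all hypothetical.

```python
import csv
import io
from collections import defaultdict


def seqs_by_sample(handle):
    """Group seqtab rows into {sampleid: {seq: weight}}.

    Assumes a header row of sampleid, weight, seq and integer weights.
    """
    per_sample = defaultdict(dict)
    for row in csv.DictReader(handle):
        per_sample[row["sampleid"]][row["seq"]] = int(row["weight"])
    return dict(per_sample)


# Made-up rows standing in for a real seqtab*.csv file.
example = io.StringIO(
    "sampleid,weight,seq\n"
    "s1,10,ACGT\n"
    "s1,7,TTGA\n"
    "s2,4,ACGT\n"
)
table = seqs_by_sample(example)
print(table)
```

The only usable key here is the sequence itself; there is no read or SV name to carry over to the merged table, which is the limitation noted above.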
Looks good to me (no need to change the workflow logic - nice). Tag forthcoming? /cc @nhoffman
@crosenth @nhoffman I don't recall if we'd made a specific decision to drop the use of https://github.com/nhoffman/dada2-nf/blob/master/bin/get_unmerged.R as a workflow step (possibly we just overlooked it since merging wasn't a focus of ITS work?), but it's come up in the context of https://gitlab.labmed.uw.edu/molmicro/NGS16S/-/issues/342#note_113382. Is there any reason not to just add a step following dada2_dada.R to consume the dada.rds output and generate per-sample unmerged_F/R.fasta files?