nhoffman / dada2-nf

A Nextflow pipeline for processing 16S rRNA sequences using dada2

Column name difference in counts.csv following upgrade in dada2 R dependencies #40

Closed by dhoogest 2 years ago

dhoogest commented 2 years ago

Previously, the ljoin.R script in the join_counts pipeline step wrote the headers in counts.csv as shown below (note the 16s_f and 16s_r columns):

(ngs16s_dada2nf-env) dhoogest@gattaca:/molmicro/working/dhoogest/src/ngs16s_dada2nf$ nextflow run https://github.com/nhoffman/dada2-nf -r 4cfe35ab28bd1cbcb68275e5929f3473367602a4 -params-file dada2-nf/test-params.json -profile singularity
N E X T F L O W  ~  version 21.10.6
Launching `nhoffman/dada2-nf` [modest_bose] - revision: 4cfe35ab28bd1cbcb68275e5929f3473367602a4
[05/7b8680] process > copy_filelist         [100%] 1 of 1, cached: 1 ✔
[a7/6f0fde] process > read_manifest (1)     [100%] 1 of 1, cached: 1 ✔
[bd/2f1ef0] process > plot_quality (1)      [100%] 6 of 6, cached: 6 ✔
[d8/22f106] process > barcodecop_single (1) [100%] 6 of 6, cached: 6 ✔
[ab/39a8f3] process > bcop_counts_concat    [100%] 1 of 1, cached: 1 ✔
[29/bc087b] process > filter_and_trim (4)   [100%] 6 of 6, cached: 6 ✔
[5c/d0e6e4] process > learn_errors (1)      [100%] 3 of 3, cached: 3 ✔
[2e/a16a2a] process > dada_dereplicate (4)  [100%] 6 of 6, cached: 6 ✔
[66/3b410f] process > combined_overlaps     [100%] 1 of 1, cached: 1 ✔
[cd/0df940] process > dada_counts_concat    [100%] 1 of 1, cached: 1 ✔
[65/41a8d8] process > write_seqs            [100%] 1 of 1, cached: 1 ✔
[3a/ff8782] process > cmsearch              [100%] 1 of 1, cached: 1 ✔
[4d/1057ca] process > filter_svs            [100%] 1 of 1, cached: 1 ✔
[92/0e2b9b] process > join_counts           [100%] 1 of 1, cached: 1 ✔
[4d/704adf] process > save_params           [100%] 1 of 1, cached: 1 ✔

(ngs16s_dada2nf-env) dhoogest@gattaca:/molmicro/working/dhoogest/src/ngs16s_dada2nf$ head -n 1 output/dada2-nf/counts.csv
"sampleid","raw","barcodecop","filtered_and_trimmed","denoised_r1","denoised_r2","merged","no_chimeras","16s_f","16s_r","not_16s"

In current master (1.15), the 16S column names are now prefixed with an 'X':

(ngs16s_dada2nf-env) dhoogest@gattaca:/molmicro/working/dhoogest/src/ngs16s_dada2nf$ nextflow run https://github.com/nhoffman/dada2-nf -r master -params-file dada2-nf/test-params.json -profile singularity
N E X T F L O W  ~  version 21.10.6
Launching `nhoffman/dada2-nf` [ridiculous_baekeland] - revision: 29829effcf [master]
[9a/50c19e] process > copy_filelist         [100%] 1 of 1, cached: 1 ✔
[15/3c1889] process > read_manifest (1)     [100%] 1 of 1, cached: 1 ✔
[93/715ca0] process > plot_quality (1)      [100%] 6 of 6, cached: 6 ✔
[fa/b6d8bd] process > barcodecop_single (4) [100%] 6 of 6, cached: 6 ✔
[a9/a5d7b5] process > bcop_counts_concat    [100%] 1 of 1, cached: 1 ✔
[eb/68cee5] process > filter_and_trim (4)   [100%] 6 of 6, cached: 6 ✔
[69/63ce01] process > learn_errors (2)      [100%] 3 of 3, cached: 3 ✔
[b3/d3ca27] process > dada_dereplicate (4)  [100%] 6 of 6, cached: 6 ✔
[92/dfd9d2] process > dada_get_unmerged (3) [100%] 6 of 6, cached: 6 ✔
[47/b9f423] process > combined_overlaps     [100%] 1 of 1, cached: 1 ✔
[9d/80b07c] process > dada_counts_concat    [100%] 1 of 1, cached: 1 ✔
[27/ba3e0f] process > write_seqs            [100%] 1 of 1, cached: 1 ✔
[e4/18485f] process > cmsearch              [100%] 1 of 1, cached: 1 ✔
[b1/32a36b] process > filter_svs            [100%] 1 of 1, cached: 1 ✔
[ba/043c44] process > join_counts           [100%] 1 of 1, cached: 1 ✔
[54/0d9ba6] process > save_params           [100%] 1 of 1, cached: 1 ✔

(ngs16s_dada2nf-env) dhoogest@gattaca:/molmicro/working/dhoogest/src/ngs16s_dada2nf$ head -n 1 output/dada2-nf/counts.csv
"sampleid","raw","barcodecop","filtered_and_trimmed","denoised_r1","denoised_r2","merged","no_chimeras","X16s_f","X16s_r","not_16s"

There are a few ways around this, but perhaps the simplest would be to set check.names=FALSE in the read.csv call?
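
For reference, here is a minimal sketch of the renaming in base R. It is not the actual code in ljoin.R: the two-line table and the variable names are made up for illustration. With the default check.names=TRUE, read.csv passes the header through make.names(), which prepends "X" to any name starting with a digit. With check.names=FALSE the header is kept verbatim.

```r
# Hypothetical excerpt of a counts table; the values are invented.
csv_text <- paste(
  '"sampleid","no_chimeras","16s_f","16s_r","not_16s"',
  '"s1",100,40,45,15',
  sep = "\n"
)

# Default behavior (check.names = TRUE): column names are made
# syntactically valid, so names starting with a digit get an "X" prefix.
mangled <- read.csv(text = csv_text)
print(names(mangled))
# "sampleid" "no_chimeras" "X16s_f" "X16s_r" "not_16s"

# With check.names = FALSE the header is preserved as written.
verbatim <- read.csv(text = csv_text, check.names = FALSE)
print(names(verbatim))
# "sampleid" "no_chimeras" "16s_f" "16s_r" "not_16s"
```

One caveat with the check.names=FALSE route: columns such as 16s_f are then not syntactically valid R names, so any later code has to refer to them with backticks or by string, e.g. verbatim[["16s_f"]].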

/cc @mwohl

dhoogest commented 2 years ago

Yep, this works: https://github.com/nhoffman/dada2-nf/pull/41