benjjneb / dada2

Accurate sample inference from amplicon data with single nucleotide resolution
http://benjjneb.github.io/dada2/
GNU Lesser General Public License v3.0

dada2:::removePrimers function does not work for lists of files. Error output #580

Closed: pneumowidow closed this issue 6 years ago

pneumowidow commented 6 years ago

Hello again,

I've been trying to reduce the computing time for my DADA2 + PacBio analyses by running my commands on all fastq files at once. I followed the step-by-step tutorial on the "DADA2 + PacBio: ZymoBIOMICS Microbial Community Standard" HTML page. However, instead of supplying one fastq file at a time, I created a list of my files and synced it to the path. Below are screenshots of my commands and of the printed locations of my input reads (fn) and output reads (nop), which sync with the input:

image

image

Unfortunately, I keep getting the error message below. I already looked through your master file for this error message in the code, but could not find it. I assume there's no way this pipeline only works for one file at a time, because I have 120 fastq files and wouldn't want to write a dada2:::removePrimers command for each of them. Can you help me, please? Maybe there's something I'm missing in my command.

image

Many thanks!!!

benjjneb commented 6 years ago

Yes, what you are running into is that we haven't "vectorized" this command yet (in the R sense) so it only works on one file at a time.

There's an easy fix, though. Just wrap it in a loop:

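# fn and nop hold the input and output paths, in matching order;
# "..." stands for the remaining removePrimers arguments.
for(i in seq_along(fn)) {
   dada2:::removePrimers(fn[[i]], nop[[i]], ...)
}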

Edit: Adding an enhancement label because we should vectorize this command like filterAndTrim.
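
For anyone who wants to reuse the loop workaround, here is a rough sketch of one way to wrap it in a small helper. The helper checks that the input and output path vectors line up and collects each call's return value. The names process_pairs, process_one and fake_remove_primers are hypothetical and not part of dada2. The stub only stands in for the real single-file function so that the sketch runs without the package installed.

```r
# Hypothetical helper (not part of dada2): call a one-file-at-a-time
# function once per input/output pair and collect the return values.
process_pairs <- function(inputs, outputs, process_one, ...) {
  # Fail early if the two path vectors do not line up one-to-one.
  stopifnot(length(inputs) == length(outputs))
  # lapply keeps one slot per file even if a call returns NULL.
  results <- lapply(seq_along(inputs), function(i) {
    process_one(inputs[[i]], outputs[[i]], ...)
  })
  # Label each result with its input file name for easier inspection.
  names(results) <- basename(as.character(inputs))
  results
}

# Stand-in for the real single-file function (an assumption for this demo).
fake_remove_primers <- function(infile, outfile, ...) {
  paste(infile, "->", outfile)
}

demo <- process_pairs(c("raw/a.fastq.gz", "raw/b.fastq.gz"),
                      c("nop/a.fastq.gz", "nop/b.fastq.gz"),
                      fake_remove_primers)
```

With dada2 installed, the stub would be swapped for dada2:::removePrimers, and the primer arguments from the single-file call would be passed after it in place of the "...".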

pneumowidow commented 6 years ago

Thank you for the quick response. Yes, I assumed a for loop would work, but thought you guys had already "vectorized" it, as you say, and didn't want to go through the bother of doing it. Anyway, the loop works and I appreciate it!