benjjneb / dada2

Accurate sample inference from amplicon data with single nucleotide resolution
http://benjjneb.github.io/dada2/
GNU Lesser General Public License v3.0

Can't run the filtering and trimming step #247

Closed SaiPrabhas closed 7 years ago

SaiPrabhas commented 7 years ago

I want to try the DADA2 pipeline for my data analysis, but I got the following error:

"Error in mcmapply(fastqPairedFilter, mapply(c, fwd, rev, SIMPLIFY = FALSE), : 'mc.cores' > 1 is not supported on Windows"

Can anyone please help me resolve this issue? Thanks in advance.

benjjneb commented 7 years ago

Thanks for the report. Can you help us pinpoint the issue by providing the exact command you ran to get this error? And also the output from packageVersion("dada2") and R.version?

SaiPrabhas commented 7 years ago

out <- filterAndTrim(fnFs, filtFs, fnRs, filtRs, truncLen=c(120,150), maxN=0, maxEE=c(5,2), truncQ=2, rm.phix=TRUE, compress=TRUE, multithread=TRUE)

The code did execute when I changed to multithread=FALSE. packageVersion("dada2") is '1.5.0' and the R version is 3.4.0. However, I am getting a new warning message when I run errF <- learnErrors(filtFs, multithread=TRUE).

It says: In dada(drps, err = NULL, selfconsist = TRUE, multithread = multithread) : Self-consistency loop terminated before convergence. It ran until selfConsist step 10. Will this affect my analysis? Thanks in advance.

benjjneb commented 7 years ago

That warning means that the error model didn't completely converge prior to the loop terminating. In every case I've ever seen you can safely ignore the warning -- even though it didn't completely converge the error model should be close enough to convergence to be very accurate.

If you want to be completely sure, you can inspect the output of:

foo <- learnErrors(dadaFs) # Replace with the exact command that gave this warning
dada2:::checkConvergence(foo)

That will give you a sequence of numbers: the sum of the absolute differences between the matrices of estimated error rates, one value for each self-consistency step.

The final numbers should be much much (several orders of magnitude) lower than the first numbers. As long as that is the case, just go ahead and use the results and ignore the warning.
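As a rough sketch of that comparison, assuming the convergence check gives back a plain numeric vector: the values below are made up for illustration, and the threshold of 2 orders of magnitude is only one reading of "several".

```r
# Hypothetical convergence values (made up); in practice these would come
# from dada2:::checkConvergence(foo).
conv <- c(15.2, 3.1, 0.4, 0.02, 0.003, 0.0006)

# Orders of magnitude between the first and the last value.
drop_orders <- log10(conv[1] / conv[length(conv)])
print(drop_orders)

# Illustrative threshold (an assumption, not a dada2 rule).
converged_enough <- drop_orders >= 2
print(converged_enough)
```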

fanli-gcb commented 7 years ago

This blog post might help with your original problem.
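For the original 'mc.cores' > 1 error, one possible workaround is to choose the multithread value from the operating system instead of hard-coding it. A minimal sketch, assuming (as the error message says) that this kind of multithreading is unavailable on Windows; the helper name pick_multithread is hypothetical and not part of dada2.

```r
# Hypothetical helper (not part of dada2): returns FALSE on Windows,
# where the error above reports that mc.cores > 1 is not supported.
pick_multithread <- function(os_type = .Platform$OS.type) {
  os_type != "windows"
}

use_mt <- pick_multithread()
print(use_mt)

# Then pass it on, e.g. with the call from this thread:
# out <- filterAndTrim(fnFs, filtFs, fnRs, filtRs, truncLen=c(120,150),
#                      maxN=0, maxEE=c(5,2), truncQ=2, rm.phix=TRUE,
#                      compress=TRUE, multithread=use_mt)
```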

SaiPrabhas commented 7 years ago

Hello, I have walked through all the steps in the tutorial and created a sequence table, but in the end, when I tried to save the file, I got the following error.

saveRDS(seqtab, "C:/Users/Bioinfo/Desktop/Dada2")
Error in gzfile(file, mode) : cannot open the connection
In addition: Warning message:
In gzfile(file, mode) :
  cannot open compressed file 'C:/Users/Bioinfo/Desktop/Dada2', probable reason 'Permission denied'

I have checked the permissions of the folder and everything looks OK. It gives the same error even if I try to save it in a different location. Please help me in this regard. Thanks in advance.

benjjneb commented 7 years ago

Fix by specifying the file variable name: saveRDS(seqtab, file="C:/Users/Bioinfo/Desktop/seqtab.rds")

Easy to forget, I did on another issue!

Edit: Oh, and you should specify the filename as something like seqtab.rds, rather than just Dada2

SaiPrabhas commented 7 years ago

I have tried that just now but it is still returning the same error.

saveRDS(seqtab, file = "C:/Users/Bioinfo/Desktop/Dada2")
Error in gzfile(file, mode) : cannot open the connection
In addition: Warning message:
In gzfile(file, mode) :
  cannot open compressed file 'C:/Users/Bioinfo/Desktop/Dada2', probable reason 'Permission denied'

benjjneb commented 7 years ago

Well it could also be a permissions issue. Do you have permissions to write to the desktop? You may want to try another folder that you know has write permissions.

SaiPrabhas commented 7 years ago

I have checked the folder and it shows that the user has full control. To be sure, I checked with multiple other folders but every time it is returning the same error.

fanli-gcb commented 7 years ago

Do forward/reverse slashes matter on Windows?

benjjneb commented 7 years ago

Are you saving to the directory name alone?

If so, you need to specify the full filename, e.g. file="path/to/DADA2/output.rds"
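To illustrate the directory-versus-file point, here is a small sketch using only base R. The toy seqtab matrix and the temporary folder are stand-ins for illustration, not the real data or paths from this thread.

```r
# Toy stand-in for the real sequence table (assumption for illustration).
seqtab <- matrix(1:4, nrow = 2,
                 dimnames = list(c("sampleA", "sampleB"), c("ACGT", "TTGA")))

# An existing folder, playing the role of Desktop/Dada2.
out_dir <- file.path(tempdir(), "Dada2")
dir.create(out_dir, showWarnings = FALSE)

# Passing the folder itself to saveRDS() is the mistake; check for it first.
target_is_directory <- dir.exists(out_dir)
print(target_is_directory)

# Append a file name so the target is a file inside the folder.
out_file <- file.path(out_dir, "seqtab.rds")
saveRDS(seqtab, file = out_file)
print(file.exists(out_file))
```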

Edit: Also capitalization counts on Windows I think.

SaiPrabhas commented 7 years ago

Thank you sir, that worked, but I have another question. The file did get saved, but when I opened it I could not make any sense of it. Is there a specific program to open the rds file? Sorry for the trivial questions, but I am a master's student and I am trying to standardize DADA2 for our lab. Thank you for your patience.

fanli-gcb commented 7 years ago

https://stat.ethz.ch/R-manual/R-devel/library/base/html/readRDS.html

The files saved/loaded by the saveRDS and readRDS functions are not meant to be human-readable.
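A quick sketch of reading such a file back into R and looking at it there, with a made-up matrix standing in for the real seqtab and a temporary file standing in for the real path:

```r
# Made-up sequence table: rows are samples, columns are sequences.
seqtab <- matrix(c(10L, 0L, 3L, 7L), nrow = 2,
                 dimnames = list(c("sampleA", "sampleB"), c("ACGT", "TTGA")))
rds_path <- tempfile(fileext = ".rds")
saveRDS(seqtab, file = rds_path)

# readRDS() returns the stored object, which can then be inspected in R.
restored <- readRDS(rds_path)
print(dim(restored))
print(restored)
```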

SaiPrabhas commented 7 years ago

So how can I look at the taxonomic assignment of the sequence table?

SaiPrabhas commented 7 years ago

Is there a way in which I can look at the list of microbial community identified by DADA2?

benjjneb commented 7 years ago

The sequence table does not have taxonomy information in it, it has the exact sequences instead.

Did you use assignTaxonomy on the sequence table?

If so, you can inspect that object in R:

tax <- assignTaxonomy(seqtab, "path/to/training-fasta.fa.gz", multithread=TRUE)
unname(tax)
unname(head(tax))

R is an interactive analysis platform, and you can do all the inspecting and analysis of DADA2's results within R by using basic R commands and installable packages.

If you want to look at the data in other programs, you need to identify what kind of file format they require. It is relatively straightforward to save data in formats readable by spreadsheet software, or by other packages that handle biom files, but you need to know what exactly it is you want.
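For the spreadsheet case, one option is a CSV file written with base R. This is only a sketch: the tax matrix below is made up, and its layout (sequences as row names, ranks as columns) is an assumption about what your own taxonomy object looks like.

```r
# Made-up taxonomy matrix: rows are sequences, columns are taxonomic ranks.
tax <- matrix(c("Bacteria", "Bacteria", "Firmicutes", "Proteobacteria"),
              nrow = 2,
              dimnames = list(c("ACGT", "TTGA"), c("Kingdom", "Phylum")))

# Write a CSV that spreadsheet software can open; row names hold the sequences.
csv_path <- file.path(tempdir(), "taxonomy.csv")
write.csv(tax, file = csv_path)

# Read it back to confirm what was written.
tax_back <- read.csv(csv_path, row.names = 1)
print(tax_back)
```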

SaiPrabhas commented 7 years ago

Yes, I did use assignTaxonomy and I did get the list of bacterial species. My main concern is this: once I leave R and later want to continue my analysis from the sequence table and see the taxonomic assignments, how can I do that?

benjjneb commented 7 years ago

It's difficult to help you directly here because it's not clear what you are trying to do. If you just want to save everything, so you can come back to exactly what you have later, I recommend the following:

save.image(file="path/to/my_dada2_analysis.Rdata")
### Leave R, Restart R later
load(file="path/to/my_dada2_analysis.Rdata")

When you restart R, and load the data back in, you'll be right back where you are now, with all the results of your dada2-processing intact.
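If you would rather keep only a few objects than the whole workspace, a variation is save() with named objects. A sketch with toy stand-ins; the object contents and the temporary file location are placeholders.

```r
# Toy stand-ins for the real results (assumption for illustration).
seqtab <- matrix(1:4, nrow = 2)
tax <- matrix(c("Bacteria", "Bacteria"), nrow = 2)

# Save just these two objects instead of the whole workspace.
rdata_path <- file.path(tempdir(), "my_dada2_analysis.Rdata")
save(seqtab, tax, file = rdata_path)

# Simulate a fresh session by loading into an empty environment.
fresh <- new.env()
loaded_names <- load(rdata_path, envir = fresh)
print(loaded_names)
```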