biocore / deblur

Deblur is a greedy deconvolution algorithm based on known read error profiles.
BSD 3-Clause "New" or "Revised" License

Is there an equivalent deblur-stats output in the stand alone version? #204

Closed AnalissaFSarno closed 4 years ago

AnalissaFSarno commented 4 years ago

Hi,

I'm new to deblur and am trying to troubleshoot a few aspects from my 2x300 MiSeq 16S (515F,909R) data set. I saw on the Qiime2 forum that there is a deblur-stats output file that allows you to track your reads through the pipeline more comprehensively. (https://forum.qiime2.org/t/reopened-explanation-of-deblur-stats-output/11141)

I was wondering if there is an equivalent stats output file for the stand alone version? I am currently using the stand alone version because my input file is a demultiplexed fasta file, which, to my understanding, is not supported in the Q2 deblur plugin.

Thank you for any insight, Analissa

wasade commented 4 years ago

Hi Analissa,

There is not a standalone version of the stats. I'm not aware of a reason why q2-deblur would not support the use case described -- I recommend searching the Q2 forum, and opening a thread there if needed, if you're having an issue.

Best, Daniel

amnona commented 4 years ago

Hi Analissa, in addition to what Daniel wrote, you can see the stats for each sample processed in standalone deblur in the deblur.log output file (specified by the --log-file parameter). It contains the output summary of each stage of deblur for each sample. You can get even more info there by setting the log level (using the --log-level parameter) to 1.

Amnon

AnalissaFSarno commented 4 years ago

Hi Amnon,

This is very helpful, thank you. Is there some way to convert the deblur.log file into an easily readable table separated by sample? Maybe there is an option parameter for the log file that I am missing.

I also wanted to confirm my understanding was correct, that currently the Q2 plugin only accepts forward reads in Fastq file format? I am asking because I have paired end reads in fasta format.

Thank you for your time, Analissa

amnona commented 4 years ago

Hi Analissa, there is currently no script to convert the log file to a table. But it sounds like a good idea to have such a script :) (currently I'm overloaded with other projects, but if you are interested in attempting to write such a script, I will be happy to help). As a quick hack, you can always grep for your sample of interest, then grep for the jobid assigned to it, and see all the relevant output for this sample.
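The two-step grep described above could be sketched in Python roughly as follows. This is only an illustration, not part of deblur: the function name `lines_for_sample` is hypothetical, and the exact layout of deblur.log is not specified in this thread, so the regular expression that captures the job id must be supplied by the caller after looking at a real log file.

```python
import re
from typing import Iterable, List


def lines_for_sample(log_lines: Iterable[str], sample: str,
                     jobid_regex: str) -> List[str]:
    """Return every log line that shares a job id with lines naming `sample`.

    `jobid_regex` is an ASSUMPTION about the log layout: it must contain one
    capture group that extracts the job id from a log line. Check your own
    deblur.log to decide what that pattern should be.
    """
    lines = list(log_lines)
    pattern = re.compile(jobid_regex)

    # Pass 1 ("grep for the sample"): collect the job ids on matching lines.
    job_ids = set()
    for line in lines:
        if sample in line:
            match = pattern.search(line)
            if match:
                job_ids.add(match.group(1))

    # Pass 2 ("grep for the job id"): keep all lines carrying one of those ids.
    selected = []
    for line in lines:
        match = pattern.search(line)
        if match and match.group(1) in job_ids:
            selected.append(line)
    return selected
```

To use it, read the log with `open("deblur.log").read().splitlines()` and pass the sample name together with a pattern that matches how the job id appears in your file. Note that the sample match is a plain substring test, so a name that is a prefix of another sample name will match both.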

Both the Q2 and the standalone deblur versions are best used on forward reads, since the error probability is much lower on the forward reads than on the reverse reads. Therefore I would recommend using only the forward reads. If you need the reverse reads as well, you can either merge the reads and run deblur (but note that all merged reads will be trimmed to the same length, which could be problematic), or alternatively just concatenate each forward read with its corresponding reverse read (without merging, i.e. if each read is 150bp, you get 150+150=300bp concatenated reads), run deblur on the concatenated reads, and then split each result back into the 2 reads and merge. But again, I recommend using only the forward reads, as the price you pay for the reverse reads' increased error rate is usually not worth the additional phylogenetic resolution you gain.
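The concatenate-then-split idea can be illustrated with a minimal Python sketch. The helper names `concatenate_pairs` and `split_concatenated` are hypothetical and not part of deblur. The sketch assumes the reads are already loaded into dictionaries keyed by read ID, and it trims each read to a fixed length so that the split point is known afterwards. Reading and writing FASTA files, running deblur, and the final merge are left out.

```python
from typing import Dict, Tuple


def concatenate_pairs(forward: Dict[str, str], reverse: Dict[str, str],
                      fwd_len: int, rev_len: int) -> Dict[str, str]:
    """Join each forward read with its reverse read, without merging.

    Reads are trimmed to fixed lengths (an assumption of this sketch) so
    every concatenated read is exactly fwd_len + rev_len long. Pairs with a
    missing mate or a read shorter than the trim length are skipped.
    """
    joined = {}
    for read_id, fwd in forward.items():
        rev = reverse.get(read_id)
        if rev is None or len(fwd) < fwd_len or len(rev) < rev_len:
            continue
        joined[read_id] = fwd[:fwd_len] + rev[:rev_len]
    return joined


def split_concatenated(seq: str, fwd_len: int) -> Tuple[str, str]:
    """Split a concatenated sequence back into its forward and reverse parts."""
    return seq[:fwd_len], seq[fwd_len:]
```

With 150bp forward and reverse reads, `concatenate_pairs(fwd, rev, 150, 150)` yields 300bp sequences, and `split_concatenated(seq, 150)` recovers the two halves of each sequence deblur outputs.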

Hope this helps, Amnon