Closed by kalavattam 1 year ago
Thank you for your interest in Atria.
The *.log.json file has the counts of "good-read-pairs" and "total-read-pairs".
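As a rough illustration, the two counts could be pulled out of a log with a few lines of Python. This is only a sketch: the key names "good-read-pairs" and "total-read-pairs" come from the description above, but where they sit inside the JSON is not stated here, so the helper searches the whole structure. The function names find_key and pair_counts are made up for this example.

```python
import json


def find_key(obj, key):
    """Yield every value stored under `key` anywhere in a nested JSON structure."""
    if isinstance(obj, dict):
        for k, v in obj.items():
            if k == key:
                yield v
            yield from find_key(v, key)
    elif isinstance(obj, list):
        for item in obj:
            yield from find_key(item, key)


def pair_counts(log):
    """Return (good, total) read-pair counts, or None for a key that is absent.

    Assumption: the first occurrence of each key is the one of interest.
    """
    good = next(find_key(log, "good-read-pairs"), None)
    total = next(find_key(log, "total-read-pairs"), None)
    return good, total


def load_log(path):
    """Read one *.log.json file into a Python object."""
    with open(path) as handle:
        return json.load(handle)
```

For a real run, something like `good, total = pair_counts(load_log("sample.log.json"))` (the file name is a placeholder) gives the two numbers, and `total - good` is then the number of read pairs that were not counted as good.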
Currently, Atria does not output a detailed stats summary. However, --stats writes some per-read metrics to the description lines of the Read2 output (the third line of each fastq record, which starts with +):
Each cell is delimited by '\t'
Res
$r12_trim # is adapter trimmed (true/false)
$(length(r1.seq)) # length of r1 after adapter trimming. If the length is different from the output, quality trimming is performed.
$(length(r2.seq)) # length of r2 after adapter trimming. If the length is different from the output, quality trimming is performed.
|R1 # the following stats are for development only.
$r1_insert_size
$r1_adapter_score
$r1_insert_size_pe
$r1_pe_score
|R2
$r2_insert_size
$r2_adapter_score
$r2_insert_size_pe
$r2_pe_score
|prob
$r1_adapter_prob
$r2_adapter_prob
$r1_pe_prob
$r2_pe_prob
$r1_head_prob
$r2_head_prob
A small script would be useful to process the data.
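A minimal sketch of such a script in Python is shown here. It assumes that the + line of each Read2 record contains a tab-delimited cell that is exactly Res, followed by the adapter-trimmed flag and the two post-adapter-trimming lengths in the order listed above; the exact layout of a real line should be checked before relying on it. The function names parse_stats_line and summarize are hypothetical.

```python
from collections import Counter


def parse_stats_line(plus_line):
    """Return (adapter_trimmed, r1_len, r2_len) from a '+' line, or None.

    Assumption: the cells are tab-delimited and one cell is exactly "Res",
    followed by true/false, the R1 length, and the R2 length.
    """
    cells = [c.strip() for c in plus_line.rstrip("\n").lstrip("+").split("\t")]
    if "Res" not in cells:
        return None
    i = cells.index("Res")
    if len(cells) < i + 4:
        return None
    try:
        return cells[i + 1].lower() == "true", int(cells[i + 2]), int(cells[i + 3])
    except ValueError:
        return None


def summarize(lines):
    """Count adapter- and quality-trimmed reads in a Read2 fastq written with --stats."""
    counts = Counter()
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # skip blank lines, e.g. at the end of the file
        seq = next(it, "").strip()
        plus = next(it, "")
        next(it, "")  # quality line, not needed here
        counts["pairs"] += 1
        stats = parse_stats_line(plus)
        if stats is None:
            counts["unparsed"] += 1
            continue
        adapter_trimmed, _r1_len, r2_len = stats
        if adapter_trimmed:
            counts["adapter_trimmed"] += 1
        # Per the note above: a length that differs from the output length
        # means quality trimming was performed.
        if len(seq) != r2_len:
            counts["r2_quality_trimmed"] += 1
    return counts
```

To use it, open the trimmed Read2 file (with gzip.open(path, "rt") if it is compressed) and pass the file handle to summarize; dividing counts["adapter_trimmed"] by counts["pairs"] then gives the fraction of pairs flagged as adapter trimmed. Only the R2 length is compared because only the Read2 sequences are available in that file.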
Another option is to use fastqc and multiqc to analyze the raw and trimmed fastqs and find the difference.
Thank you for the quick response. Following your advice and suggestions, I can take some measurements of the adapter and quality processing. Thanks for making and maintaining this great tool. Will close the issue now.
Hi, thank you for the very useful and fast-performing tool. I am running it now and examining the output; I am confused as to where I can find metrics on the trimming and processing of the reads, for example, the numbers/percentages of reads trimmed, etc. This information is not in the *.log and *.log.json files. I am running the tool with non-simulated, "real" fastq files from different NGS experiments.

I invoke atria like this:

However, do I need to include the argument --stats to see this information? For example,

The documentation for --stats is confusing:

Reading this, it's not clear to me that --stats will give me metrics regarding the numbers/percentages of reads subjected to trimming, quality processing, etc.

In the program, I see utilities for benchmarking the tool with simulated reads, but I need metrics for what the tool is doing to my real data.
Thanks,
Kris