benjjneb / dada2

Accurate sample inference from amplicon data with single nucleotide resolution
http://benjjneb.github.io/dada2/
GNU Lesser General Public License v3.0

quality filter: imbalance forward vs reverse reads - dada2- #1948

Closed by valengirardi 1 month ago

valengirardi commented 1 month ago

Hello! I'm performing quality filtering on my sequences due to low quality in the reverse reads. However, I've encountered an imbalance where I have fewer reverse reads compared to forward reads, making it impossible to proceed with DADA2. Can you advise me on how to address this issue?

This is the filtering code:

    for file in "${INPUT_DIR}"/*.fastq; do
        base_name=$(basename "${file}" .fastq)
        bbduk.sh in="${file}" out="${OUTPUT_DIR}/${base_name}.fastq" qtrim=rl trimq=10
        echo "Archivo ${file} procesado."
    done

Additionally, I've attempted to apply separate quality filters for the forward and reverse reads, but I'm unable to achieve a balanced number of sequences in both:

    for file in "${INPUT_DIR}"/*.fastq; do
        base_name=$(basename "${file}" .fastq)
        if [[ $base_name == "_2"*.fastq ]]; then
            bbduk.sh in="${file}" out="${OUTPUT_DIR}/${base_name}.fastq" qtrim=rl trimq=10
            echo "Archivo ${file} procesado."
        else
            bbduk.sh in="${file}" out="${OUTPUT_DIR}/${base_name}.fastq" qtrim=rl trimq=15
            echo "Archivo ${file} procesado."
        fi
    done

benjjneb commented 1 month ago

I'd recommend using filterAndTrim to jointly filter your forward and reverse reads (i.e. each read pair is evaluated together, and then either filtered out or kept as a pair). This will maintain the matching between forward and reverse reads that is lost when you filter those files separately. The dada2 tutorial shows an example of this for Illumina paired-end data.
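To confirm whether separate filtering has broken the pairing, one option is to compare read counts file by file. The following is a minimal sketch, not part of dada2 or bbduk: the helper names `count_reads` and `check_pair` are hypothetical, it assumes uncompressed four-line FASTQ records, and the `_1`/`_2` file suffixes are an assumed naming scheme. Equal counts are only a necessary condition; they do not prove the reads are in the same order.

```shell
#!/usr/bin/env bash

# Count reads in an uncompressed FASTQ file (assumes 4 lines per record).
count_reads() {
    local lines
    lines=$(wc -l < "$1")
    echo $(( lines / 4 ))
}

# Compare the read counts of one forward/reverse pair.
# Prints OK or MISMATCH and returns non-zero on a mismatch.
check_pair() {
    local fwd="$1" rev="$2"
    local n_fwd n_rev
    n_fwd=$(count_reads "$fwd")
    n_rev=$(count_reads "$rev")
    if [[ "$n_fwd" -eq "$n_rev" ]]; then
        echo "OK $fwd ($n_fwd) $rev ($n_rev)"
    else
        echo "MISMATCH $fwd ($n_fwd) $rev ($n_rev)"
        return 1
    fi
}

# Hypothetical layout: <sample>_1.fastq and <sample>_2.fastq in OUTPUT_DIR.
for fwd in "${OUTPUT_DIR:-.}"/*_1.fastq; do
    [[ -e "$fwd" ]] || continue            # glob matched nothing
    rev="${fwd%_1.fastq}_2.fastq"
    [[ -e "$rev" ]] || { echo "MISSING $rev"; continue; }
    check_pair "$fwd" "$rev" || true       # keep going after a mismatch
done
```

Any pair reported as MISMATCH has lost reads on one side only, which is the situation that joint filtering avoids.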

valengirardi commented 1 month ago

Hi! Thank you for your response. I usually run this kind of thing in the terminal, not in R. How can I adapt filterAndTrim to this situation? I have tried bbduk, cutadapt and Trimmomatic.

benjjneb commented 1 month ago

The dada2 tutorial gives a worked example of inspecting the quality profile and then running filterAndTrim on paired-end Illumina reads: https://benjjneb.github.io/dada2/tutorial.html

If you are using dada2 later, then just run the filterAndTrim part in the same way you would have run the later parts of the workflow.
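For anyone who prefers to stay in the terminal, the filterAndTrim step can be driven from a shell script through Rscript. This is a rough sketch under several assumptions: the directory names, the script name `filter_pairs.R`, and the `_1`/`_2` file suffixes are hypothetical, and the `truncLen` and `maxEE` values are simply the ones used in the tutorial, so they should be chosen from your own quality profiles.

```shell
#!/usr/bin/env bash

# Hypothetical locations; adjust to your own layout.
INPUT_DIR="${INPUT_DIR:-raw}"
OUTPUT_DIR="${OUTPUT_DIR:-filtered}"
R_SCRIPT="${R_SCRIPT:-filter_pairs.R}"

# Write a small R script. The quoted 'EOF' stops the shell from
# expanding anything inside the here-document.
cat > "$R_SCRIPT" <<'EOF'
library(dada2)

args    <- commandArgs(trailingOnly = TRUE)
in_dir  <- args[1]
out_dir <- args[2]

# Assumed naming: <sample>_1.fastq (forward), <sample>_2.fastq (reverse).
fnFs <- sort(list.files(in_dir, pattern = "_1\\.fastq$", full.names = TRUE))
fnRs <- sort(list.files(in_dir, pattern = "_2\\.fastq$", full.names = TRUE))
stopifnot(length(fnFs) == length(fnRs))

filtFs <- file.path(out_dir, basename(fnFs))
filtRs <- file.path(out_dir, basename(fnRs))

# Each read pair is kept or discarded as a unit, so the outputs stay matched.
# Parameter values are the tutorial's example values, not a recommendation.
# On Windows, set multithread = FALSE.
out <- filterAndTrim(fnFs, filtFs, fnRs, filtRs,
                     truncLen = c(240, 160), maxN = 0, maxEE = c(2, 2),
                     truncQ = 2, rm.phix = TRUE, compress = FALSE,
                     multithread = TRUE)
print(out)
EOF

# Run the script only if R and the input directory are both available.
if command -v Rscript >/dev/null 2>&1 && [[ -d "$INPUT_DIR" ]]; then
    Rscript "$R_SCRIPT" "$INPUT_DIR" "$OUTPUT_DIR"
else
    echo "Wrote $R_SCRIPT without running it (Rscript or $INPUT_DIR not found)."
fi
```

The printed table lists reads in and reads out for each sample; because pairs are filtered together, the forward and reverse output files for a sample contain the same number of reads.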