jpuritz / dDocent

a bash pipeline for RAD sequencing
ddocent.com
MIT License

Trimmomatic SE with PE adapters #23

Closed cfriedline closed 5 years ago

cfriedline commented 8 years ago

This might not be easily generalizable across all users, but I ran across it today while digging through the dDocent wrapper.

ADAPTERS appear to be set here:

if find ${PATH//:/ } -maxdepth 1 -name TruSeq2-PE.fa 2> /dev/null | grep -q 'Tru' ; then
    ADAPTERS=$(find ${PATH//:/ } -maxdepth 1 -name TruSeq2-PE.fa 2> /dev/null | head -1)
        ...

Then used here:

#Function for trimming reads using trimmomatic
TrimReads () { 
...
for i in "${NAMES[@]}"
do
    #echo "Trimming Sample $i"
    if [ -f $i.R.fq.gz ]; then
        java -jar $TRIMMOMATIC PE -threads $NUMProc -phred33 $i.F.fq.gz $i.R.fq.gz $i.R1.fq.gz $i.unpairedF.fq.gz $i.R2.fq.gz $i.unpairedR.fq.gz ILLUMINACLIP:$ADAPTERS:2:30:10 LEADING:20 TRAILING:20 SLIDINGWINDOW:5:10 $TW &> $i.trim.log
    else
        java -jar $TRIMMOMATIC SE -threads $NUMProc -phred33 $i.F.fq.gz $i.R1.fq.gz ILLUMINACLIP:$ADAPTERS:2:30:10 LEADING:20 TRAILING:20 SLIDINGWINDOW:5:10 $TW &> $i.trim.log
...
}

Shouldn't the SE call use the SE adapters for SE ddRAD, or at least offer some choice of adapter set (TruSeq2 vs. TruSeq3, for example)? Maybe this is more of a feature request than an issue, but I wanted to get it logged. ;-)
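
To make the suggestion concrete, here is a minimal sketch of what such a choice could look like. Both adapter_file_for and find_adapters are hypothetical names that do not exist in dDocent, and the sketch assumes the adapter files follow the TruSeq<version>-<SE|PE>.fa naming seen in the Trimmomatic distribution.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of dDocent: pick the adapter file that
# matches the read layout instead of always using TruSeq2-PE.fa.

# Print the adapter file name for a layout (SE or PE) and an optional
# TruSeq version (defaults to 2).
adapter_file_for () {
    local layout=$1 version=${2:-2}
    case "$layout" in
        SE|PE) printf 'TruSeq%s-%s.fa\n' "$version" "$layout" ;;
        *) echo "unknown layout: $layout" >&2; return 1 ;;
    esac
}

# Locate that file on PATH, mirroring the search the wrapper already does.
find_adapters () {
    local name
    name=$(adapter_file_for "$@") || return 1
    find ${PATH//:/ } -maxdepth 1 -name "$name" 2> /dev/null | head -1
}
```

Inside the trimming loop, the SE branch could then call something like ADAPTERS=$(find_adapters SE) before invoking Trimmomatic, while the PE branch keeps find_adapters PE.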

This is dDocent 2.24. I'm happy to come up with something, if you want a PR. Just let me know.

Thanks!

jpuritz commented 8 years ago

The PE adapter list contains the SE adapters, so that doesn't really matter. If people want to add custom adapters, I think I will leave that to editing the code directly. I'm in the process of reworking the documentation and I can add this in there.

cfriedline commented 8 years ago

Not all of them, though...

$ cat TruSeq2-SE.fa
>TruSeq2_SE
AGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG
>TruSeq2_PE_f
AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT
>TruSeq2_PE_r
AGATCGGAAGAGCGGTTCAGCAGGAATGCCGAG
$ grep AGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG Tru*.fa
TruSeq2-SE.fa:AGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG

$ grep AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT Tru*.fa
TruSeq2-PE.fa:AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT
TruSeq2-SE.fa:AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT
TruSeq3-PE-2.fa:AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA
TruSeq3-SE.fa:AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA

$ grep AGATCGGAAGAGCGGTTCAGCAGGAATGCCGAG Tru*.fa
TruSeq2-PE.fa:AGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTCGTATGCCGTCTTCTGCTTG
TruSeq2-SE.fa:AGATCGGAAGAGCGGTTCAGCAGGAATGCCGAG
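
The manual grep checks above can be wrapped into one loop. This is only a sketch: missing_adapters is a hypothetical helper, and it assumes each adapter sequence sits on a single line, as in the files shown here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list adapters from one FASTA file whose sequence
# does not occur (as a substring) anywhere in another FASTA file.
# Assumes one sequence per line. Header lines of the target file are
# searched too, which is harmless unless a header contains a full adapter.
# Usage: missing_adapters TruSeq2-SE.fa TruSeq2-PE.fa
missing_adapters () {
    local query=$1 target=$2 name="" line
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            '>'*) name=${line#>} ;;   # remember the current record name
            '')   ;;                  # skip blank lines
            *)    grep -qF -- "$line" "$target" || printf '%s\t%s\n' "$name" "$line" ;;
        esac
    done < "$query"
}
```

Going by the grep output above, running it as missing_adapters TruSeq2-SE.fa TruSeq2-PE.fa should report only the TruSeq2_SE record, since the other two SE sequences are prefixes of entries in the PE file.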

jpuritz commented 5 years ago

I think I can finally close this. Version 2.7 and up will use fastp for adapter trimming and will implement adaptive auto-detection of adapters. This should fix the issue permanently.
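
For readers who want a rough idea of what adapter auto-detection with fastp looks like on the command line, here is a minimal sketch. It is not dDocent's actual code: fastp_cmd is a hypothetical function, the file naming simply follows the TrimReads snippet quoted earlier in this thread, and the report file names are assumptions. The function only prints the command so it can be inspected before running.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper, not dDocent's actual code: print a fastp call for
# one sample that relies on fastp's adapter auto-detection.
# Usage: fastp_cmd <sample prefix> [threads, default 1]
fastp_cmd () {
    local i=$1 threads=${2:-1}
    if [ -f "$i.R.fq.gz" ]; then
        # Paired-end: fastp trims adapters by overlap analysis by default;
        # --detect_adapter_for_pe additionally enables sequence-based detection.
        echo fastp -i "$i.F.fq.gz" -I "$i.R.fq.gz" -o "$i.R1.fq.gz" -O "$i.R2.fq.gz" \
            --detect_adapter_for_pe -w "$threads" -j "$i.json" -h "$i.html"
    else
        # Single-end: fastp auto-detects adapters by default, no flag needed.
        echo fastp -i "$i.F.fq.gz" -o "$i.R1.fq.gz" -w "$threads" -j "$i.json" -h "$i.html"
    fi
}
```

Either way, no adapter FASTA file has to be chosen up front, which is what makes the SE versus PE adapter list question moot.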