DKFZ-ODCF / AlignmentAndQCWorkflows

The DKFZ alignment workflow plugin originally developed at the eilslabs
https://github.com/DKFZ-ODCF/AlignmentAndQCWorkflows/wiki

_de novo_ adapter detection #60

Closed (vinjana closed this issue 1 year ago)

vinjana commented 4 years ago

Currently, adapters are only trimmed if the adapter sequence is known beforehand. Implement de novo adapter detection with fastp.

Allow the following settings:
(a) don't trim
(b) trim with Trimmomatic (should remain the default when trimming, for backwards compatibility)
(c) trim with fastp without a known adapter
(d) trim with fastp including a known adapter
(e) only check whether the de novo (observed) adapter is compatible with the provided (expected) adapter
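For setting (e), the issue does not define what "compatible" means. The following is a minimal sketch under one possible assumption: two adapter sequences count as compatible if one is a case-insensitive prefix of the other (for example, when the detected adapter is a shortened form of the expected one). The function name `adaptersCompatible` is hypothetical and not part of the workflow.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the workflow.
# ASSUMPTION: "compatible" means one sequence is a prefix of the other,
# ignoring case. Empty sequences are never compatible.
adaptersCompatible() {
    local observed expected
    # Normalise both sequences to upper case before comparing.
    observed=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    expected=$(printf '%s' "$2" | tr '[:lower:]' '[:upper:]')
    [[ -n "$observed" && -n "$expected" ]] || return 1
    # The quoted part is matched literally; the trailing * allows any suffix.
    [[ "$observed" == "$expected"* || "$expected" == "$observed"* ]]
}
```

The function returns 0 when the sequences are compatible under this assumption and 1 otherwise, so it can be used directly in an `if` statement. A stricter or more tolerant definition (for example, allowing mismatches) would need a different comparison.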

By default, quality trimming should use cutoffs similar to those of the quality trimming currently done with Trimmomatic.

The implementation should allow for flexible configuration of fastp, in case any of the other trimming options are required.
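As a rough sketch of how the fastp-based settings and the flexible configuration could fit together, the function below maps a trimming mode to fastp adapter options and appends any user-supplied options. The mode names (`fastp-denovo`, `fastp-known`), the function name `fastpAdapterOptions`, and the variable `FASTP_EXTRA_OPTIONS` are hypothetical and only serve as illustration. The fastp flags `--detect_adapter_for_pe`, `--adapter_sequence`, and `--adapter_sequence_r2` are existing fastp options. Settings that do not run fastp for trimming, (a), (b), and (e), are deliberately not mapped here.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the workflow: print the fastp adapter
# options for a trimming mode. Mode names are illustrative only.
fastpAdapterOptions() {
    local mode="$1"
    local adapterR1="${2:-}"
    local adapterR2="${3:-}"
    local opts=""
    case "$mode" in
        fastp-denovo)
            # Setting (c): let fastp detect the adapters of paired-end reads.
            opts="--detect_adapter_for_pe"
            ;;
        fastp-known)
            # Setting (d): trim with fastp using the provided adapter(s).
            if [ -z "$adapterR1" ]; then
                echo "fastp-known requires an adapter sequence" >&2
                return 1
            fi
            opts="--adapter_sequence $adapterR1"
            if [ -n "$adapterR2" ]; then
                opts="$opts --adapter_sequence_r2 $adapterR2"
            fi
            ;;
        *)
            echo "mode '$mode' is not mapped to fastp options in this sketch" >&2
            return 1
            ;;
    esac
    # ASSUMPTION: extra fastp options are passed through one plain variable.
    if [ -n "${FASTP_EXTRA_OPTIONS:-}" ]; then
        opts="$opts $FASTP_EXTRA_OPTIONS"
    fi
    echo "$opts"
}
```

For example, with `FASTP_EXTRA_OPTIONS="--trim_poly_g"` set, `fastpAdapterOptions fastp-denovo` prints `--detect_adapter_for_pe --trim_poly_g`. Passing the extra options through as an opaque string keeps every other fastp trimming option reachable without a dedicated setting for each one.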


Implementation details.

fastp \
  --thread 12 \
  --report_title "${title}_${id}" \
  -i "$r1" \
  -I "$r2" \
  -o "$OUTDIR/${title}_${id}_$(baseOutput "$r1")_R1.fastq" \
  -O "$OUTDIR/${title}_${id}_$(baseOutput "$r1")_R2.fastq" \
  -Q \
  -l 8 \
  --detect_adapter_for_pe \
  --json "${title}_${id}_$(baseOutput "$r1")_fastp.json" \
  --html "${title}_${id}_$(baseOutput "$r1")_fastp.html" \
  2> "${title}_${id}_$(baseOutput "$r1")_fastp_out.log"

This call needs to be extended to additionally do quality-based trimming.
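One possible way to add quality-based trimming is fastp's sliding-window cutting (`--cut_right`), which the fastp documentation describes as similar to Trimmomatic's SLIDINGWINDOW step. The sketch below only builds the extra options. The function name `fastpQualityTrimmingOptions` is hypothetical, and the default window size and quality threshold are placeholders, not the workflow's actual Trimmomatic cutoffs.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the workflow: print fastp options for
# sliding-window quality trimming.
# ASSUMPTION: the defaults 4 and 15 are placeholders; the real values should
# be taken from the workflow's current Trimmomatic configuration.
fastpQualityTrimmingOptions() {
    local windowSize="${1:-4}"
    local meanQuality="${2:-15}"
    echo "--cut_right --cut_right_window_size $windowSize --cut_right_mean_quality $meanQuality"
}
```

The printed options could be added to the fastp call above. Note that `-Q` in that call disables fastp's read-level quality filtering, so whether it should stay depends on whether only the sliding-window cutting is wanted or read-level filtering as well.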

vinjana commented 1 year ago

De novo adapter detection should be done outside the AQCWF.