Closed luigallucci closed 5 days ago
Hi, thanks for the report!
Regarding your command, it may be that -j 30 is a bit excessive; it could be that the overhead of communicating between threads is a bit high with that many threads. Maybe you should measure whether you actually get a speedup over, let’s say, -j 16.
Also, -e 0.1 is the default maximum error rate, so you could leave out that part of the command.
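One way to check whether -j 30 really beats -j 16 is to time the same run with several core counts. The following is only a rough sketch: elapsed_seconds and compare_cores are hypothetical helper names, timing has one-second resolution, and it assumes the program accepts -j as its first option (as cutadapt does).

```shell
# Hypothetical helper: print the wall-clock seconds a command takes.
elapsed_seconds() {
    start=$(date +%s)
    "$@" > /dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Hypothetical helper: run the same command once per core count,
# inserting "-j <cores>" right after the program name.
# Usage: compare_cores "8 16 30" cutadapt <your usual options and files>
compare_cores() {
    core_counts=$1
    program=$2
    shift 2
    for cores in $core_counts; do
        secs=$(elapsed_seconds "$program" -j "$cores" "$@")
        printf 'cores=%s seconds=%s\n' "$cores" "$secs"
    done
}
```

Run it on a realistically sized input; on tiny test files the differences will not be measurable.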
Regarding stats with --revcomp, I can confirm that the report is incomplete when using --revcomp with paired-end data. Minimal example:
cutadapt --revcomp -o 1.fastq -p 2.fastq tests/data/revcomp.1.fastq tests/data/revcomp.2.fastq
Also, when I add --json stats.cutadapt.json, a file is produced, but it contains "reverse_complemented": null.
I’ll look into this.
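To tell quickly whether a given JSON report is affected, one can search it for the field mentioned above. This is a minimal sketch: has_revcomp_count is a hypothetical helper name, and it assumes that a populated "reverse_complemented" field holds a plain integer.

```shell
# Hypothetical helper: succeed if the report contains a numeric
# "reverse_complemented" value, fail if it is null or absent.
has_revcomp_count() {
    report=$1
    grep -Eq '"reverse_complemented":[[:space:]]*[0-9]+' "$report"
}

# Example:
#   if has_revcomp_count stats.cutadapt.json; then
#       echo "reverse-complement count present"
#   else
#       echo "reverse-complement count missing or null"
#   fi
```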
Hi, thanks for the report!
Regarding your command, it may be that -j 30 is a bit excessive; it could be that the overhead of communicating between threads is a bit high with that many threads. Maybe you should measure whether you actually get a speedup over, let’s say, -j 16.
Also, -e 0.1 is the default maximum error rate, so you could leave out that part of the command.
Thank you for the feedback!
Regarding stats with --revcomp, I can confirm that the report is incomplete when using --revcomp with paired-end data. Minimal example:
cutadapt --revcomp -o 1.fastq -p 2.fastq tests/data/revcomp.1.fastq tests/data/revcomp.2.fastq
Also, when I add --json stats.cutadapt.json, a file is produced, but it contains "reverse_complemented": null.
I’ll look into this.
Thank you again for your work and development effort :)
This is fixed now.
Hi @marcelm,
I'm using the current stable version of cutadapt on Python 3.10.
My question is about demultiplexing reads of mixed orientation.
My current command looks like this. Basically, I'm able to get almost full trimming (96%) using --revcomp; without that flag, I get around 45%. I'm pretty sure the reads are in mixed orientation: we have a long-standing collaboration with the sequencing center, and they have always worked this way, producing mixed-orientation reads for metabarcoding (Illumina NextSeq, 2 x 300 bp).
First, do you have any general comments on the command? Are the options correct for this type of data? Second, regarding --revcomp: when I add this flag, I no longer get the normal cutadapt stats output.
Here is everything I get:
Usually the report is also supposed to give me information on the different barcodes, right? I'm looking forward to hearing your opinion on this.
EDIT: I'm demultiplexing and removing primers at the same time (so for -G there is a single line corresponding to the reverse primer, while for -g there are all the barcode + forward primer sequences).