GenomiqueENS / toulligQC

A post sequencing QC tool for Oxford Nanopore sequencers

dorado barcoding output #25

Closed ekaj10 closed 5 months ago

ekaj10 commented 5 months ago

Hi,

Thanks for this handy tool. I am new to using dorado for basecalling/demultiplexing and am currently trying your tool for post-sequencing metrics. I presume that dorado does not output the telemetry file, so I ran ToulligQC with the pod5 files:

toulligqc --report-name summary \
  --sequencing-summary-source /path/to/sequencing_summary.txt \
  --pod5-source /path/to/pod5 \
  --barcoding --barcodes barcode03,barcode05 \
  --qscore-threshold 10 \
  --html-report-path /path/to/report.html

However, I could not get the barcode metrics included in the output. For context, I used dorado this way:

dorado basecaller $model /path/to/pod5/ --kit-name SQK-RPB114-24 > /path/to/bam/basecalled.bam
dorado summary /path/to/bam/basecalled.bam > /path/to/sequencing_summary.txt
dorado demux --no-classify --emit-fastq --emit-summary --output-dir /path/to/output/ /path/to/bam/basecalled.bam

I also did not see any errors on stdout, so I can't troubleshoot. I'd really appreciate it if you could help me out!

alihamraoui commented 5 months ago

Hi @ekaj10, thanks for using ToulligQC.

The issue you're encountering occurs because newer versions of Dorado name the barcode column "barcode" instead of "barcode_arrangement", which is the name ToulligQC expects.
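To confirm which of the two column names a given summary file uses, the header can be inspected directly. The snippet below is a minimal sketch, not part of ToulligQC: the helper name `barcode_columns` is hypothetical, and it assumes the summary file is tab-separated with a single header line.

```python
import csv


def barcode_columns(summary_path, delimiter="\t"):
    """Return the header names that start with 'barcode'.

    Hypothetical helper: assumes a delimited text file whose first
    line is the header (tab-separated by default).
    """
    with open(summary_path, newline="") as handle:
        # Read only the first row; an empty file yields an empty header.
        header = next(csv.reader(handle, delimiter=delimiter), [])
    return [name for name in header if name.startswith("barcode")]
```

If the returned list contains "barcode" but not "barcode_arrangement", the file uses the newer column name described above.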

You can temporarily fix this by modifying the column name in your file:

sed -i '1s/\bbarcode\b/barcode_arrangement/' /path/to/sequencing_summary.txt
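The same rename can also be done by matching the exact column name rather than a text pattern, which avoids depending on where the column sits in the header. This is a minimal sketch under stated assumptions: the function `rename_barcode_column` is hypothetical (not a ToulligQC or Dorado API), and the file is assumed to be tab-separated with one header line.

```python
import os
import shutil
import tempfile


def rename_barcode_column(summary_path, old="barcode",
                          new="barcode_arrangement", delimiter="\t"):
    """Rename one header column in place; return True if it was renamed.

    Hypothetical helper: assumes a delimited text file whose first
    line is the header. Data rows are copied through unchanged.
    """
    with open(summary_path, "r", newline="") as src:
        header_line = src.readline()
        stripped = header_line.rstrip("\r\n")
        line_end = header_line[len(stripped):]  # keep the original line ending
        names = stripped.split(delimiter)
        # Nothing to do if the old name is absent or the new one already exists.
        if old not in names or new in names:
            return False
        names[names.index(old)] = new

        # Write to a temporary file in the same directory, then swap it in,
        # so large summary files are streamed rather than loaded in memory.
        directory = os.path.dirname(os.path.abspath(summary_path))
        with tempfile.NamedTemporaryFile("w", dir=directory, delete=False,
                                         newline="") as dst:
            dst.write(delimiter.join(names) + line_end)
            shutil.copyfileobj(src, dst)
    os.replace(dst.name, summary_path)
    return True
```

Calling `rename_barcode_column("/path/to/sequencing_summary.txt")` only touches the header line, so barcode values such as barcode03 in the data rows are left as they are.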

We've addressed this in issue #22 and plan to release the fix next week.

In the meantime, you can install the latest version of ToulligQC directly from GitHub:

git clone https://github.com/GenomicParisCentre/toulligQC.git
cd toulligQC && python3 setup.py build install

Please let me know if you need any further assistance!

Best, Ali