GenomiqueENS / toulligQC

A post sequencing QC tool for Oxford Nanopore sequencers

issue for invalid file name #12

Closed ogrecio closed 2 years ago

ogrecio commented 2 years ago

Hi, I'm running ToulligQC on my Ubuntu system. I installed it as a Python package with pip3, inside an environment. When I run it I have the following problem: `/home/grid/programas/ToulligQC/bin/toulligqc --report-name test --fast5-source /data/test/20220404_1721_X5_FAS58594_1dd0a346/fast5_pass/ --sequencing-summary-source /data/test/20220404_1721_X5_FAS58594_1dd0a346/sequencing_summary_FAS58594_6132143a.txt --barcoding -l BC01,BC04,BC05,BC06,BC07,BC08,BC09,BC10,BC11,BC12 --html-report-path test.QC.html test.QC.html ToulligQC version 2.2.2`

Could you please help me with this? Thank you!

jourdren commented 2 years ago

Dear ogrecio,

Thanks for reporting this issue. After investigation, it seems that this error occurs when the Fast5 source is a directory that does not contain any Fast5 file.

I've just fixed this issue with commit 3b1fda2ded3982d5fa9b8719befe8c3968bad881. A bugfix version of ToulligQC will be published soon.

Until this new version is published, you can avoid this issue by giving the full path of a Fast5 file instead of its directory.

Best regards, Laurent.

ogrecio commented 2 years ago

Hi Laurent, thank you for that. I don't quite understand how to solve the issue. The Fast5 source is certainly a folder, and the Fast5 files are within subfolders according to barcodes (each folder possibly holding many Fast5 files). If I'm interested in having the report for all barcodes, what command am I supposed to run?

Here is the list of folders in the fast5 source. Thank you

    ll fast5_pass/
    total 60
    drwxrwxr-x 15 grid grid 4096 Apr 4 17:36 ./
    drwxrwxr-x 7 grid grid 4096 Apr 6 12:22 ../
    drwxrwxr-x 2 grid grid 4096 Apr 6 12:22 barcode01/
    drwxrwxr-x 2 grid grid 4096 Apr 6 12:22 barcode02/
    drwxrwxr-x 2 grid grid 4096 Apr 6 12:22 barcode03/
    drwxrwxr-x 2 grid grid 4096 Apr 6 12:22 barcode04/
    drwxrwxr-x 2 grid grid 4096 Apr 6 12:22 barcode05/
    drwxrwxr-x 2 grid grid 4096 Apr 6 12:22 barcode06/
    drwxrwxr-x 2 grid grid 4096 Apr 6 12:22 barcode07/
    drwxrwxr-x 2 grid grid 4096 Apr 6 12:22 barcode08/
    drwxrwxr-x 2 grid grid 4096 Apr 6 12:22 barcode09/
    drwxrwxr-x 2 grid grid 4096 Apr 6 12:22 barcode10/
    drwxrwxr-x 2 grid grid 4096 Apr 6 12:22 barcode11/
    drwxrwxr-x 2 grid grid 4096 Apr 6 12:22 barcode12/
    drwxrwxr-x 2 grid grid 4096 Apr 6 12:22 unclassified/

jourdren commented 2 years ago

Hi ogrecio,

Providing a Fast5 file to ToulligQC is just a fallback solution if you do not have a sequencing_telemetry.js file. With a sequencing_telemetry.js file you will have more information in the "Run statistics" and "Device and software" tables than with a Fast5 file. However, both the sequencing_telemetry.js and Fast5 files are optional when running ToulligQC.

If you choose to use a Fast5 file, you need only one, as ToulligQC only uses some run metadata from it (the run metadata are duplicated in all Fast5 files of a run).
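Since the Fast5 files here sit in per-barcode subfolders, one way to pick a single file is to search the run folder recursively. The following is only a minimal sketch: the helper name `first_fast5` is hypothetical (it is not part of ToulligQC), and it assumes the files use the `.fast5` extension.

```python
from pathlib import Path
from typing import Optional


def first_fast5(root: str) -> Optional[Path]:
    """Return one .fast5 file found under root (subfolders included), or None.

    Hypothetical helper, not part of ToulligQC.
    """
    # rglob descends into barcode01/, barcode02/, ... and so on;
    # sorting makes the choice deterministic between runs.
    for candidate in sorted(Path(root).rglob("*.fast5")):
        if candidate.is_file():
            return candidate
    # No Fast5 file anywhere under root.
    return None
```

The path returned (if any) can then be given to `--fast5-source` in place of the `fast5_pass/` directory.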

The only mandatory file for ToulligQC is the sequencing_summary.txt file, which contains the barcoding information (the barcode_arrangement column in this file) and statistics about each read. Barcoding information can be in separate files if barcoding is performed after basecalling, which is why you can provide more than one sequencing_summary.txt file to ToulligQC. However, there is usually only one sequencing_summary.txt file, with all the information, after basecalling and demultiplexing.
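To check that a given sequencing_summary.txt file really carries the barcoding information, you can count reads per value of the barcode_arrangement column. This is only a sketch, assuming the file is tab-separated with a header line; the function name `reads_per_barcode` is hypothetical and this is not how ToulligQC itself reads the file.

```python
import csv
from collections import Counter
from typing import Iterable


def reads_per_barcode(lines: Iterable[str]) -> Counter:
    """Count reads per barcode_arrangement value.

    Assumes tab-separated input whose first line is the header.
    Hypothetical helper, not part of ToulligQC.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    counts: Counter = Counter()
    for row in reader:
        # Rows with a missing or empty barcode column are grouped under "(none)".
        counts[row.get("barcode_arrangement") or "(none)"] += 1
    return counts
```

For example, open the summary file with `open(path, newline="")` and pass the file object to `reads_per_barcode`. If every read ends up under "(none)", the barcoding information is probably in a separate file.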

Best regards,

Laurent.

jourdren commented 2 years ago

Issue fixed in ToulligQC 2.2.3.

ogrecio commented 2 years ago

Fixed. Thank you Jourdren.