a-slide / pycoQC

pycoQC computes metrics and generates Interactive QC plots from the sequencing summary report generated by Oxford Nanopore technologies basecaller (Albacore/Guppy)
https://a-slide.github.io/pycoQC/
GNU General Public License v3.0

EXP-NBD114 support #62

Closed: thierryjanssens closed this issue 5 years ago

thierryjanssens commented 5 years ago

Describe the bug pycoQC does not seem to support the EXP-NBD114 expansion.

To Reproduce Steps to reproduce the behavior:

  1. Go to '...'
  2. Click on '....'
  3. Scroll down to '....'
  4. See error

Expected behavior The generation of the pycoQC report in HTML, including the distribution of the reads over the barcodes.

Screenshots

This is the error log.

PARSE DATA FILES
Import raw data from sequencing summary files
3,099,954 reads found in initial file
Import barcode information from barcode summary files
Traceback (most recent call last):
  File "/path/Anaconda3-5.1.0/envs/pycoqc/bin/pycoQC", line 10, in <module>
    sys.exit(main_pycoQC())
  File "/path/Anaconda3-5.1.0/envs/pycoqc/lib/python3.6/site-packages/pycoQC/cli.py", line 169, in main_pycoQC
    title=args.title)
  File "/path/Anaconda3-5.1.0/envs/pycoqc/lib/python3.6/site-packages/pycoQC/cli.py", line 196, in generate_report
    filter_calibration=filter_calibration)
  File "/path/Anaconda3-5.1.0/envs/pycoqc/lib/python3.6/site-packages/pycoQC/pycoQC.py", line 94, in __init__
    raise pycoQCError ("File {} does not contain required barcode information".format(fp))
pycoQC.common.pycoQCError: File ./fastq_demux/barcoding_summary.txt does not contain required barcode information

Desktop (please complete the following information):

Additional context The same version of pycoQC is processing EXP-NBD104 barcodes flawlessly.

Is there a lack of compatibility?

a-slide commented 5 years ago

Thanks a lot for the very detailed bug report. Looking at the log I assume there is something different in the EXP-NBD114 expansion leading to an error when processing the file. Would you be able to upload the sequencing summary + barcode summary files for the 2 kits so I can have a look? (The first 1000 lines for each should do the trick). Thanks
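For anyone needing to produce such a truncated file, here is a minimal Python sketch of copying the first 1000 lines of a summary file. The function name `write_head` and the file names in the usage note are hypothetical and are not part of pycoQC.

```python
from itertools import islice


def write_head(src_path, dst_path, n_lines=1000):
    """Copy the first n_lines lines of src_path to dst_path.

    Returns the number of lines actually written, which is smaller than
    n_lines when the source file is shorter.
    """
    written = 0
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        # islice stops after n_lines, so the large source file is never fully read
        for line in islice(src, n_lines):
            dst.write(line)
            written += 1
    return written
```

Usage would look something like `write_head("sequencing_summary.txt", "sequencing_summary_head.txt")`. The header line counts as one of the 1000 lines, so it is preserved in the output.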

thierryjanssens commented 5 years ago

I appreciate your swift reply! Hereby I attach the first 1000 lines of two barcode summary files. barcodings_summary_head1000_run7.txt barcodings_summary_head1000_run8.txt

a-slide commented 5 years ago

Thanks. May I ask which one is EXP-NBD104 and which one is EXP-NBD114? In addition, could I also have the corresponding sequencing summary files so I can test the fix? Thanks

thierryjanssens commented 5 years ago

run7 is EXP-NBD104 (actually EXP-NBD103, but the barcode sequences have remained the same). run8 is EXP-NBD114. The files are too big to attach (max 10 MB); can I send them via WeTransfer or similar?

a-slide commented 5 years ago

Thanks. I will have a look ASAP. For the sequencing summary, the first 1000 lines should do, as for the barcoding summary files.

thierryjanssens commented 5 years ago

sequencing_summary_run7_head.txt sequencing_summary_run8_head.txt

thierryjanssens commented 5 years ago

Please find enclosed (in the comment above) the requested files for troubleshooting.

a-slide commented 5 years ago

Hi Thierry, I had a look at the files and found that the error was caused by the header of the sequencing summary file. I actually found 2 issues:

Because of that, pycoQC cannot recognise the proper data fields. I don't know what happened to the summary files you sent, but this is not how they are supposed to be generated by Guppy (an old unstable version of Guppy, or files opened with a text editor with auto-correction?).

After fixing the header manually I could run pycoQC for both runs without issues.

sequencing_summary_run7_head_fixed.txt sequencing_summary_run8_head_fixed.txt

The barcoding files you sent are similar and there is no reason for pycoQC not to process the new kit the same way as the old one.
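A quick way to spot this kind of header problem before running pycoQC is to compare the header line against the column names you expect. The sketch below is only an illustration: `missing_columns` is a hypothetical helper, not a pycoQC function, and the column names in the usage note are assumptions that should be checked against the pycoQC documentation.

```python
import csv


def missing_columns(path, required, delimiter="\t"):
    """Return the names in `required` that are absent from the header line.

    The file is assumed to be a delimited text file (tab-separated by
    default) whose first line holds the column names.
    """
    with open(path, newline="") as handle:
        # Read only the first row; an empty file yields an empty header
        header = next(csv.reader(handle, delimiter=delimiter), [])
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]
```

For example, `missing_columns("barcoding_summary.txt", ["read_id", "barcode_arrangement"])` returns an empty list when both (assumed) column names are present, and otherwise lists the ones that are missing or misspelled in the header.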

a-slide commented 5 years ago

I assume it is fixed now