nanoporetech / ont_fast5_api

Oxford Nanopore Technologies fast5 API software

albacore errors after multi_to_single_fast5.py #7

Closed lcoombe closed 5 years ago

lcoombe commented 5 years ago

Hello, I want to continue using Albacore to extract my ONT reads, but since the newer versions of the software produce multi-fast5 files off the flowcell, I used the multi_to_single_fast5.py script to convert the files to single fast5 prior to albacore.

That step works fine, but when I run albacore I get a bunch of errors indicating that no read data can be extracted from the fast5s:

[lcoombe@hpce704 part0]$ cat pipeline.log 
2019-01-15 10:53:04,059 albacore INFO ONT Albacore Sequencing Pipeline Software (version 2.0.2)
2019-01-15 10:53:04,060 albacore INFO Debug level is 0
2019-01-15 10:53:04,079 albacore INFO Using config file: layout_raw_basecall_1d.jsn
2019-01-15 10:53:04,091 albacore WARNING No option 'pipeline > min_qscore_1dsq' in the config, so using the default of 0.0
2019-01-15 10:53:04,091 albacore WARNING No option 'pipeline > min_qscore_1dsq' in the config, so using the default of 0.0
2019-01-15 10:53:07,428 albacore INFO Submitting file "02c29ce0-9466-42ee-85d8-f435248871b6.fast5".
2019-01-15 10:53:07,437 albacore WARNING Could not extract read data from file "02c29ce0-9466-42ee-85d8-f435248871b6.fast5".

Would there be any reason why the single fast5s produced by the script don't seem to be compatible with albacore?

Thank you!

fbrennen commented 5 years ago

Hi @lcoombe -- are all of your reads failing, or just a few of them? Are you certain that you have raw data in your multi-read fast5 files?
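
One way to answer the "all or just a few" question is to tally the messages in pipeline.log. The following is a minimal sketch, not part of Albacore or ont_fast5_api: the function names `tally_log` and `tally_file` are hypothetical, and the message fragments are copied from the log excerpts in this thread. The "Finished processing file" line appears only in the 2.3.4 log shown here, so whether 2.0.2 prints the same line on success is an assumption.

```python
import re
from collections import Counter

# Message fragments copied from the pipeline.log excerpts in this thread.
SUBMITTED = re.compile(r'Submitting file "([^"]+)"')
FAILED = re.compile(r'Could not extract read data from file "([^"]+)"')
# Seen in the 2.3.4 log; assumed (not confirmed) to exist in 2.0.2 as well.
FINISHED = re.compile(r'Finished processing file "([^"]+)"')


def tally_log(lines):
    """Count submitted, unreadable, and finished files in Albacore log lines."""
    counts = Counter()
    for line in lines:
        if SUBMITTED.search(line):
            counts["submitted"] += 1
        elif FAILED.search(line):
            counts["failed"] += 1
        elif FINISHED.search(line):
            counts["finished"] += 1
    return counts


def tally_file(path):
    """Convenience wrapper (hypothetical helper): tally a log file on disk."""
    with open(path) as handle:
        return tally_log(handle)
```

Calling `tally_file("pipeline.log")` in the run directory returns the three counts. If "failed" equals "submitted", every read was rejected, which points at the file layout or the installation rather than at a few bad reads.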

lcoombe commented 5 years ago

Hi @fbrennen ,

Yes, all of the reads are failing - I don't get any output fastqs (pass or fail). And yes, the multi-read fast5 files should all be raw - I was able to open one example file in hdfview, and I can see the signal table fine. Are there any other checks I could do on either the multi-read or single-read fast5s to understand what's going on or know if the data isn't in the expected format?
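
As a possible check beyond hdfview, the layout of a fast5 file can be inspected programmatically. The sketch below assumes the conventional group names: a single-read file keeps its signal under `/Raw/Reads/<Read_N>/Signal`, while a multi-read file has one top-level `read_<id>` group per read with the signal at `Raw/Signal`. Those names are assumptions about the files in question, the function names are hypothetical, and `h5py` is a third-party package that must be installed separately.

```python
def classify_layout(top_level_names):
    """Guess the fast5 layout from the names of the top-level HDF5 groups."""
    names = set(top_level_names)
    if "Raw" in names:
        return "single-read"
    if any(name.startswith("read_") for name in names):
        return "multi-read"
    return "unknown (no raw signal group found)"


def inspect_fast5(path):
    """Print the guessed layout and the raw signal length of each read."""
    import h5py  # third-party; imported lazily so classify_layout works without it

    with h5py.File(path, "r") as handle:
        layout = classify_layout(handle.keys())
        print(path, "->", layout)
        if layout == "single-read":
            # Assumed layout: /Raw/Reads/<Read_N>/Signal
            for read_name, read in handle["Raw/Reads"].items():
                print(" ", read_name, "samples:", read["Signal"].shape[0])
        elif layout == "multi-read":
            # Assumed layout: /read_<id>/Raw/Signal
            for read_name, read in handle.items():
                if read_name.startswith("read_") and "Raw" in read:
                    print(" ", read_name, "samples:", read["Raw/Signal"].shape[0])
```

Running `inspect_fast5` on one converted file and on one single-read file from the older MinION software, then comparing the output, would show whether the two differ in structure.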

Also - In case it's helpful, here are the commands I ran:

python3 multi_to_single_fast5.py -i /path/to/multifast5/ -s single_fastqs/ --recursive -t 48
python read_fast5_basecaller.py -r -t 48 -i single_fastqs \
        -s outdir \
        -f FLO-MIN106 \
        -k SQK-LSK108 \
        -o fastq \
        -q 0

Thank you for your help!

fbrennen commented 5 years ago

Hi @lcoombe -- if it's ok with you, can I have one or two of your fast5 files? Ideally one multi and one single. So far I have not been able to replicate this.

lcoombe commented 5 years ago

@fbrennen - Yes, that would be great if you could take a look at a couple of our files. Could you provide an e-mail so I could send you a link to the data privately?

lcoombe commented 5 years ago

@fbrennen - Also, in the meantime, I downloaded a newer version of albacore (2.3.4), and when I test one of the single fast5s, it is able to process it successfully:

Version 2.0.2 (before):

2019-01-16 15:52:51,076 ONT Albacore Sequencing Pipeline Software (version 2.0.2)
2019-01-16 15:52:51,076 Debug level is 0
2019-01-16 15:52:51,446 Using config file: layout_raw_basecall_1d.jsn
2019-01-16 15:52:51,460 No option 'pipeline > min_qscore_1dsq' in the config, so using the default of 0.0
2019-01-16 15:52:51,461 No option 'pipeline > min_qscore_1dsq' in the config, so using the default of 0.0
2019-01-16 15:52:55,443 Submitting file "02c29ce0-9466-42ee-85d8-f435248871b6.fast5".
2019-01-16 15:52:55,451 Could not extract read data from file "02c29ce0-9466-42ee-85d8-f435248871b6.fast5".
2019-01-16 15:52:55,503 Done

Version 2.3.4

2019-01-16 15:50:09,747 ONT Albacore Sequencing Pipeline Software (version 2.3.4)
2019-01-16 15:50:09,747 Debug level is 0
2019-01-16 15:50:09,784 No reference given for module aligner
2019-01-16 15:50:09,784 Using config file: layout_raw_basecall_1d.jsn
2019-01-16 15:50:09,786 1 files found in input folder
2019-01-16 15:50:09,786 include calibration strand detection: True
2019-01-16 15:50:09,786 include alignment: False
2019-01-16 15:50:09,786 include barcoding: False
2019-01-16 15:50:09,787 telemetry enabled: True
2019-01-16 15:50:09,802 No option 'pipeline.min_qscore_1dsq' in the config, so using the default of 0.0
2019-01-16 15:50:10,434 Submitting file "02c29ce0-9466-42ee-85d8-f435248871b6.fast5".
2019-01-16 15:50:11,435 Finished processing file "02c29ce0-9466-42ee-85d8-f435248871b6.fast5".
2019-01-16 15:50:11,495 Done.
2019-01-16 15:50:11,495 Writing telemetry
2019-01-16 15:50:11,499 Summary telemetry written to /projects/spruceup_scratch/pengelmannii/Se404-851/data/DNA/nanopore/extraction/FAK38929/part0/sequencing_telemetry.js
2019-01-16 15:50:11,502 Pinging telemetry to https://ping.oxfordnanoportal.com/basecall

Do you know why the files might be compatible with 2.3.4 but not 2.0.2? Ideally, I'd like to stick with Albacore 2.0.2 to maintain consistency for my project - I have already processed other flowcells for my project with this older version.

Thanks for your help!

fbrennen commented 5 years ago

Hi @lcoombe -- if you email Nanopore customer services they can forward your data on to me to look at. I tested out extracted single-read files with albacore 2.0.2 myself and they worked fine, so it's not clear to me why it would make a difference to upgrade to 2.3.4. Have you tried a new 2.0.2 installation?

lcoombe commented 5 years ago

Sounds good @fbrennen - is support@nanoporetech.com the right e-mail to use?

Very strange that it is working OK for you...I am using the same 2.0.2 installation that we've been using for the past year or so. I could try a fresh 2.0.2 installation - I might be missing something but I can't see an option to download that particular version here? https://community.nanoporetech.com/downloads/albacore/release_notes

fbrennen commented 5 years ago

Hi @lcoombe -- you can go here to contact support: https://community.nanoporetech.com/contact_support (and there's a place to upload files). If you reference this issue they'll pass it on to me.

Albacore versions prior to 2.3.0 are unfortunately no longer available, so if you don't have your original installer then you might be out of luck. :(

lcoombe commented 5 years ago

Sounds good - I just sent a message to support with the link to a couple of example files!

Yes, unfortunately I doubt we still have the installer since we downloaded that version so long ago... Although it is strange to me that it would work fine for the single fast5s produced by the older MinION software, but not for the single fast5s produced from the multi fast5s from the new MinION software.

fbrennen commented 5 years ago

I agree -- I didn't see any issue myself with 2.0.2 and files extracted from multi-read ones, so I'll test it out with yours and see if I can figure out what's going on. What platform are you running on, and how did you install albacore?

lcoombe commented 5 years ago

Sounds good - thanks @fbrennen!

I'm running albacore on Linux (CentOS 7). I didn't actually do the installation myself (I'm using the installation from a previous colleague), but I assume that pip install was used with the .whl file.

lcoombe commented 5 years ago

Hi @fbrennen - any updates with this? Thanks again for taking a look at my files!

fbrennen commented 5 years ago

Hi @lcoombe -- support haven't sent me any files yet. :( I'll go see if I can figure out who received them.

lcoombe commented 5 years ago

Thanks @fbrennen - Let me know if you're having trouble getting the links to the files and I can try to send them again or another way :)

fbrennen commented 5 years ago

Hi @lcoombe -- I have gotten ahold of your reads and tried the single fast5 one out with albacore 2.0.2 and everything seems to be working correctly. I'm not really sure what to do next if you can't get an installer for 2.0.2.

lcoombe commented 5 years ago

Hi @fbrennen - Thank you for checking - that is so strange...I suppose I can try running on a couple of different machines on my end to see if that is an issue but I'm not hopeful since I was running on the same machine as I was previously. Anyway - It looks like the easiest thing to do is just use my new albacore 2.3.4 installation (which successfully extracted the reads) and just make a note of that version upgrade in any publications. Thank you very much for your help!
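
For keeping track of which Albacore version processed which flowcell, the version can be read back from each run's pipeline.log. This is a minimal sketch with a hypothetical function name; the pattern is taken from the banner line in the logs quoted in this thread, and it is an assumption that other Albacore releases print the banner in the same form.

```python
import re

# Banner line as it appears in the 2.0.2 and 2.3.4 logs quoted in this thread.
VERSION = re.compile(
    r"ONT Albacore Sequencing Pipeline Software \(version ([0-9.]+)\)"
)


def albacore_version(lines):
    """Return the version string from the first banner line, or None."""
    for line in lines:
        match = VERSION.search(line)
        if match:
            return match.group(1)
    return None
```

Passing an open log file (for example `albacore_version(open("pipeline.log"))`) returns the version string for that run, which can then be recorded per flowcell.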

kjestradag commented 5 years ago

Hi @lcoombe - I have the same issue, but a little worse: in my case the latest albacore version (2.3.4) did not fix the problem. I still get the same "Could not extract read data from file" warning for every read's fast5 file. Did you find any other solution? My kit is the SQK-RAD004 (Rapid Sequencing). Thanks!

lcoombe commented 5 years ago

Hi @kjestradag - Unfortunately I ended up just going with version 2.3.4 of albacore, and so far haven't had any further issues.

I don't know if it will help you, but these were the commands that worked for me in the end:

python3 multi_to_single_fast5.py -i /path/to/multi/fastq -s single_fastqs/ --recursive -t 48
python3 ./albacore/2.3.4/venv/bin/read_fast5_basecaller.py -r -t 48 \
        -i $datadir \
        -s $outdir/$nam \
        -f FLO-MIN106 \
        -k SQK-LSK108 \
        -o fastq \
        -q 0

We use the SQK-LSK108 kit.

Sorry I can't be more help!

kjestradag commented 5 years ago

Thank you for your quick answer Lcoombe.

Guppy did the job well and is faster and achieves better results than Albacore, at least in my case.

cheers

lcoombe commented 5 years ago

Glad you found a solution - And I know we are looking at testing Guppy vs. albacore in my group, so that's good to know!