nanoporetech / dorado

Oxford Nanopore's Basecaller
https://nanoporetech.com/

New version of MinKNOW stuck at 99% completion of basecalling step #711

Open VahidJavaran opened 3 months ago

VahidJavaran commented 3 months ago

Hello, I recently updated our MinKNOW software to version 24.02.10. Following the update, I attempted to basecall some POD5 files. However, the basecalling process seems to be stuck: it reached 99% completion but hasn't progressed any further after two hours. (Screenshot attached.)

MarkBicknellONT commented 3 months ago

Hello @VahidJavaran ,

We have identified a very similar issue internally, where post-run basecalling can get stuck like this. We have a fix, but I'd like to confirm that it will also solve your issue. From your screenshot, it looks as though you had two post-run basecalling experiments running at the same time. Were they using different basecalling configs?

If you could contact TS via the community and send them the MinKNOW logs, we can take a look and check whether it is the same issue.

Kind regards, Mark

VahidJavaran commented 3 months ago

Hi @MarkBicknellONT,

Thank you for your response. To clarify, I executed them independently, not in parallel. In spite of the freeze, the output fastq files were usable, so I went ahead with barcoding. In order to raise awareness about this situation, I plan to share it with the Nanopore community as well.

aforestsomewhere commented 3 months ago

I have also seen this happen with post-run SUP basecalling on our P2 Solo today, with no other basecalling experiments running at the time. The MinKNOW version is 24.02.10, and the fastq files seem to have been generated correctly. Is there a fix that can be shared generally?

MarkBicknellONT commented 3 months ago

Hi @aforestsomewhere and @VahidJavaran ,

Thanks for your reports - there's a software fix for post-run basecalling hanging when almost complete, which we will be releasing in MinKNOW 5.9 Patch 1 as soon as possible. Your issues seem likely to be the same problem.

The bug causes a very small number of reads to become stuck in the basecalling process. This means that in the meantime, the vast majority of the source reads should be in the emitted fastq files, and the files should be well formed.
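
For anyone who wants to check this on their own output, here is a minimal sketch, not an ONT tool, that counts the records in each fastq file and flags structural problems. It assumes unwrapped four-line records in plain or gzip-compressed files; the function names `check_fastq` and `summarise_folder` are made up for illustration.

```python
import gzip
from pathlib import Path


def check_fastq(path):
    """Return (records_read, problems) for one FASTQ file.

    Assumption: unwrapped four-line records, plain text or gzip (.gz).
    """
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    records = 0
    problems = []
    with opener(path, "rt") as handle:
        while True:
            header = handle.readline()
            if not header:
                break  # clean end of file
            if not header.strip():
                continue  # tolerate blank lines between or after records
            seq = handle.readline()
            plus = handle.readline()
            qual = handle.readline()
            number = records + 1
            if not qual:
                # The file ended partway through a record.
                problems.append(f"record {number}: truncated")
                break
            seq = seq.rstrip("\r\n")
            qual = qual.rstrip("\r\n")
            if not header.startswith("@"):
                problems.append(f"record {number}: header does not start with '@'")
            if not plus.startswith("+"):
                problems.append(f"record {number}: separator does not start with '+'")
            if len(seq) != len(qual):
                problems.append(f"record {number}: sequence and quality lengths differ")
            records += 1
    return records, problems


def summarise_folder(folder):
    """Check every .fastq / .fastq.gz file under a folder (hypothetical helper)."""
    results = {}
    for path in sorted(Path(folder).rglob("*.fastq*")):
        if path.name.endswith((".fastq", ".fastq.gz")):
            results[str(path)] = check_fastq(path)
    return results
```

Calling `summarise_folder("path/to/output")` (a placeholder path) returns one `(records_read, problems)` pair per file. An empty problem list for every file suggests the output is well formed, although it cannot show how many reads are missing.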

Apologies for the bug!

Kind regards, Mark

aforestsomewhere commented 1 month ago

Hi @MarkBicknellONT ,

I've updated to 24.02.16 but am still seeing this issue of basecalling hanging at 99%. Is Patch 1 included in this version of MinKNOW?

MarkBicknellONT commented 1 month ago

Hi @aforestsomewhere ,

Yes, this version does include the fix we deployed for this type of hang. It sounds like you have a different mode of failure so we need to keep investigating. Can I ask you to contact the ONT Technical Support team via https://nanoporetech.com/support please?

If you hit "contact support" on that page you will be directed to the chatbot, where you can go "See all topics" -> "Software" -> "MinKNOW" -> "Linux" -> choose anything -> scroll down and hit "I still need support" -> "Technical/Product Support" -> enter your email -> "Create a support ticket". You will then be able to log a support ticket with the team.

If you can attach your dorado server logs and the logs from your post-run basecall output folder, we can have a look at the issue. I've primed TS to expect your message.

Thanks! Mark