Closed SziKayLeung closed 2 years ago
Hello Szi,
We have modified guppy to take slow5, however ONT don't allow us to make this publicly available.
We have both a bonito and dorado version which will basecall slow5. You can find them as closed pull requests on the ont repos of those 2 tools, where ONT also refused to integrate slow5.
Let me know if you need help finding those pull requests if you want to go that way. Otherwise, I think we can only share the guppy builds if you sign the ONT developer agreement (I can't really advise on whether that is worth it for you).
Kind regards, James
To add to what @Psy-Fer said, we have created a binary build of the slow5 version of Dorado https://github.com/hiruna72/dorado/releases/tag/v0.0.1. You could try that. Instructions are there.
Alternatively, you could use slow5tools s2f to convert back to FAST5 in a temporary directory and run Guppy on that.
The bonito pull request is here: https://github.com/nanoporetech/bonito/pull/252
The Dorado pull request is here: https://github.com/nanoporetech/dorado/pull/19
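For anyone wanting to script the s2f route suggested above, here is a rough sketch rather than a tested pipeline. It assumes `slow5tools` and `guppy_basecaller` are on your PATH, that `slow5tools s2f` accepts `-d` for the output directory, and that Guppy takes `-i`, `-s` and `-c` for input, output and config. The function names, file names and the config name are placeholders for illustration only.

```python
import shlex
import subprocess
import tempfile
from pathlib import Path


def build_commands(blow5, fast5_dir, fastq_dir, config):
    """Return two command lines: BLOW5 -> FAST5, then Guppy on the FAST5."""
    # Assumption: s2f writes the converted FAST5 files into the -d directory.
    convert = ["slow5tools", "s2f", str(blow5), "-d", str(fast5_dir)]
    # Assumption: the config name is whatever matches your flow cell and kit.
    basecall = ["guppy_basecaller", "-i", str(fast5_dir),
                "-s", str(fastq_dir), "-c", config]
    return convert, basecall


def basecall_via_fast5(blow5, fastq_dir, config, dry_run=True):
    """Convert to a temporary FAST5 directory, basecall it, then clean up."""
    with tempfile.TemporaryDirectory() as tmp:
        # The subdirectory does not exist yet, so the converter creates it.
        fast5_dir = Path(tmp) / "fast5"
        commands = build_commands(blow5, fast5_dir, fastq_dir, config)
        for cmd in commands:
            if dry_run:
                # Only show what would be run.
                print(shlex.join(cmd))
            else:
                subprocess.run(cmd, check=True)
        # The temporary FAST5 copy is deleted when the block exits.
        return commands


if __name__ == "__main__":
    basecall_via_fast5("reads.blow5", "basecalls", "dna_r9.4.1_450bps_hac.cfg")
```

With `dry_run=True` (the default here) it only prints the two commands, so you can check them against your own setup before running anything. The temporary FAST5 copy is removed afterwards, so only the BLOW5 and the basecalls stay on disk.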
Thank you James and Hasindu. That's really helpful to know, and I will try your Bonito/Dorado versions.
@hasindu2008 also had a crazy idea last night thinking about this.
So stay tuned for updates (if it works)
I am also currently dealing with this same situation! The slow5 format looks excellent and I'm very excited to try this in our workflow. I am setting it up on a GridION and hoping to use the blow5 with guppy.
Wouldn't converting blow5 back to fast5 defeat the purpose computationally (aside from using less space)?
Thanks for this awesome tool @Psy-Fer, @hasindu2008, @SziKayLeung, and team
@lacoak21
You are right. If we basecall directly from S/BLOW5, it is much faster. But unfortunately, Guppy is closed source, and although we have a version of slow5-Guppy through the developer agreement, the terms of the agreement do not allow us to release it.
@Psy-Fer is working on a workaround for this and will let you know the outcome soon. In the long run, however, given that ONT has announced that their open-source basecaller Dorado is going to replace guppy as the mid-term plan, we will release our own forked version of Dorado with S/BLOW5 support.
However, even if you convert to fast5, having S/BLOW5 would still be beneficial, not just in terms of space but also because you can run other community-developed tools such as nanopolish, f5c, etc. an order of magnitude faster than with fast5. If there are any community-developed tools that you use and want us to look into supporting S/BLOW5, let us know.
Also, community developers could focus on the actual research problem rather than wasting two-thirds of their effort and time on understanding and dealing with the complexity, idiosyncrasies and ad-hocness of FAST5. S/BLOW5 is also about human efficiency and not just compute efficiency. Here is a post I wrote about the design philosophy of S/BLOW5: https://hasindu2008.github.io/slow5specs/design.html
Thanks so much for your response @hasindu2008!! I had not realized that ONT is moving to Dorado.
After reading the design philosophy, this will clearly help us a lot in the long run.
Dorado is still in "preview" release and isn't yet feature complete. It probably won't be production ready until next year, so guppy is still very much the way to go.
In some good news, I have a prototype that works around both problems: slow5 not working with guppy, and us not being able to share our slow5-compatible version of guppy.
It's still a little rough and needs a little testing and benchmarking, but it should be ready for an alpha release next week for you to try if you are keen.
Stay tuned! James
Hey,
give this a try
https://github.com/Psy-Fer/buttery-eel
Basecalling with guppy.
Thanks for this issue, it helped us think about other ways to go about this.
Have fun James
Wow thanks James and Team! This is an exciting update.
Luisa
Thank you so much James et al. Will try this out and let you know how I get on!
Hello @hasindu2008,
Thank you for developing slow5tools - it's been really useful to compress and store ONT data!
Apologies if this is a naïve question but I thought it was possible to basecall blow5 files using Guppy. I have successfully converted my fast5 files to slow5, and am trying to basecall, however I get the following error:
With the commands (slow5tools 0.5.1):
Am I missing something (an argument, perhaps), or is basecalling only possible with the original raw fast5 files?
Thank you, Szi Kay