skovaka / uncalled4

MIT License

guppy is so slow #21

Closed: abhhba closed this issue 5 months ago

abhhba commented 6 months ago

When I use Guppy to basecall fast5 files with the --align-ref option, the run time increases several hundredfold. My fast5 files are approximately 100GB each. Is this normal? If so, preprocessing a single 100GB fast5 file would take several months.

skovaka commented 6 months ago

I have not noticed this, although I haven't performed a direct comparison. Are you using a GPU? You could try running with --bam_out instead, which outputs unaligned BAM files that you can convert via samtools fastq -T "mv,ts,pi,sp,ns"... and align using minimap2 -y... (see the README overview). You could also try Dorado, which might be better maintained than Guppy now.
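For anyone following along, here is a rough sketch of how the convert-and-align step described above might be chained together. It only builds the command string and does not run anything. The file names (calls.bam, ref.fa, aligned.bam), the helper name build_pipeline, the -a and -x map-ont options to minimap2, and the final samtools sort step are assumptions added for illustration; only the -T tag list and minimap2 -y come from the comment itself, so check the README overview for the exact recommended invocation.

```python
import shlex

# Tag list quoted in the comment above; -T copies these BAM tags into the FASTQ header.
MOVE_TAGS = "mv,ts,pi,sp,ns"


def build_pipeline(unaligned_bam, reference, out_bam):
    """Return a shell pipeline string (hypothetical helper; nothing is executed)."""
    # Convert the unaligned BAM to FASTQ on stdout, keeping the listed tags.
    to_fastq = ["samtools", "fastq", "-T", MOVE_TAGS, unaligned_bam]
    # -y copies the FASTQ comments (the tags) into the alignment output.
    # -a (SAM output) and -x map-ont (nanopore preset) are assumptions here.
    align = ["minimap2", "-y", "-a", "-x", "map-ont", reference, "-"]
    # Assumed final step: sort the alignments into a BAM file.
    sort = ["samtools", "sort", "-o", out_bam, "-"]
    return " | ".join(shlex.join(cmd) for cmd in (to_fastq, align, sort))


if __name__ == "__main__":
    print(build_pipeline("calls.bam", "ref.fa", "aligned.bam"))
```

The printed string can be pasted into a shell once samtools and minimap2 are installed. Building it with shlex.join keeps paths containing spaces safely quoted.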

abhhba commented 5 months ago

Thank you very much for your explanation. I have used Dorado, and it has significantly improved the speed. Now I can generate BAM files with the MV tag correctly.