ktan8 / nanopore_telomere_basecall

Warning messages regarding fast5 files #7

Closed yjx1217 closed 1 year ago

yjx1217 commented 1 year ago

Hello, I was trying to run fullpipeline.pl with the following command:

perl fullpipeline.pl $prefix.preliminary_basecalling.fasta $fast5_dir $prefix

The process finished with the following warning messages:

DEBUG:h5py._conv:Creating converter from 5 to 3
| 0% ETA: --:--:--
| 148 of 148|########################################|100% Time: 0:00:00
INFO:Fast5Filter:0 reads extracted
WARNING:Fast5Filter:112 reads not found!

reading fast5
outputting unaligned fastq
loading model ./3_basecall_problematic_reads/../1_bonito_basecalling_model/chm13_nanopore_trained_run225/
completed reads: 0
duration: 0:00:00
samples per second 0.0E+00
done

I was wondering whether the run actually finished successfully, or whether something went wrong during fast5 file processing?

Notably, I have a sizeable $prefix.telomerefixed.fasta.gz, but the $prefix.correctedreads.fasta.gz file is empty. The summary tsv file and the $prefix directory are also empty, so I guess something went wrong.

Thanks in advance!

Best, Jia-Xing

yjx1217 commented 1 year ago

Oh, I figured it out! The warning (or, more exactly, the error) was triggered because fast5subset could not find the query reads listed in the .readname file: each read id in that file was followed by extra strings (e.g., "runid=ad4979dba383c166eb35a93b4e7c7367711fa066 read=25 ch=203 ..."), so the names never matched the read ids stored in the fast5 files. After applying "cut -d " " -f 1" to the .readname file so that only the read id remains on each line, the full protocol works.
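For anyone who would rather do this cleanup in a script than with cut, here is a minimal Python sketch of the same idea. The function names and the example read ids are made up for illustration (they are not part of the pipeline), and the only assumption about the .readname file is the one described above: one read per line, read id first, extra fields after whitespace.

```python
def strip_read_ids(lines):
    """Return the first whitespace-delimited field of each non-empty line."""
    read_ids = []
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            read_ids.append(fields[0])
    return read_ids


def clean_readname_file(src_path, dst_path):
    """Write a copy of src_path that keeps only the read id on each line.

    Both paths are chosen by the caller; nothing here is specific to the pipeline.
    Returns the number of read ids written.
    """
    with open(src_path) as src:
        read_ids = strip_read_ids(src)
    with open(dst_path, "w") as dst:
        for read_id in read_ids:
            dst.write(read_id + "\n")
    return len(read_ids)


# Hypothetical example lines; the read ids are placeholders, not real reads.
example_lines = [
    "read-id-0001 runid=ad4979dba383c166eb35a93b4e7c7367711fa066 read=25 ch=203\n",
    "read-id-0002 runid=ad4979dba383c166eb35a93b4e7c7367711fa066 read=31 ch=17\n",
    "\n",
]
cleaned = strip_read_ids(example_lines)
print(cleaned)
```

One difference from cut -d " " -f 1: split() treats tabs and repeated spaces as separators as well, and blank lines are dropped instead of being passed through.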