nanoporetech / tombo

Tombo is a suite of tools primarily for the identification of modified nucleotides from raw nanopore sequencing data.

[resquiggle] All my reads were unsuccessful #404

Open JAEYOONSUNG opened 1 year ago

JAEYOONSUNG commented 1 year ago

Hello, marcus1487. This is my first methylation analysis with tombo using nanopore sequencing data, so I hope you will bear with me even if this question is rather vague.

I used the 'tombo preprocess annotate_raw_with_fastqs' command to transfer the basecalling information from the fastq file produced by guppy into the fast5 files. I then ran resquiggle on the fast5 files and the reference, with the additional options --num-most-common-errors 5 --overwrite --ignore-read-locks. According to the 5 most common unsuccessful read types, all reads were unsuccessful (~85%: Poor raw to expected signal matching (revert with 'tombo filter clear_filters'); ~15%: Read event to sequence alignment extends beyond bandwidth; etc.). Thinking that the --signal-matching-score threshold might be the cause, I ran 'clear_filters' and then reset the score to '0.5', but I still got the same rate of unsuccessful reads. The output is: Filtered 917791 reads (100.0% of previously filtered and 100.0% of all valid reads) reads due to signal matching filter from [my fast5-basedirs]
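The steps described above can be sketched roughly as the following command sequence. This is only an illustration: the paths are placeholders rather than the ones actually used, the `run` wrapper is a hypothetical helper that prints each command instead of executing it unless DRY_RUN=0 is set, and the option spellings follow the Tombo documentation as far as I understand it, so please check them against `tombo <command> --help`.

```shell
#!/bin/sh
# Placeholder paths (assumptions, not the real ones from this issue).
FAST5_DIR="${FAST5_DIR:-path/to/fast5s}"
FASTQ="${FASTQ:-basecalls.fastq}"
REFERENCE="${REFERENCE:-reference.fasta}"

# Hypothetical helper: dry run by default, so each command is only printed.
# Set DRY_RUN=0 to actually execute the commands.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# 1. Copy the guppy basecalls from the fastq file into the fast5 files.
run tombo preprocess annotate_raw_with_fastqs \
    --fast5-basedir "$FAST5_DIR" --fastq-filenames "$FASTQ"

# 2. Resquiggle against the reference and report the 5 most common errors.
run tombo resquiggle "$FAST5_DIR" "$REFERENCE" \
    --num-most-common-errors 5 --overwrite --ignore-read-locks

# 3. Clear existing filters, then re-apply the signal matching filter at 0.5.
run tombo filter clear_filters --fast5-basedirs "$FAST5_DIR"
run tombo filter raw_signal_matching \
    --fast5-basedirs "$FAST5_DIR" --signal-matching-score 0.5
```

Run as-is, the script only echoes the four commands, which makes it easy to review the exact invocations before executing them on real data.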

It seems that my reads do not map properly to my reference sequence. Where should I start over? I hope you can help me identify which data caused the problem. For reference, the reference fasta file is a nanopore assembly that was subsequently corrected with Illumina short reads.

Screenshot from 2022-08-01 19-54-08

JAEYOONSUNG commented 1 year ago

I used the MinION platform with an R10.3 flow cell for sequencing. The fast5 format seems to have changed between the R9.x and R10 versions. I checked papers by other analysts and found that they were using R9.x. Could the problem be that tombo resquiggle does not handle this fast5 format? I have no idea what the problem is and would appreciate feedback from the developers.