mbhall88 closed this issue 6 months ago
Yes, this can happen because of read splitting. The original reads from the pod5 are split into subreads, which have different read ids. However, each of those records will have a pi:Z tag which points to the original read id it came from.
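To make the pi:Z lookup concrete, here is a minimal sketch of resolving each FASTQ record to the read id that should exist in the pod5. It assumes the tags were carried into the FASTQ header as whitespace-separated fields after the read id (the fn:Z tag mentioned in the report suggests they were), that records are plain 4-line FASTQ, and that reads without a pi:Z tag were not split. The function names are made up for illustration.

```python
def parent_read_id(header: str) -> str:
    """Return the pod5 read id for one FASTQ header line.

    Assumes the header looks like '@<read_id>\t<tag>\t<tag>...'.
    If a pi:Z tag is present, the record is a split subread and the
    tag value is the original read id; otherwise the record's own
    id is used.
    """
    text = header.strip()
    if text.startswith("@"):
        text = text[1:]
    fields = text.split()
    read_id = fields[0]
    for field in fields[1:]:
        if field.startswith("pi:Z:"):
            return field[len("pi:Z:"):]
    return read_id


def parent_read_ids(fastq_lines):
    """Collect the set of original read ids from 4-line FASTQ records."""
    ids = set()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 0:  # header line of each record
            ids.add(parent_read_id(line))
    return ids
```

The resulting set could then be written out one id per line and used for the pod5 lookup in place of the raw FASTQ read ids.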
Could you explain what will cause a Pod5 read to be split?
Issue Report
Please describe the issue:
I extracted the read IDs from a fastq that comes from a sup dorado basecalling run (v0.5.0). I then passed those read ids to pod5 subset and pointed pod5 at the pod5s I basecalled, and there are read ids from the fastq that do not exist in the pod5s. And indeed, when I inspect the pod5 file a particular fastq read comes from (using the fn:Z:<fname> tag), that read id doesn't actually exist in there...
Is this a known thing?
In the runs I am working with there are 2494275/34092978 (7.3%) fastq reads with no associated pod5 read.
Run environment: