Closed chAwater closed 3 years ago
Whoops, you're right! A while ago I made the `map` PAF output behave similarly to `realtime` after some code refactoring (i.e. only include how much of the read was considered), and forgot to update the README to reflect that. To be honest, none of those read length/position estimates are very accurate, since they assume the DNA moves at a constant 450 bp/sec, while the average is closer to 400. I plan to improve that in the future, but just went with the easiest solution for now.
You can compute what I used to put in that field by multiplying the "duration" column by 450 in the "sequencing_summary.txt" file output by basecallers. A better estimate could be based on "template_duration", which trims a bit of un-basecalled signal.
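A rough sketch of that calculation in Python, for anyone who wants the old full-read estimate. It assumes the summary file is tab-separated with "read_id", "duration", and "template_duration" columns (check your basecaller's output), and the function name `estimate_read_lengths` is made up for this example; it is not part of UNCALLED.

```python
import csv

# Constant translocation speed assumed by the old estimate (bases per second).
BP_PER_SEC = 450


def estimate_read_lengths(summary_path, duration_col="duration", bp_per_sec=BP_PER_SEC):
    """Return {read_id: estimated length in bases} from a sequencing summary file.

    Hypothetical helper: the length is simply the chosen duration column
    (in seconds) multiplied by an assumed constant speed.
    """
    estimates = {}
    with open(summary_path, newline="") as handle:
        # Assumes a tab-separated file with a header row.
        for row in csv.DictReader(handle, delimiter="\t"):
            seconds = float(row[duration_col])
            estimates[row["read_id"]] = round(seconds * bp_per_sec)
    return estimates
```

Passing `duration_col="template_duration"` gives the variant that skips the un-basecalled signal, and `bp_per_sec=400` swaps in the observed average speed mentioned above. Either way this is still a rough estimate, not a measured length.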
Would it be helpful to you if I made the read length field estimate based on the full read again? It's not too difficult to fix, but I didn't think it was worth complicating the code since it's such a rough estimate.
I've updated the README to reflect how it actually works, but I'm still open to suggestions.
Thank you for the quick response and the update in https://github.com/skovaka/UNCALLED/commit/0fc1cab738b305bdb2f2b647312f00baac642d3b . It completely answers my question.
I think the estimation is good enough for now, and I have no idea how to improve it. But I really like the idea of `realtime` (both in 3rd-gen. sequencing and UNCALLED). So just let it be! :smile:
Hi,
I found a "weird" result in .paf files involving `Query sequence length` (the 2nd column) and `Query end coordinate` (the 4th column).
Is that a misplaced property for real-time mode? Where can I find the "real" read or event length?
For example:
Output: