Closed: b2jia closed this 2 years ago
Hi @b2jia, you're right that the third-to-last field is the haplotype and the second-to-last field is the mapping quality. You can find the relevant code in the file hickit.js.
I'm not sure what the last field is. However, as far as I understand from code in io.c, the last field is NOT used in downstream analysis -- @lh3 please correct me if I'm wrong.
The last field counts the number of distant segments a read pair has. It is not actually used IIRC.
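For anyone landing here later, the field order discussed in this thread can be sketched as a small parser. This is only an illustration, not code from hickit: the names `Segment` and `parse_segment` are made up, the example segment is invented rather than taken from GSE162511, and it assumes every segment has exactly 7 `!`-delimited fields. The haplotype is kept as a raw string because its encoding is not spelled out here.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """One segment, fields in the order described in this thread."""
    chrom: str
    start: int
    end: int
    strand: str
    haplotype: str   # third-to-last field; kept as text, encoding not assumed
    mapq: int        # second-to-last field: mapping quality
    n_distant: int   # last field: number of distant segments in the read pair


def parse_segment(text: str, sep: str = "!") -> Segment:
    """Split one segment string into its 7 fields (hypothetical helper)."""
    fields = text.strip().split(sep)
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}: {text!r}")
    chrom, start, end, strand, hap, mapq, n_distant = fields
    return Segment(chrom, int(start), int(end), strand, hap, int(mapq), int(n_distant))


# Invented example segment, for illustration only.
seg = parse_segment("chr1!1000!1100!+!0!60!1")
print(seg.haplotype, seg.mapq, seg.n_distant)
```

Since the last field is reportedly unused downstream, a parser like this could just as well discard it.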
Thank you both!
The current documentation on the seg format is sparse. Can someone help disambiguate the seg format?
Below I've copied an example Dip-C read (GSE162511), delimited by `!`. While the first 3 fields are more or less decipherable, what do the last 4 fields correspond to?

chromosome | start coordinate | end coordinate | strand | haplotype? | score? | ? (always 1 or 2)