4dn-dcic / pairix

1D/2D indexing and querying on bgzipped text file with a pair of genomic coordinates
MIT License

pypairix linecount no longer reverting to int #61

Closed: SooLee closed this 5 years ago

SooLee commented 5 years ago

The integer overflow in pypairix's get_linecount is now fixed (no need to re-index).

Before:

>>> import pypairix
>>> tb = pypairix.open("mounted/4DNFIXX3YX2S.pairs.gz")
>>> tb.get_linecount()
-1769618406

After:

>>> import pypairix
>>> tb = pypairix.open("mounted/4DNFIXX3YX2S.pairs.gz")
>>> tb.get_linecount()
2525348890
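
The two numbers above differ by exactly 2^32, which is what you would see if the true count were squeezed into a signed 32-bit integer somewhere along the way. That is an assumption about the cause, not something the fix itself is shown to confirm here. A minimal sketch of the arithmetic (the helper name `wrap_to_int32` is made up for illustration and is not part of pypairix):

```python
import struct


def wrap_to_int32(n):
    """Reinterpret the low 32 bits of n as a signed 32-bit integer.

    Hypothetical helper for illustration only; not a pypairix function.
    """
    low_bits = n & 0xFFFFFFFF  # keep only the low 32 bits
    return struct.unpack("<i", struct.pack("<I", low_bits))[0]


true_count = 2525348890  # the "After" value reported above
wrapped = wrap_to_int32(true_count)

print(wrapped)                  # -1769618406, the "Before" value
print(true_count - wrapped)     # 4294967296, i.e. 2**32
```

Any line count above 2147483647 (2^31 - 1) would wrap to a negative number in the same way, so files smaller than that would not have shown the problem.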

In addition, I found that on some systems (possibly depending on the gcc version, though this is not clear), autoflip causes a segmentation fault or returns an empty result. This is now fixed. It affected pairix -Y, pairix -a, pypairix.check_triangle() and pypairix.querys2D(). It does not affect the results on the 4DN Data Portal.

SooLee commented 5 years ago

@carlvitzthum The version mismatch issue in test.py happens with Python 3.6 but not with Python 2.7.