Open Runsheng opened 6 years ago
I’ll reply in more depth on Monday, when I’m back at work :) Pyranges should be able to do this. The repo is on my GitHub :)
Thank you very much! I am currently using pybedtools to calculate the intersection between ranges. However, computing the intersection matrix between 300 mRNA tracks (each containing around 15 ranges) takes about 400 seconds on a 32-core server. I will try Pyranges first and give you some feedback.
Pyranges is still largely unused. The unit tests pass, but it might still have bugs or not work.
I would also look into this potential error in bedtools jaccard: https://github.com/arq5x/bedtools2/issues/645 Whether it is a bug, and whether it matters, I don't know :)
Also, if you use pybedtools, it is advisable to presort the data first. It is much faster that way.
Is there any method in ncls that returns the intersection and union of two ranges? For instance, range(1, 10) and range(5, 15) would return (5, 10) and (1, 15).
Or could it simply return the lengths of the intersection and union, like bedtools jaccard does? https://bedtools.readthedocs.io/en/latest/content/tools/jaccard.html
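For a single pair of ranges, what I am after can be sketched in plain Python. This is only an illustration and not part of the ncls API: the names `intersect_and_union` and `overlap_lengths` are made up here, and ranges are assumed to be half-open `(start, end)` tuples, like Python's `range`.

```python
def intersect_and_union(a, b):
    """Return (intersection, union) of two half-open (start, end) ranges.

    Hypothetical helper, not an ncls function. If the ranges do not
    overlap, their union is not a single range, so (None, None) is returned.
    """
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    if start >= end:
        return None, None
    return (start, end), (min(a[0], b[0]), max(a[1], b[1]))


def overlap_lengths(a, b):
    """Return (intersection_length, union_length) in bases.

    Also works for disjoint ranges: the intersection length is then 0
    and the union length is the sum of both range lengths.
    """
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter, union


print(intersect_and_union((1, 10), (5, 15)))  # ((5, 10), (1, 15))
print(overlap_lengths((1, 10), (5, 15)))      # (5, 14)
```

Dividing the first length by the second gives a Jaccard-style ratio for one pair of ranges. For whole tracks with many ranges, the per-pair lengths would still have to be merged and summed, which is the part I hope a library can do quickly.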