vibansal / HapCUT2

software tools for haplotype assembly from sequence data
BSD 2-Clause "Simplified" License

Enable the use of the MI tag to detect molecules #114

Closed by pontushojer 3 years ago

pontushojer commented 3 years ago

Thank you for developing this excellent tool!

This PR adds support for using the MI tag to link fragments for haplotype phasing with HapCUT2.

LinkFragments.py currently detects molecules by finding all reads that share a barcode and lie within a defined distance of each other. Some tools, however, already assign reads to molecules using the MI tag, for example 10x longranger and ema. The reads that make up a molecule defined by the MI tag may differ from those found by the current method. For example, longranger does not tag every read with the MI tag.
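To illustrate how the two definitions can disagree, here is a minimal sketch in plain Python. It is not the code in LinkFragments.py or in this PR: the `Read` record, the helper names `group_by_distance` and `group_by_mi`, the use of read start positions for the distance check, and the example values are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Read(NamedTuple):
    """Hypothetical stand-in for an aligned read; not a pysam or HapCUT2 type."""
    name: str
    pos: int            # leftmost mapped position
    barcode: str        # barcode, e.g. the value of the BX tag
    mi: Optional[int]   # value of the MI tag, or None if the read has no MI tag


def group_by_distance(reads, max_gap):
    """Group reads that share a barcode, starting a new molecule whenever
    two consecutive reads are more than max_gap apart."""
    by_barcode = defaultdict(list)
    for read in reads:
        by_barcode[read.barcode].append(read)

    molecules = []
    for barcode_reads in by_barcode.values():
        barcode_reads.sort(key=lambda r: r.pos)
        current = [barcode_reads[0]]
        for read in barcode_reads[1:]:
            if read.pos - current[-1].pos > max_gap:
                molecules.append(current)
                current = []
            current.append(read)
        molecules.append(current)
    return molecules


def group_by_mi(reads):
    """Group reads by MI tag value; reads without an MI tag are skipped."""
    by_mi = defaultdict(list)
    for read in reads:
        if read.mi is not None:
            by_mi[read.mi].append(read)
    return list(by_mi.values())


# Made-up example: r3 shares the barcode but carries no MI tag.
reads = [
    Read("r1", 1000, "AAAC", 1),
    Read("r2", 5000, "AAAC", 1),
    Read("r3", 9000, "AAAC", None),
    Read("r4", 90000, "AAAC", 2),
]

distance_molecules = [[r.name for r in m] for m in group_by_distance(reads, max_gap=20000)]
mi_molecules = [[r.name for r in m] for m in group_by_mi(reads)]
```

In this toy input the distance-based grouping places r3 in the first molecule, while the MI-based grouping leaves it out because it has no MI tag.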

vibansal commented 3 years ago

Thank you for adding this functionality.