apcamargo / pycoverm

Simple Python interface to CoverM's fast coverage estimation functions
GNU General Public License v3.0

Sort returned output #4

Closed by jakobnissen 3 years ago

jakobnissen commented 3 years ago

The output of get_coverages_from_bam comes back in an arbitrary order. It would be nicer if it were returned in the same order as in the BAM files, which is what CoverM does when called from the command line.
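For illustration, a caller could restore BAM order on their own side after the fact. The sketch below is only that: the helper `reorder_to_reference` and the sample data are hypothetical, the coverage rows are plain lists rather than whatever array type the library returns, and the reference order is assumed to have been read from the BAM header by some other means.

```python
def reorder_to_reference(names, rows, reference_order):
    """Rearrange (names, rows) so that they follow reference_order.

    names           -- contig names in the (arbitrary) order they were returned
    rows            -- one coverage row per name, aligned with `names`
    reference_order -- contig names in the desired order (e.g. BAM header order)
    """
    row_by_name = dict(zip(names, rows))
    # Keep only contigs that are actually present in the output.
    ordered = [name for name in reference_order if name in row_by_name]
    return ordered, [row_by_name[name] for name in ordered]


# Hypothetical output in arbitrary order, two samples per contig.
names = ["contig_3", "contig_1", "contig_2"]
rows = [[3.0, 3.5], [1.0, 1.5], [2.0, 2.5]]

# Hypothetical contig order taken from the BAM header.
bam_order = ["contig_1", "contig_2", "contig_3"]

sorted_names, sorted_rows = reorder_to_reference(names, rows, bam_order)
```

This costs one extra pass and a temporary dictionary, which is why having the library return the right order directly is preferable.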

apcamargo commented 3 years ago

Hi @jakobnissen! Are you referring to the order of the contigs in the array?

The reason is that the contigs (and their coverage vectors) are stored in a HashMap, which doesn't preserve insertion order. I did this to support the contig_set parameter, whose purpose is to reduce memory usage when you're only interested in a subset of contigs (e.g., only binned contigs).

If that's indeed your problem, I think I can fix this quickly using an ordered hash table.
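The idea of combining a contig filter with an insertion-ordered table can be sketched in Python, where the built-in dict keeps insertion order (guaranteed since Python 3.7). This is not pyCoverM's actual implementation: the function `collect_coverages` and the `(contig, coverage)` record format are assumptions made purely for illustration.

```python
def collect_coverages(records, contig_set=None):
    """Group coverage values by contig, keeping first-seen order.

    records    -- iterable of (contig_name, coverage) pairs, assumed to
                  arrive in BAM order (hypothetical input format)
    contig_set -- optional set of contig names to keep; everything else
                  is skipped so it never occupies memory
    """
    coverages = {}  # dict preserves insertion order in Python 3.7+
    for name, coverage in records:
        if contig_set is not None and name not in contig_set:
            continue
        coverages.setdefault(name, []).append(coverage)
    return coverages


# Hypothetical records: three contigs, two samples each, in BAM order.
records = [
    ("contig_1", 4.0), ("contig_2", 0.5), ("contig_3", 7.2),
    ("contig_1", 3.8), ("contig_2", 0.7), ("contig_3", 6.9),
]

all_contigs = collect_coverages(records)
binned_only = collect_coverages(records, contig_set={"contig_3", "contig_1"})
```

Even though the filter is an unordered set, the result follows the order in which contigs were first seen in the input, so the memory savings of the filter and a stable output order are compatible.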

apcamargo commented 3 years ago

pyCoverM now returns the contigs in the same order as in the BAM files. Let me know if that solves your issue!

jakobnissen commented 3 years ago

That's exactly what I was looking for, thank you!