ChristopherWilks / megadepth

BigWig and BAM utilities

Bigwig coverage from annotations #6

Open jeskowagner opened 3 years ago

jeskowagner commented 3 years ago

Hello,

Thanks a lot for creating megadepth! Its bam processing capabilities seem great and extremely fast.

I was wondering whether some bigwig functionalities could still be implemented. Specifically, I was hoping for a more efficient version of bwtool's matrix function, which is comparable to deeptools' computeMatrix. That is, functionality to extract base-resolution coverage from a bigwig for each region from an annotation file (in BED6).

The motivation behind this is that during conversion from bam to bigwig various corrections may be conducted (e.g. fragment centering) that would be difficult to include in the bam format.

Thanks in advance and best wishes, Jesko

ChristopherWilks commented 3 years ago

Hi @jeskowagner,

Thanks for the interest in Megadepth.

I think I'd need to hear a bit more about your use case for the proposed BigWig feature.

While I'm open to extending Megadepth's functionality, my time is limited as it's no longer my primary focus (I'm no longer at Hopkins as a PhD student).

Also, to be clear, while you may indeed be asking for something more, Megadepth does currently support taking a BigWig and an annotation BED file and returning base coverages from the BigWig file summarized over the ranges in the BED file, though the output is not a matrix and it handles only one BigWig at a time.

e.g.: megadepth <bigwig> --annotation regions.bed --op mean
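
To make the summarized output concrete, here is a minimal Python sketch of a per-region mean. It only illustrates the idea and is not Megadepth's implementation: the function name `mean_coverage` and the in-memory `coverage` dictionary (a stand-in for values read from a BigWig) are hypothetical, and it assumes each BED range is summarized separately using 0-based, half-open coordinates.

```python
from typing import Dict, List, Tuple

# (chrom, start, end) in BED-style 0-based, half-open coordinates.
Region = Tuple[str, int, int]


def mean_coverage(coverage: Dict[str, List[float]],
                  regions: List[Region]) -> List[float]:
    """Return one mean coverage value per region.

    `coverage` maps a chromosome name to its per-base coverage values.
    It is a hypothetical in-memory stand-in for a BigWig file.
    """
    means = []
    for chrom, start, end in regions:
        values = coverage[chrom][start:end]
        # Guard against empty ranges to avoid dividing by zero.
        means.append(sum(values) / len(values) if values else 0.0)
    return means


# Toy data: one chromosome with eight bases of coverage.
coverage = {"chr1": [0, 0, 2, 4, 4, 2, 0, 0]}
regions = [("chr1", 2, 6), ("chr1", 0, 2)]
print(mean_coverage(coverage, regions))
```

Each region collapses to a single number here, so the per-base profile inside the region is lost.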

jeskowagner commented 3 years ago

Thanks a lot for your fast response!

Indeed what I am looking for is similar to existing functionality. Specifically, I would like to keep base coverage per region in the annotation file, instead of aggregating it with e.g. mean across all regions.

The desired output would be an n*m matrix, where n is the number of regions in regions.bed and m is the width of the desired window around the center of each region.

A command could look like this: megadepth <bigwig> --annotation regions.bed --op keep --upstream 1000 --downstream 1000
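
The n*m matrix described above could be sketched in Python as follows. This is an illustration under stated assumptions, not an existing Megadepth feature: `coverage_matrix` and the in-memory `coverage` dictionary are hypothetical, the window is taken as `upstream` bases before and `downstream` bases from the region center (so m = upstream + downstream), minus-strand rows are reversed, and positions beyond the chromosome ends are filled with `None`.

```python
from typing import Dict, List, Optional, Tuple

# (chrom, start, end, strand); start/end are 0-based, half-open as in BED.
Region = Tuple[str, int, int, str]


def coverage_matrix(coverage: Dict[str, List[float]],
                    regions: List[Region],
                    upstream: int,
                    downstream: int) -> List[List[Optional[float]]]:
    """Build an n x m matrix of per-base coverage around region centers.

    n = len(regions), m = upstream + downstream. `coverage` is a
    hypothetical stand-in for per-base values read from a BigWig.
    """
    matrix = []
    for chrom, start, end, strand in regions:
        center = (start + end) // 2
        if strand == "-":
            # Assumption: on the minus strand, upstream lies at higher
            # coordinates, so shift the window and reverse the row below.
            lo, hi = center - downstream + 1, center + upstream + 1
        else:
            lo, hi = center - upstream, center + downstream
        track = coverage[chrom]
        # Positions outside the chromosome are marked as missing (None).
        row = [track[pos] if 0 <= pos < len(track) else None
               for pos in range(lo, hi)]
        if strand == "-":
            row.reverse()
        matrix.append(row)
    return matrix


# Toy data: coverage equals the base position, which makes rows easy to read.
coverage = {"chr1": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]}
regions = [("chr1", 4, 6, "+"), ("chr1", 8, 10, "-")]
for row in coverage_matrix(coverage, regions, upstream=2, downstream=2):
    print(row)
```

With this layout the region center always lands in column `upstream`, regardless of strand, so rows can be stacked and averaged column-wise afterwards.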

I hope this has provided better insight into the desired input/output. Given your time constraints I fully understand if there is currently no prospect of implementing this, but given the similarity to existing functionality I was hoping there may be some chance that it is not too much of a hassle.

Cheers, Jesko

ChristopherWilks commented 3 years ago

Hi @jeskowagner

This is just an update to say I haven't forgotten this request. I actually did make some progress in implementing it, but I ran into an issue with Megadepth's output not matching bwtool's, which I'm using as a "gold standard" for this feature. It'll require some more work to figure that out and I'm fairly busy right now, so I can't promise when I'll get to it, but I'll keep you informed.

Chris

jeskowagner commented 3 years ago

Thanks for the update! Since (as far as I can tell) bwtool is not under active development, I would not be surprised to hear of bugs in its output. Perhaps deeptools is a more suitable, albeit slower, reference. I fully understand your time limitations and very much appreciate your efforts. I also want to mention that this issue is not high priority for me, but rather something that I thought could fit nicely into the scope of megadepth. I apologize that I cannot contribute to the development efforts, as I am not (yet) C++ literate.

Best, Jesko