ChristopherWilks / megadepth

BigWig and BAM utilities
Other
91 stars 9 forks source link

0 Signal Across Regions #9

Closed osakarya closed 3 years ago

osakarya commented 3 years ago

Hi,

I am encountering a possible issue which can be seen by the results of the following 3 runs:

Run 1: a single bed region that has positive signal through the whole region: megadepth http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeHaibTfbs/wgEncodeHaibTfbsH1hescSp1Pcr1xRawRep1.bigWig --annotation test_bed1.tsv --keep-order --op mean

building whole annotation region map done
1 chromosomes for annotated regions read
Processing http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeHaibTfbs/wgEncodeHaibTfbsH1hescSp1Pcr1xRawRep1.bigWig
chr1    121484600       121484636       24.43

Run 2: a single bed region that has 0 (or NA) signal through the whole region: megadepth http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeHaibTfbs/wgEncodeHaibTfbsH1hescSp1Pcr1xRawRep1.bigWig --annotation test_bed2.tsv --keep-order --op mean

building whole annotation region map done
1 chromosomes for annotated regions read
Processing http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeHaibTfbs/wgEncodeHaibTfbsH1hescSp1Pcr1xRawRep1.bigWig
chr1    121481830       121482451       0.00

Run 3: two bed regions from above together (i.e. one region with signal and one with 0/NA signal): megadepth http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeHaibTfbs/wgEncodeHaibTfbsH1hescSp1Pcr1xRawRep1.bigWig --annotation test_bed3.tsv --keep-order --op mean

building whole annotation region map done
1 chromosomes for annotated regions read
Processing http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeHaibTfbs/wgEncodeHaibTfbsH1hescSp1Pcr1xRawRep1.bigWig
chr1    121481830       121482451       0.00
chr1    121484600       121484636       0.00

Both regions in the final query run have 0.00 signals which I wasn't expecting as such since the first region when run by itself has non-zero results (also confirmed that there is a positive signal by looking at the bigWig manually). I was wondering if this is the expected behavior? To note, above results were observed with both linux exec of 1.0.4 and 1.1.0c release.

ChristopherWilks commented 3 years ago

Hi @osakarya,

Thanks for the bug report. I've been traveling and haven't had a chance to really focus on Megadepth for a while. However, I'm back and will be looking at your issue today.

I'll post back with what I find, Chris

ChristopherWilks commented 3 years ago

I think I found the source of the issue.

The problem is that the main overlap check in Megadepth is only looking for overlaps with the start of the annotated coordinate (e.g. the start of an exon).

It was missing overlaps that don't include the start coordinate, including target regions which are strictly contained within the annotated interval.

This is a serious bug and I'll be releasing a fix for it very soon.

ChristopherWilks commented 3 years ago

I should point out, this bug only affects the BigWig processing against an annotation, it has no effect on the BAM processing in any case (completely different code path).

ChristopherWilks commented 3 years ago

@osakarya I've released a bug fix: https://github.com/ChristopherWilks/megadepth/releases/tag/1.1.1

osakarya commented 3 years ago

Hi Chris,

I've tested this new 1.1.1 release on the examples above. It is no longer reporting 0s. Also manually checked a set of regions on a bigWig from IGV, in which megadepth seems to capture average signal correctly (was testing the mean function), whether run on a single bed region, or on a group of bed regions some of which include null signals.

Thanks for the quick fix, and will let you know if we encounter anything else.

Onur

ChristopherWilks commented 3 years ago

Thanks @osakarya for the response and for reporting back on your testing, that's good to know.