deeptools / HiCExplorer

HiCExplorer is a powerful and easy to use set of tools to process, normalize and visualize Hi-C data.
https://hicexplorer.readthedocs.org
GNU General Public License v3.0

hicAverageRegions ValueError inconsistent shapes #340

Closed C-L-Wike closed 5 years ago

C-L-Wike commented 5 years ago

Why does hicAverageRegions work great for the domain.bed and boundaries.bed files made by hicFindTADs, but not for a BED file of ChIP MACS2 peaks?

Is there some specific formatting in domain.bed or boundaries.bed that the ChIP BED file is missing?

Thank you.

gtrichard commented 5 years ago

Hello and thank you for using HiCExplorer.

Can you define what is not working? Do you have any output? What is the error?

hicAverageRegions tends to work well for regions with an expected distribution of contacts around them, like TAD boundaries for example.

Depending on which ChIP peaks you are looking at, there may be no characteristic distribution of contacts around them. In that case, averaging the Hi-C contacts near them will only show the distance-dependent contact decay.

Maybe hicAggregateContacts with the obs/exp transformation will work better for telling whether these ChIP peaks are spatially clustered together.

brainfo commented 5 years ago

Hello, I've run into a problem: when I use hicAverageRegions around TAD boundaries with 200 kb regions, I get a normal result, but with 500 kb regions it raises ValueError: inconsistent shapes. With a BED file of CTCF summits, even 200 kb does not work. Do you know why this happens, and could you tell me how to make it work? My matrix resolution is 5 kb, and no boundary starts within the first 500 kb of any chromosome.
Thank you!

joachimwolff commented 5 years ago

Hi,

Can you check whether any region is close to the start or end of a chromosome, so that with 500 kb instead of 200 kb upstream/downstream it crosses the chromosome boundary?

Best,

Joachim

brainfo commented 5 years ago

Thank you very much! It happens because my BED file contains regions near the end of some chromosomes.

Thanks!

joachimwolff commented 5 years ago

OK, removing these regions is a workaround, but this needs to be fixed on our side in the long run.
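
For anyone on an older release who needs the workaround, here is a minimal sketch of filtering a BED file before running hicAverageRegions. It is not part of HiCExplorer: the function name `split_regions_by_flank`, the `chrom_sizes` dictionary, and the example coordinates are all hypothetical. It assumes you know the chromosome lengths and the flank size (in bp) you intend to use, and it drops every region whose flanks would extend past either chromosome end.

```python
def split_regions_by_flank(bed_lines, chrom_sizes, flank):
    """Split BED records into (kept, dropped).

    A record is kept only if [start - flank, end + flank] lies entirely
    inside its chromosome. Chromosomes missing from chrom_sizes are dropped.
    """
    kept, dropped = [], []
    for line in bed_lines:
        stripped = line.strip()
        # Skip blank lines and common BED header/comment lines.
        if not stripped or stripped.startswith(("#", "track", "browser")):
            continue
        fields = stripped.split()
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        size = chrom_sizes.get(chrom)
        inside = size is not None and start - flank >= 0 and end + flank <= size
        (kept if inside else dropped).append(stripped)
    return kept, dropped


# Hypothetical example values, not taken from this thread.
chrom_sizes = {"chr1": 1_000_000}
bed = [
    "chr1\t100000\t100001",  # flank would cross the chromosome start
    "chr1\t500000\t500001",  # far enough from both ends
    "chr1\t900000\t900001",  # flank would cross the chromosome end
]
kept, dropped = split_regions_by_flank(bed, chrom_sizes, flank=200_000)
print("kept:", kept)
print("dropped:", dropped)
```

Printing the dropped records also answers the question above of whether any region sits too close to a chromosome start or end for the chosen flank size.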

joachimwolff commented 5 years ago

Should be fixed with release 3.3.