TaoYang-dev / hicrep

R package to evaluate the reproducibility of Hi-C data

🐛 Fix a critical bug in hic2mat/bed2mat #66

Closed haizi-zh closed 3 years ago

haizi-zh commented 3 years ago

This bug leads to an incorrectly shifted Hi-C matrix.

In the code base, the only place where bed2mat is invoked is within hic2mat (https://github.com/MonkeyLB/hicrep/blob/master/R/hic2mat.R#L25). hic2mat first converts the .hic file to a BED-like data frame (via strawr::straw); bed2mat then takes over and builds the R matrix that meets the hicrep input format requirements.

However, the output of strawr::straw is a BED-like data frame of bin start coordinates and counts, as follows (the bin size in this example is 500 kb):

        x       y counts
1       0       0    442
2       0  500000     16
3  500000  500000    441
4       0 1000000     11
5  500000 1000000     46
6 1000000 1000000    452

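strawr::straw labels each bin by its start coordinate, so the "first bin" is 0, whereas R subscripts start at 1. The conversion from coordinates to matrix subscripts has to account for this offset; otherwise the resulting Hi-C matrix is shifted. This pull request corrects the subscript handling in the conversion.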
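To illustrate the indexing issue, here is a minimal sketch of the conversion using the example data above. It is not the code from this pull request: `straw_to_mat` and its parameters are hypothetical names, and it assumes the data frame holds only upper-triangle entries with coordinates that are exact multiples of the bin size. The output is a plain symmetric contact matrix, which may differ from the exact layout hicrep expects.

```r
# Hypothetical helper for illustration only; not the hicrep implementation.
straw_to_mat <- function(df, resol, n_bins = NULL) {
  # straw reports bin start coordinates, so the first bin is 0.
  # R subscripts are 1-based, hence the "+ 1".
  i <- df$x / resol + 1
  j <- df$y / resol + 1
  if (is.null(n_bins)) n_bins <- max(i, j)

  mat <- matrix(0, nrow = n_bins, ncol = n_bins)
  mat[cbind(i, j)] <- df$counts
  mat[cbind(j, i)] <- df$counts  # mirror the upper triangle
  mat
}

# The example output shown above (500 kb bins).
df <- data.frame(
  x      = c(0, 0, 500000, 0, 500000, 1000000),
  y      = c(0, 500000, 500000, 1000000, 1000000, 1000000),
  counts = c(442, 16, 441, 11, 46, 452)
)

m <- straw_to_mat(df, resol = 500000)
print(m)
```

With the `+ 1` offset, the diagonal counts 442, 441 and 452 land at `m[1, 1]`, `m[2, 2]` and `m[3, 3]`. Dropping the offset would send the first bin to subscript 0, which R does not treat as a valid matrix position, so those entries would be lost or the matrix shifted.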