Closed sergpolly closed 5 years ago
After some digging, it turned out, of course, that
https://github.com/mirnylab/cooltools/blob/74bcfe4f34f7948649e7678f33d31812bca920d7/cooltools/saddle.py#L88
is not the place one has to change to undo the trim_outliers behavior. There is still a small issue in that line, though:
x = x[(x > 0) & (x < len(binedges) + 1)]
The condition (x < len(binedges) + 1) seems redundant: digitization produces values only up to len(binedges), not up to len(binedges) + 1. E.g. for 2 bin edges, | |, the digitized values are 0 | 1 | 2.
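To make the label range concrete, here is a minimal sketch using np.digitize on made-up values. The bin edges and track values are hypothetical, and the call is an assumption about how the digitization is done; it is not copied from saddle.py.

```python
import numpy as np

# Hypothetical example: 2 bin edges, so digitized labels can only be 0, 1 or 2.
binedges = np.array([0.0, 1.0])
values = np.array([-0.5, 0.0, 0.5, 1.0, 1.5])

# Default right=False: label i means binedges[i-1] <= value < binedges[i];
# 0 is "below the first edge", len(binedges) is "at or above the last edge".
digitized = np.digitize(values, binedges)
print(digitized)  # [0 1 1 2 2]

# The largest possible label is len(binedges), so this mask never removes anything.
always_true = digitized < len(binedges) + 1
print(always_true.all())  # True
```

So with these inputs the upper bound in the original condition filters out nothing; only the (x > 0) part has an effect.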
So, if we really want to "trim outliers", this line should be:
x = x[(x > 0) & (x < len(binedges))]
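A small comparison of the two masks, as a sketch on hand-written labels (the array x below is hypothetical and stands in for the output of digitization with 3 bin edges):

```python
import numpy as np

# Hypothetical setup: 3 bin edges, so valid labels are 0..3,
# where 0 and 3 are the two "outlier" bins.
binedges = np.array([0.0, 1.0, 2.0])
x = np.array([0, 1, 2, 3, 3, 1])

# Mask as currently written: only the low outliers (label 0) are dropped.
current = x[(x > 0) & (x < len(binedges) + 1)]

# Proposed mask: both the low (0) and high (len(binedges)) outliers are dropped.
proposed = x[(x > 0) & (x < len(binedges))]

print(current)   # [1 2 3 3 1]
print(proposed)  # [1 2 1]
```

With the current mask the high-outlier label 3 survives; with the proposed one only the inner labels remain.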
The number of elements in the hist/count array returned by bincount:
https://github.com/mirnylab/cooltools/blob/74bcfe4f34f7948649e7678f33d31812bca920d7/cooltools/saddle.py#L89
should indeed be len(binedges)+1. That is really needed, at least because later on we trim the saddledata matrix and hist/count, assuming their sizes are len(binedges)+1 by len(binedges)+1 and len(binedges)+1, respectively:
https://github.com/mirnylab/cooltools/blob/74bcfe4f34f7948649e7678f33d31812bca920d7/cooltools/saddle.py#L342
This is actually done without checking that hist/count is indeed of size len(binedges)+1...
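A sketch of why the size matters, under two assumptions that are not taken from the linked code: that the count is built with np.bincount, and that "trimming" means dropping the first and last entries. Passing minlength pins the size even when the outlier labels are absent, and an explicit check makes the assumption visible:

```python
import numpy as np

# Hypothetical setup: 3 bin edges -> 4 possible labels (0..3).
binedges = np.array([0.0, 1.0, 2.0])
n_labels = len(binedges) + 1

# Labels after outliers were trimmed: neither 0 nor 3 occurs any more.
x = np.array([1, 2, 1])

# Without minlength the result is only as long as max(x) + 1.
short_hist = np.bincount(x)
print(short_hist)  # [0 2 1]

# With minlength the result always has len(binedges) + 1 entries.
hist = np.bincount(x, minlength=n_labels)
print(hist)  # [0 2 1 0]

# Explicit size check before trimming the two outlier bins
# (assumed here to be the first and last entries).
assert hist.shape == (n_labels,)
inner = hist[1:-1]
print(inner)  # [2 1]
```

If the trimming were applied to short_hist instead, it would silently drop a real bin, which is the kind of mismatch an explicit check would catch.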
Looks like a dead end, as it is hard to deal with "half-open" bins.
@Hbelaghzal found other solutions: qrange, range to include whatever is needed ...
saddle-cli to show preferential interactions using "different" tracks, like histone modifications, etc.,
range and/or qrange, to make the "saddle"-interaction plot "look right" with these "custom" tracks
range/qrange in the histogram, i.e. track-values that end up being assigned to bin 0 and bin len(binedges)+1
hist: https://github.com/mirnylab/cooltools/blob/74bcfe4f34f7948649e7678f33d31812bca920d7/cooltools/saddle.py#L88
It should be quick and easy. The only question is: does it sound like a useful/generic thing for other people, or is it too specific, a "one-off type of thing", to add to the mainline? @nvictus @golobor ?