Closed by tw164 7 years ago
This change may introduce a potential problem in the calculation of the sublocalized best location (lines 418-421 in this version). Clamping indices to the interval [0,T) helps keep LOD array access within bounds, but can imbalance the range of bins used at chromosome ends, which has the effect of 'pulling' the best location away from the nearest terminal bin. Is there a way to rebalance the calculation of sublocalized best location? Or would it be better to output an 'NA' value in such cases?
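To make the concern concrete, here is a minimal sketch of the edge effect. It assumes, purely for illustration, that the sublocalized best location is a weighted average of bin positions over a window centred on the peak bin; the actual calculation at lines 418-421 may differ. The names `clamped_window` and `weighted_location` and the example weights are hypothetical.

```python
def clamped_window(peak, half_width, T):
    """Return (left, right, truncated) for a window around `peak`,
    clamped so that the slice [left:right] stays within [0, T)."""
    left = max(0, peak - half_width)
    right = min(T, peak + half_width + 1)
    truncated = (right - left) < 2 * half_width + 1
    return left, right, truncated


def weighted_location(weights, positions, left, right):
    """Weighted mean of positions[left:right] (assumed form of the calculation)."""
    w = weights[left:right]
    p = positions[left:right]
    return sum(wi * pi for wi, pi in zip(w, p)) / sum(w)


# Hypothetical bin positions (bp) and per-bin weights.
positions = [0, 100, 200, 300, 400, 500]
T = len(positions)

# Interior peak at bin 3: the window is symmetric, so the estimate stays on the peak.
interior_weights = [1.0, 2.0, 3.0, 5.0, 3.0, 2.0]
l, r, interior_truncated = clamped_window(3, 2, T)
interior_loc = weighted_location(interior_weights, positions, l, r)

# Terminal peak at bin 0: clamping removes the left half of the window,
# so only bins to the right contribute and the estimate moves inward.
terminal_weights = [5.0, 3.0, 2.0, 1.0, 1.0, 1.0]
l, r, terminal_truncated = clamped_window(0, 2, T)
terminal_loc = weighted_location(terminal_weights, positions, l, r)

print(interior_loc, interior_truncated)   # 300.0 False
print(terminal_loc, terminal_truncated)   # 70.0 True
```

The `truncated` flag is one possible hook for either option raised above: it could trigger a rebalanced calculation or an 'NA' output when the window has been cut short.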
Thanks! And sorry for the long delay in processing this. This all looks good, but I will have to think more about the point you raised in your comment. The (potential) edge effects have bothered me since the beginning of this project, and I still don't have a great answer.
Handling of offset bin positions was not implemented consistently. This has resulted in offset bin positions being used in plotting output, but not in summary text output.
Change function `load_table` to return an array of bin edges, which is a numpy array of integer bin start positions, followed by the end position of the last bin (all in base pairs). Wherever bin starts only are used, these are explicitly set from the `bins` array to a variable `bin_starts`.
Change function `doComputation` to accept bin edge positions, and to index into these positions with the interval edge variables `left` and `right`. Bin edges are needed because the `right` index may be `T` in some cases where an interval extends beyond the rightmost bin. In such cases, the rightmost bin edge position is used. Ensure left indices of intervals are greater than or equal to zero, and right indices are less than or equal to `T`.
Change function `doPlotting` to set the horizontal plotting positions in `X` from the mid-points of the offset bins, to draw the 90% Bayesian credible interval between offset bin positions, and to plot the axis between the first and last bin edges.
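The bin-edge scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the code in this pull request: the helper names `make_bin_edges`, `interval_positions`, and `bin_midpoints` are hypothetical, and plain Python lists stand in for the numpy array so the example has no dependencies.

```python
def make_bin_edges(starts, last_bin_end):
    """Bin edges: every bin start, followed by the end of the last bin (bp)."""
    return list(starts) + [last_bin_end]


def interval_positions(bins, left, right):
    """Map interval indices to base-pair positions.

    With T bins there are T + 1 edges, so a `right` index equal to T is
    valid and selects the rightmost bin edge.
    """
    T = len(bins) - 1
    left = max(0, left)      # left index must be >= 0
    right = min(T, right)    # right index must be <= T
    return bins[left], bins[right]


def bin_midpoints(bins):
    """Mid-point of each bin, e.g. for horizontal plotting positions."""
    return [(a + b) / 2 for a, b in zip(bins[:-1], bins[1:])]


# Hypothetical example: three 1 kb bins offset to start at 1000 bp.
bins = make_bin_edges([1000, 2000, 3000], 4000)
bin_starts = bins[:-1]          # used wherever only bin starts are needed
X = bin_midpoints(bins)

print(bin_starts)                        # [1000, 2000, 3000]
print(interval_positions(bins, 1, 3))    # (2000, 4000)
print(interval_positions(bins, -1, 5))   # (1000, 4000)
print(X)                                 # [1500.0, 2500.0, 3500.0]
```

Carrying `T + 1` edges rather than `T` starts means an interval that reaches past the rightmost bin needs no special case beyond the clamp.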