Altius / hotspot2

Implementation of hotspot2 by Eric Rynes

BRM::computePandFlush() needs to always call BRM::computeStats(-1) #8

Closed: erynes closed this issue 8 years ago

erynes commented 8 years ago

Currently, when BackgroundRegionManager::computePandFlush() is called, it checks whether the estimated null region of the distribution has changed and, if so, calls computeStats(-1). If the null region cutoff needs to be computed (!m_sliding) or recomputed (m_needToUpdate_kcutoff == true), it calls findCutoff() and then computeStats(-1).

The following case was observed in a run. A string of sites with count=7 exited the window on the left while a string of sites with count=7 entered on the right, so the distribution didn't change for a few bp. Then came a huge gap (a 100k+ unmappable region on chr1). The gap caused computePandFlush() to be called, as it should have been. But several high counts in the window still needed P-values, and because the distribution was "up to date" on entry, computePandFlush() assumed that no further P-value calculations were needed for the window. As a result, P=-1 was output for stretches of sites.

While there will be situations in which there's no need to call findCutoff() from computePandFlush(), I think computePandFlush() should always call computeStats(-1), whether or not the distribution has changed. In some cases that will be slight overkill, but the cost to run time is negligible.
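To make the failure mode concrete, here is a minimal toy model of it. This is not the hotspot2 source: `ToyBRM`, `Site`, `markDistributionUpToDate()`, `missingAfterFlush()` and the dummy P-value formula are all hypothetical, and only the names `computePandFlush()` and `computeStats(-1)` echo the real code. It assumes a P-value of -1 means "not computed yet", as the output described above suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a site in the window; p < 0 means "not computed yet".
struct Site {
    int count;
    double p;
};

// Toy model only: the bodies below are invented for illustration.
class ToyBRM {
public:
    explicit ToyBRM(bool alwaysComputeStats)
        : m_alwaysComputeStats(alwaysComputeStats) {}

    void add(int count) { m_window.push_back(Site{count, -1.0}); }

    // Simulates the observed case: equal counts left and entered the window,
    // so the distribution looks unchanged on entry to computePandFlush().
    void markDistributionUpToDate() { m_distributionChanged = false; }

    // Placeholder statistics: assigns a dummy non-negative P-value to every
    // site that lacks one. The argument is ignored in this toy.
    void computeStats(int /*unused*/) {
        for (Site& s : m_window)
            if (s.p < 0.0) s.p = 1.0 / (1.0 + s.count);
    }

    // Old behavior: compute only if the distribution changed.
    // Proposed behavior: always compute before flushing.
    std::vector<Site> computePandFlush() {
        if (m_alwaysComputeStats || m_distributionChanged)
            computeStats(-1);
        m_distributionChanged = false;
        std::vector<Site> out;
        out.swap(m_window);  // flush: hand the window's sites to the caller
        return out;
    }

private:
    bool m_alwaysComputeStats;
    bool m_distributionChanged = true;
    std::vector<Site> m_window;
};

// Returns how many flushed sites still carry P = -1.
std::size_t missingAfterFlush(bool alwaysComputeStats) {
    ToyBRM brm(alwaysComputeStats);
    brm.add(7);
    brm.add(7);
    brm.add(42);                     // a high count still awaiting its P-value
    brm.markDistributionUpToDate();  // distribution unchanged, then the gap hits
    std::vector<Site> flushed = brm.computePandFlush();
    assert(flushed.size() == 3);
    std::size_t missing = 0;
    for (const Site& s : flushed)
        if (s.p < 0.0) ++missing;
    return missing;
}
```

In this toy, `missingAfterFlush(false)` leaves every flushed site at P=-1, while `missingAfterFlush(true)` leaves none, at the price of one extra pass over the window.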

erynes commented 8 years ago

Closed via commit 78dd804d4005bc72f76650a0168a01fccb90f054.