nadeemlab / SPT

Spatial profiling toolbox for spatial characterization of tumor immune microenvironment in multiplex images (https://oncopathtk.org)

Fix Ripley metric #342

Closed jimmymathews closed 1 month ago

jimmymathews commented 1 month ago

The Ripley statistic computation is currently summarized as the lowest non-zero p-value for the statistic across the various radius (distance) scales. However, the typical pattern of these p-values, as returned by the current squidpy implementation, is a spike over some range of radii with zeroes elsewhere. As a result, the lowest non-zero value is almost always 1/101: the default number of simulations for bootstrapping is 100, and the squidpy implementation adds 1 to it for some reason.
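To illustrate why this summary carries little information, here is a minimal sketch on made-up profiles. The function name `lowest_nonzero` and both example profiles are hypothetical, not taken from SPT or squidpy; the only assumption carried over from above is that p-values are multiples of 1/101.

```python
import numpy as np

def lowest_nonzero(pvalues):
    """Smallest strictly positive entry of a p-value profile, or None if all are zero."""
    p = np.asarray(pvalues, dtype=float)
    nonzero = p[p > 0]
    return float(nonzero.min()) if nonzero.size else None

n_simulations = 100            # default simulation count mentioned above
step = 1 / (n_simulations + 1)  # assumed granularity of the p-values: 1/101

# Two synthetic profiles (hypothetical data): spikes at different radii and
# of very different heights, both flanked by zeroes.
profile_a = np.array([0, 0, 1, 5, 20, 5, 1, 0, 0]) * step
profile_b = np.array([0, 0, 0, 0, 1, 40, 90, 40, 1]) * step

summary_a = lowest_nonzero(profile_a)
summary_b = lowest_nonzero(profile_b)
print(summary_a, summary_b)  # both equal 1/101 despite the different profiles
```

Any profile whose spike rises from zero through the smallest possible step gets the same summary, regardless of where the spike sits or how high it goes.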

Moreover, these p-values are actually folded by the function p -> min(p, 1-p), which erases the information carried by a value close to 1. This can be corrected by manually changing a few lines of the squidpy source.
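A small sketch of the information loss, assuming only the map p -> min(p, 1-p) described above. The helper `fold` and the sample values are hypothetical and are not the squidpy code itself.

```python
import numpy as np

def fold(p):
    """Apply p -> min(p, 1 - p) elementwise."""
    p = np.asarray(p, dtype=float)
    return np.minimum(p, 1.0 - p)

# Hypothetical raw p-values: one near 0 and one near 1.
raw = np.array([0.02, 0.98])
folded = fold(raw)

# After folding, the two cases can no longer be told apart.
print(folded)
print(np.isclose(folded[0], folded[1]))
```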

Inspecting the p-value vs. radius profiles that typically occur suggests a more useful summarization: the expected value of the radius/distance with respect to the p-value "distribution" (treating each profile as a probability distribution over radii). This tends to pick out the peak, i.e. the radius at which the observed density/frequency is generally higher than expected for a random point process.
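One way the proposed summary could be computed, as a minimal sketch: normalize the profile so it sums to 1, then take the weighted mean of the radii. The function name `expected_radius`, the handling of an all-zero profile, and the example data are assumptions for illustration, not the eventual SPT implementation.

```python
import numpy as np

def expected_radius(radii, pvalues):
    """Mean radius weighted by the p-value profile, treated as a distribution.

    Returns None when the profile is identically zero, since there is
    nothing to normalize in that case (an assumed convention).
    """
    r = np.asarray(radii, dtype=float)
    p = np.asarray(pvalues, dtype=float)
    total = p.sum()
    if total == 0:
        return None
    return float((r * p).sum() / total)

# Hypothetical profile: a symmetric spike centered at radius 30.
radii = np.array([10, 20, 30, 40, 50])
pvalues = np.array([0, 1, 2, 1, 0]) / 101

peak_estimate = expected_radius(radii, pvalues)
print(peak_estimate)  # approximately 30 for this symmetric spike
```

Unlike the lowest non-zero value, this summary moves with the location of the spike, so profiles peaking at different radii get different values.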