mjoppich / pIMZ

This framework allows you to perform scRNA-seq-like analyses of imaging mass-spectrometry data. Check out the example Jupyter notebook in examples/.
MIT License

Why exclude 0 values in `_get_median_spectrum` and also add `startedLog` with median values of all channels/z axis of regions #5

Open ranabanik opened 3 years ago

ranabanik commented 3 years ago
if startedLog == 0:
    startedLog = 0.001
self.logger.info("Started Log Value: {}".format(startedLog))
median_profile += startedLog

The code chunk can be found here.

mjoppich commented 3 years ago

This is a pseudo-intensity which is added to the median spectrum.

It ensures that any division by the median spectrum cannot result in a division by zero.

Indeed, one could infer the pseudo-part dynamically from the spectra. But since this fallback only applies if the 5% quantile is 0, we considered the chances of any negative effects to be quite low. Is this particular decision causing any negative side effects for you?
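To make the mechanism discussed above concrete, here is a minimal, self-contained sketch. It is not the pIMZ implementation: the function names `quantile` and `median_profile_with_pseudo` are hypothetical, and it assumes that the median is taken per channel over non-zero intensities and that the 5% quantile is computed over the resulting median profile. Only the fallback to 0.001 and the final addition mirror the snippet quoted in this issue.

```python
from statistics import median


def quantile(values, q):
    """Linear-interpolation quantile of a non-empty sequence (hypothetical helper)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def median_profile_with_pseudo(spectra, fallback=0.001):
    """Return (shifted median profile, pseudo-intensity).

    spectra: list of equal-length intensity lists, one per pixel.
    Assumption: zeros are excluded before taking the per-channel median.
    """
    n_channels = len(spectra[0])
    profile = []
    for ch in range(n_channels):
        nonzero = [s[ch] for s in spectra if s[ch] > 0]
        # A channel with no signal at all keeps a median of 0.
        profile.append(median(nonzero) if nonzero else 0.0)

    # Assumption: the pseudo-intensity is the 5% quantile of the profile.
    started_log = quantile(profile, 0.05)
    if started_log == 0:
        # Same fallback as in the snippet quoted above.
        started_log = fallback

    return [v + started_log for v in profile], started_log


if __name__ == "__main__":
    pixels = [[0, 0, 5], [0, 0, 7]]
    shifted, pseudo = median_profile_with_pseudo(pixels)
    print("pseudo-intensity:", pseudo)
    print("shifted profile:", shifted)
```

Because every entry of the shifted profile is strictly positive, dividing any spectrum by it is always defined, even for channels that are empty in every pixel.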

ranabanik commented 3 years ago

I was trying to make sense of the plot_fcs function, which uses this function. Since the 0 values were excluded from the spectrum, is there any chance of getting a 0 value in any quantile? Could you specify what you meant by 'negative side effects'?

mjoppich commented 3 years ago

Introducing pseudo-intensities or pseudo-counts always carries the risk of negative side effects, such as a lower dynamic range in any quotient, as you pointed out. Thus, if one wants to introduce such a pseudo-intensity/count, one should keep it as low as possible so that it does not distort the results.
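The loss of dynamic range can be illustrated with a small sketch. The function `ratio_with_pseudo` and the intensity values are hypothetical and only serve as an example; they are not taken from pIMZ.

```python
import math


def ratio_with_pseudo(a, b, pseudo):
    """Quotient of two intensities after adding the same pseudo-intensity to both."""
    return (a + pseudo) / (b + pseudo)


if __name__ == "__main__":
    a, b = 100.0, 1.0  # hypothetical intensities, true ratio 100
    for pseudo in (0.001, 0.1, 1.0, 10.0):
        r = ratio_with_pseudo(a, b, pseudo)
        # The larger the pseudo-intensity, the closer the ratio moves to 1.
        print(f"pseudo={pseudo:<6} ratio={r:8.3f} log2={math.log2(r):6.3f}")
```

The larger the pseudo-intensity is relative to the smaller of the two intensities, the more the quotient is pulled towards 1, which is why a value as small as possible is preferable.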

In this particular use case, the pseudo-intensity does not influence the chances of receiving 0 values (because it can still reach 0 if the specific field in rspec is 0 :) ). Here, it just leads to a minimal skew (due to the pseudo-intensity) but, most importantly, avoids division by zero.