AdvancedPhotonSource / xdesign

Tools for designing x-ray phantoms and experiments.
https://xdesign.readthedocs.io

acquisition.Probe noise model #39

Open malte-storm opened 7 years ago

malte-storm commented 7 years ago

The noise parameter in the sinograms only ever increases the absorption, never decreases it; see the attached image: xdesign_noisecomparison_line

This introduces a statistical bias. I think the reason for this behaviour is that the noise is added to the attenuation data rather than to the measured data:

def measure(self, phantom, noise=False):
    [...]
    if noise > 0:
        # np.random.poisson(1) is always >= 0, so this term can only increase the absorption
        newdata += newdata * noise * np.random.poisson(1)
    self.record()
    return newdata

The way a "normal" tomography measurement works is by dividing (dark-current normalized) camera images: projection divided by flat-field. These images both have Poisson-distributed counting errors. The variance variance of the Poisson distribution equals lambda: VAR = sigma^2 =lambda For a noise level equivalent to one standard deviation (i.e. sigma = noise ), this corresponds to a count rate of counts = noise^(-2) I would suggest the noise application to be done in the following manner:

    if noise > 0:
        counts = noise**(-2)
        # Poisson-distributed counts for the projection and for the flat field
        projection = np.random.poisson(counts * np.exp(-newdata))
        flat_field = np.random.poisson(counts)
        newdata = -np.log(projection / flat_field)

This approach creates a nice, uniform histogram of the noise around the expected value. However, it would require that newdata is scaled reasonably, i.e. ideally newdata would be in [0, 3], and it would support using "real world units". It is obviously also more computationally intensive than the old approach, but considering the time required for computing all phantom intersections, I expect this to be negligible.

This is an example of what the histograms look like (500,000 random number pairs) with a transmission value of 0.25 (exp(-newdata) = 0.25): xdesign_statistics
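
For reference, here is a small self-contained sketch (not xdesign code; the variable names and the noise level of 0.05 are chosen purely for illustration) that reproduces the comparison between the two models numerically:

    import numpy as np

    n_samples = 500000
    newdata = -np.log(0.25)   # projection value for a transmission of T = 0.25
    noise = 0.05              # 5 % counting noise, i.e. counts = noise**(-2) = 400

    # Old model: the Poisson draw is non-negative, so the term only adds absorption.
    old = newdata + newdata * noise * np.random.poisson(1, size=n_samples)

    # Proposed model: Poisson counting statistics on the projection and the flat field.
    counts = noise ** -2
    projection = np.random.poisson(counts * np.exp(-newdata), size=n_samples)
    flat_field = np.random.poisson(counts, size=n_samples)
    new = -np.log(projection / flat_field)

    print(old.mean() - newdata)   # ~ +0.07, clearly biased upwards
    print(new.mean() - newdata)   # much closer to zero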

carterbox commented 7 years ago

@ogurreck I'm more of a computer vision and computational geometry person, so thanks for your patience while I try to understand the details of your improved noise model.

However, this approach would require that newdata is scaled reasonably, i.e. ideally newdata would be in [0, 3], and it would support using "real world units".

Your proposed noise model makes sense to me as far as providing the correct noise distribution by converting noise into the mean photon density for the Probe. However, I'm unclear about why newdata should be scaled in the range [0, 3] and the implications this improved model has for our planned enhancement of real world units.

...considering the time required for computing all phantom intersections, I expect [computation for the more complex noise model] to be negligible.

I agree. Intersection calculations will take a majority of the computation time.

For a noise level equivalent to one standard deviation (i.e. sigma = noise), this corresponds to a count rate of counts = noise^(-2)

In your improved model, would noise be documented as "the standard deviation of the detector count"?

malte-storm commented 7 years ago

I was suggesting newdata to be in the range [0, 3] because this range is statistically sound. In an experimental tomography setup (at a synchrotron with a monochromatic beam), one should aim for a sample transmission of T = exp(-2.2) ≈ 0.11. The information about the sample absorption is mapped to the range [T, 1] with T < 1. If T becomes smaller than 0.11, the noise in the data dominates. If T is much larger, the information about the sample is encoded in a smaller interval. As the relative error sigma(T) is constant (given by the number of counts), a larger T value corresponds to a larger relative error in the absorption information about the sample. These calculations originate from the PhD thesis of T. Donath (full-text link: https://www.hzg.de/imperia/md/content/hzg/zentrale_einrichtungen/bibliothek/berichte/gkss_berichte_2007/gkss_2007_17.pdf, pp. 31-32).
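
To make this concrete, here is a rough numerical check (illustrative count rate, simple Poisson error propagation through -log(T) for the projection and the flat field only, ignoring all other error sources): the relative error of the measured absorption is smallest near T = exp(-2.2) ≈ 0.11 and grows quickly as T approaches 1.

    import numpy as np

    counts = 10000   # incident (flat-field) counts, illustrative value
    for T in (0.05, 0.11, 0.25, 0.5, 0.9):
        # Poisson errors of projection and flat field, propagated to -log(T)
        sigma_p = np.sqrt(1.0 / (counts * T) + 1.0 / counts)
        # relative error of the absorption value; minimum is near T = 0.11
        print(T, sigma_p / -np.log(T))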

In the xdesign nomenclature, newdata = -log(T) (natural logarithm), i.e. the best statistics are found for np.amax(newdata) == 2.2. This is why I was suggesting a proper scaling of newdata: without scaling, the noise would dominate in this approach and the results would not really be applicable. The following histogram illustrates the point. The variable p is used in place of newdata, i.e. the measured intensity is T = exp(-p). The relative differences within the parameter sets [0.005, 0.01, 0.015] and [0.5, 1.0, 1.5] are identical, but the absolute scales differ, and the effect on the noise distribution is obvious: xdesign_scaling

In your improved model, would noise be documented as "the standard deviation of the detector count"?

The standard deviation of the detector count would be 1/noise, which is not the most intuitive parameter to use. An alternative could be introducing a "flat-field detector counts" keyword, which would translate directly into counts. That approach is quite intuitive, as people can easily grasp the difference between 1,000 and 10,000 detector counts.
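
For what it is worth, the correspondence between the two parameterisations is simple; a tiny sketch with illustrative numbers:

    # noise (relative standard deviation) versus flat-field detector counts
    for flat_counts in (1000, 10000, 100000):
        print(flat_counts, flat_counts ** -0.5)   # -> 0.0316, 0.01, 0.0032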

carterbox commented 7 years ago

It seems like your noise model depends on having all of the data available (otherwise you can't know the maximum value in order to choose an appropriate scaling factor). Thus it would probably be better to add noise as a post-processing method instead of in Probe.measure().

Otherwise, can you think of a way such that the noise can be computed without knowing in advance what the maximum value of the sinogram (or other data scheme) will be?

dgursoy commented 7 years ago

Ideally one can use Poisson noise and add it on the fly, but then another scaling factor needs to be provided for the source (e.g. counts, detector, readout, etc.). Maybe the best option is to add flux as a property of Probe.

malte-storm commented 7 years ago

It seems like your noise model depends on having all of the data available (otherwise you can't know the maximum value in order to choose an appropriate scaling factor). Thus it would probably be better to add noise as a post-processing method instead of in Probe.measure().

True on both counts. This is why I suggested using real world units; that would allow a proper estimate of how noisy reconstructed data would be and what type of features would still be discernible. I also don't see a reason not to add noise as post-processing, because it is calculated for each pixel in the sinogram anyway: scale the sinogram and add the noise after the sinogram calculation is completed.
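
A sketch of that post-processing recipe (a hypothetical helper, not part of xdesign; the default count rate and target maximum are illustrative): scale the finished sinogram, then apply Poisson counting noise to the transmitted and flat-field counts.

    import numpy as np

    def add_counting_noise(sinogram, flat_counts=10000, target_max=2.2):
        """Scale a complete sinogram and apply Poisson counting noise to it."""
        scaled = sinogram * (target_max / np.amax(sinogram))
        projection = np.random.poisson(flat_counts * np.exp(-scaled))
        flat_field = np.random.poisson(flat_counts, size=sinogram.shape)
        return -np.log(projection / flat_field)

Because this runs on the whole sinogram at once, it can pick the scaling factor from np.amax as discussed above.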

Otherwise, can you think of a way such that the noise can be computed without knowing in advance what the maximum value of the sinogram (or other data scheme) will be?

As we are dealing with large numbers of counts on the detector (N >> 1), you could approximate the Poisson noise with a normal distribution. That should give about the same effect.

Ideally one can use Poisson noise, and add it on-the-fly, but then another scaling factor should be provided for the source (e.g. counts, detector, readout, etc.). Maybe best is to add flux as a property of probe.

As written above, I don't think one can reasonably add Poisson noise on the fly without knowing all the parameters (proper attenuation coefficients) because these scale the distribution function. For on-the-fly noise generation, I would suggest using a normal distribution, which behaves quite similarly for large count numbers.
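
As a quick sanity check of that approximation (illustrative numbers only): for large lambda, a Poisson(lambda) draw and a Normal(lambda, sqrt(lambda)) draw are practically indistinguishable.

    import numpy as np

    lam = 10000   # expected detector counts
    poisson = np.random.poisson(lam, 100000)
    normal = np.random.normal(lam, np.sqrt(lam), 100000)
    print(poisson.mean(), poisson.std())   # ~10000, ~100
    print(normal.mean(), normal.std())     # ~10000, ~100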

carterbox commented 7 years ago

I propose the following patch for now:

def measure(self, phantom, sigma=0):
    """Measure the phantom with optional Gaussian noise.

    sigma is the standard deviation of the normally distributed noise.
    """
    [...]
    if sigma > 0:
        newdata += newdata * np.random.normal(scale=sigma)
    self.record()
    return newdata
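
For illustration, a quick check with made-up numbers that this removes the bias from the opening post: the multiplicative Gaussian term is zero-mean, so the measured value is no longer shifted systematically.

    import numpy as np

    newdata, sigma = 1.0, 0.05
    samples = newdata + newdata * np.random.normal(scale=sigma, size=500000)
    print(samples.mean() - newdata)   # close to zero, no systematic shift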

And sometime in the future, a more accurate noise model and Detector can be implemented, OR maybe a submodule of common distortions can be implemented.

I think we want to refrain from modeling anything detector dependent such as count scaling in the Probe class because maybe not everyone wants a conventional detector.

malte-storm commented 7 years ago

Your proposed patch should fix the issue quite well.

And sometime in the future, a more accurate noise model and Detector can be implemented, OR maybe a submodule of common distortions can be implemented.

If you would like any help or input on that, please get back to me. I'd be happy to help.