Denis also requested we add outlier rejection to postprocessing. This could be incorporated into an AverageData operator, or there could be two separate operators for averaging with/without outlier rejection.
Determining outliers
First the trimmed mean and trimmed standard deviation will be calculated at each energy point (see here).
For each spectrum, the following will then be calculated:
(1 / number of energy points) * sum[((trimmed_mean - spectrum) / trimmed_stddev)**2]
This value is the mean squared deviation of the spectrum from the trimmed mean, measured in units of the trimmed standard deviation. It can be compared to a threshold to determine whether a given spectrum is an outlier in the group.
A few notes from Denis:
Different amounts of data can be trimmed during calculation. Typically trimming the top and bottom 20% works well and can be used by default.
A threshold value of ~10-25 typically works well for outlier determination. By default we can use 10 as a conservative threshold.
This method works best when the number of spectra is ≥10. A warning should be shown if it is run on a set of fewer than 10 spectra.
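As a starting point for discussion, here is a rough sketch of how the calculation could look with NumPy. The function name `find_outliers` and its parameters are hypothetical, not an existing API. It assumes all spectra share one energy grid and are stacked into a 2D array, that "trimmed" means sorting the values at each energy point and dropping the given fraction from each end, and that the standard deviation is the population form (ddof=0). The linked resource may define these differently.

```python
import warnings

import numpy as np


def find_outliers(spectra, trim_fraction=0.2, threshold=10.0, min_spectra=10):
    """Score each spectrum against the trimmed mean and flag outliers.

    Hypothetical helper. `spectra` has shape (n_spectra, n_energy_points).
    Returns (scores, is_outlier), both of length n_spectra.
    """
    spectra = np.asarray(spectra, dtype=float)
    if spectra.ndim != 2:
        raise ValueError("spectra must have shape (n_spectra, n_energy_points)")
    if not 0.0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")

    n_spectra = spectra.shape[0]
    if n_spectra < min_spectra:
        warnings.warn(
            f"Outlier rejection works best with at least {min_spectra} "
            f"spectra, got {n_spectra}."
        )

    # Sort the values at each energy point (down the spectrum axis) and
    # drop the lowest and highest trim_fraction of them.
    ordered = np.sort(spectra, axis=0)
    n_cut = int(trim_fraction * n_spectra)
    kept = ordered[n_cut:n_spectra - n_cut]

    trimmed_mean = kept.mean(axis=0)
    # Assumption: population standard deviation of the kept values.
    # A zero here (identical kept values) would give inf/nan scores.
    trimmed_stddev = kept.std(axis=0)

    # Mean over energy points of the squared deviation from the trimmed
    # mean, in units of the trimmed standard deviation.
    scores = np.mean(((trimmed_mean - spectra) / trimmed_stddev) ** 2, axis=1)
    return scores, scores > threshold
```

Calling `find_outliers(spectra)` uses the defaults suggested above (20% trimming, threshold of 10). Returning the scores alongside the flags would let the operator report why a spectrum was rejected, whether this ends up inside AverageData or in a separate operator.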
Not entirely sure I understand what the trimmed_mean and trimmed_stddev are, but I'll follow your lead on this. Thanks for documenting and linking to some resources, I'll take a look. Keep me updated!