fanurs / data-analysis-e15190-e14030

Data analysis for the experiment E15190-E14030 at the National Superconducting Cyclotron Laboratory (NSCL), currently known as the Facility for Rare Isotope Beams (FRIB). This experiment aims to constrain the nuclear equation of state with heavy ion collisions at intermediate energies.
https://fanurs.github.io/data-analysis-e15190-e14030
GNU General Public License v3.0

Pulse shape discrimination is bad for good quality data. #9

Closed fanurs closed 2 years ago

fanurs commented 2 years ago

The bug happens at https://github.com/Fanurs/data-analysis-e15190-e14030/tree/a3478629de353fc2378f04481afa92a5566d9656

Illustration: in the diagnostic plot below, the neutron fit is terrible despite very good data quality. At TOTAL_R > 1800 there are very few gammas, but the algorithm still enforces a "two-peak fit" on the y-projection of each vertical slice. As a result, the neutron peaks are misidentified as gamma peaks, and the outlier peaks at around (2000, -50) are misidentified as neutron peaks.

[image: diagnostic plot]

fanurs commented 2 years ago

Starting from https://github.com/Fanurs/data-analysis-e15190-e14030/commit/15be9c2761e223c40c9698a0ad536f44c8c52ac5, the PSD script has been made more flexible by allowing users to tune the hyperparameters from the CLI. Choosing a suitable set of hyperparameters should resolve most bad fits on good data; bad data are just bad, and there's nothing we can do about it.

The plot below shows how the new algorithm performs on the same subset of data (with even lower statistics!):

[image: diagnostic plot]

Here is how you can reproduce the good fit:

cd $PROJECT_DIR/e15190/neutron_wall/
python pulse_shape_discrimination.py B 4383-4399 -b 12 --ft-breakpoint2 2000
<view-command> $PROJECT_DIR/database/neutron_wall/pulse_shape_discrimination/gallery/run-4383-4399-h2bb70/NWB-bar12.png

The key to this particular example is the option --ft-breakpoint2 2000, which replaces the default value of 2500 with 2000. With this setting, the algorithm looks only for a neutron peak at TOTAL > 2000, while the gamma fast-total relation is simply extrapolated linearly from the gamma data at TOTAL < 2000.
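
To make the extrapolation idea concrete, here is a minimal sketch, not the actual code in pulse_shape_discrimination.py. It assumes the gamma relation is a straight line fitted by least squares; the names fit_line, gamma_relation, and breakpoint2, as well as the data points, are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def gamma_relation(total, gamma_points, breakpoint2=2000.0):
    """Evaluate a gamma line fitted only to points with TOTAL < breakpoint2.

    Points at or above breakpoint2 are ignored, so the value there is a
    pure linear extrapolation (an assumption of this sketch).
    """
    below = [(x, y) for x, y in gamma_points if x < breakpoint2]
    slope, intercept = fit_line([x for x, _ in below], [y for _, y in below])
    return slope * total + intercept


# Hypothetical gamma peak positions as (TOTAL, FAST) pairs. The last point
# lies above the breakpoint and mimics a misidentified peak; it is ignored.
gamma_points = [(500.0, 400.0), (1000.0, 800.0), (1500.0, 1200.0), (2200.0, 900.0)]
extrapolated = gamma_relation(3000.0, gamma_points)
```

Lowering the breakpoint simply excludes more of the sparse high-TOTAL region from the gamma fit, which is why it helps when there are very few gammas there.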

Currently some unimportant warnings show up. You can suppress them by running python -W ignore pulse_shape_discrimination.py ... instead.
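
If you call the code from your own Python script rather than the command line, the standard-library warnings module gives the same effect as -W ignore. This is a generic sketch that does not touch the PSD script itself; the warning message is a placeholder.

```python
import warnings

# Inside this block every warning is ignored, like `python -W ignore`.
# record=True only lets us check that nothing got through.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("ignore")
    warnings.warn("some unimportant warning")  # placeholder warning

n_caught = len(caught)
```

The previous warning filters are restored when the with block exits.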

All available options can be found using the -h flag:

python pulse_shape_discrimination.py -h

A few more remarks:

  1. All the hyperparameters are stored in the corresponding JSON file for future reference.
  2. The floating-point numbers in the top-left corner of the plot attempt to quantify the goodness of the fit and of the data. Both scores are also stored in the JSON files.
  3. The black vertical dashed lines define the three regions of the peak fits. First, for TOTAL in the 100-1500 range, the algorithm tries to find both the neutron and gamma peaks. Second, in 1500-2000, the algorithm still attempts to find both peaks, but with a wider convolution (due to lower statistics). Third, in 2000-4000, there are very few gammas, so only one peak is found, and it is always identified as neutron.
  4. The pink vertical dashed line indicates where the neutron fast-total relation transitions from a quadratic function to a linear function, while ensuring continuity and smoothness. This transition point can be changed with the option --x-switch-neutron. The default is 1300.0. The fit is not very sensitive to this hyperparameter.
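
Remarks 3 and 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the repository: the function names, the coefficients a, b, c, and the handling of values exactly at a boundary are hypothetical. The smooth transition is obtained by continuing the quadratic with its tangent line at the switch point, which matches both the value and the slope.

```python
def peaks_expected(total, breakpoint2=2000.0, total_min=100.0, total_max=4000.0):
    """Number of peaks searched for in a vertical slice at this TOTAL.

    Below breakpoint2 both gamma and neutron peaks are fitted (the
    convolution width differs between the first two regions, which is not
    modeled here); above it only the neutron peak is fitted.
    """
    if not (total_min <= total <= total_max):
        return 0
    return 2 if total < breakpoint2 else 1


def neutron_relation(total, a, b, c, x_switch=1300.0):
    """Quadratic below x_switch, tangent line of that quadratic above it."""
    if total <= x_switch:
        return a * total**2 + b * total + c
    value_at_switch = a * x_switch**2 + b * x_switch + c
    slope_at_switch = 2.0 * a * x_switch + b  # derivative of the quadratic
    return value_at_switch + slope_at_switch * (total - x_switch)


# Hypothetical coefficients, chosen only to exercise the function.
a, b, c = -1.0e-4, 0.9, 10.0
at_switch = neutron_relation(1300.0, a, b, c)
beyond_switch = neutron_relation(1400.0, a, b, c)
```

Because the linear part reuses the quadratic's value and derivative at x_switch, no extra fit parameters are introduced by the transition.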