asalzburger opened this issue 3 years ago
After creating the new datasets, I tested the Hough Transform and again, I ended up with high efficiency in all the simulations. So I decided to plot "count vs number-of-hits-for-1-event" for electrons, muons and pions respectively. This is what I got:
The distributions look very similar. Only for pdg = 13 are slightly more hits concentrated in the center, but this is barely noticeable. These are the commands I used to produce the datasets:
@REM --------------------------------------------------- pdg-11 ---------------------------------------------------
@REM First Dataset: Ideal dataset
../acts_build/bin/ActsExampleParticleGun -n 100 --gen-pdg 11 --gen-nparticles 25 --gen-mom-gev 0.5:10. --gen-mom-transverse true --gen-eta -0.5:0.5 --output-csv
../acts_build/bin/ActsExampleFatrasDD4hep --gen-pdg 11 --dd4hep-input=../acts/thirdparty/OpenDataDetector/xml/OpenDataDetector.xml --output-csv --bf-constant-tesla 0:0:2 --input-dir="./"
mv ./* ../../../Desktop/CERN/tra-tra/data/pdg11/pdg11-n25-0.5to10GeV-0.5eta/
@REM Second Dataset: With Material Effects (and constant B)
../acts_build/bin/ActsExampleParticleGun -n 100 --gen-pdg 11 --gen-nparticles 25 --gen-mom-gev 0.5:10. --gen-mom-transverse true --gen-eta -0.5:0.5 --output-csv
../acts_build/bin/ActsExampleFatrasDD4hep --gen-pdg 11 --dd4hep-input=../acts/thirdparty/OpenDataDetector/xml/OpenDataDetector.xml --output-csv --mat-input-type=file --mat-input-file=../../../Desktop/CERN/other-repos/OpenDataDetector/data/odd-material-map.root --bf-constant-tesla 0:0:2 --input-dir="./"
mv ./* ../../../Desktop/CERN/tra-tra/data/pdg11/pdg11-n25-0.5to10GeV-0.5eta-with-material-effects/
@REM Third Dataset: With Non-Homogeneous B (without Material Effects)
../acts_build/bin/ActsExampleParticleGun -n 100 --gen-pdg 11 --gen-nparticles 25 --gen-mom-gev 0.5:10. --gen-mom-transverse true --gen-eta -0.5:0.5 --output-csv
../acts_build/bin/ActsExampleFatrasDD4hep --gen-pdg 11 --dd4hep-input=../acts/thirdparty/OpenDataDetector/xml/OpenDataDetector.xml --output-csv --bf-map-file=../../../Desktop/CERN/other-repos/OpenDataDetector/data/odd-bfield.root --input-dir="./"
mv ./* ../../../Desktop/CERN/tra-tra/data/pdg11/pdg11-n25-0.5to10GeV-0.5eta-non-homogenous-magnetic-field/
@REM Fourth Dataset: With Material Effects and Non-Homogeneous B
../acts_build/bin/ActsExampleParticleGun -n 100 --gen-pdg 11 --gen-nparticles 25 --gen-mom-gev 0.5:10. --gen-mom-transverse true --gen-eta -0.5:0.5 --output-csv
../acts_build/bin/ActsExampleFatrasDD4hep --gen-pdg 11 --dd4hep-input=../acts/thirdparty/OpenDataDetector/xml/OpenDataDetector.xml --output-csv --mat-input-type=file --mat-input-file=../../../Desktop/CERN/other-repos/OpenDataDetector/data/odd-material-map.root --bf-map-file=../../../Desktop/CERN/other-repos/OpenDataDetector/data/odd-bfield.root --input-dir="./"
mv ./* ../../../Desktop/CERN/tra-tra/data/pdg11/pdg11-n25-0.5to10GeV-0.5eta-with-material-effects-non-homogenous-magnetic-field/
@REM -----------------------------------------------------------------------------------------------------------------
If I'm correct, the behavior above is not expected. I will take a closer look again tomorrow.
Finished the baseline for duplicate removal. For this, I used a "deterministic" function that decides whether two reco tracks refer to the same particle. In pseudocode, it looks like this:
if tracks_are_close(track1, track2, thresholds) and common_hit_percentage(hits1, hits2) > threshold:
# tracks are duplicate, do stuff
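A minimal runnable sketch of this check, assuming the (hypothetical) helpers compare track parameters and shared hit ids; the actual implementation in the tra-tra repo may differ:

```python
# Sketch of the duplicate check. Parameter names and the choice of
# comparing per-parameter distances are assumptions for illustration.

def tracks_are_close(track1, track2, thresholds):
    """True if every listed track parameter differs by at most its threshold."""
    return all(abs(track1[k] - track2[k]) <= thresholds[k] for k in thresholds)

def common_hit_percentage(hits1, hits2):
    """Fraction of shared hit ids, relative to the shorter track."""
    shared = set(hits1) & set(hits2)
    return len(shared) / min(len(hits1), len(hits2))

def are_duplicates(track1, track2, hits1, hits2, thresholds, hit_threshold=0.5):
    return (tracks_are_close(track1, track2, thresholds)
            and common_hit_percentage(hits1, hits2) > hit_threshold)
```

For example, two tracks with nearly identical parameters that share 3 of their 4 hits would be flagged as duplicates, while tracks sharing no hits would not, regardless of parameter closeness.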
Using this, I developed two approaches:
To assess the performance of the algorithms, I tested them on the "ideal simulation" for pions (pdg = 211) with non-optimal hyperparameters. The normal algorithm gives:
The first duplicate-removal algorithm yields:
And the second:
We notice that the first one kept the efficiency high (losing only 0.08) and reduced the duplicate rate by 0.7; the fake rate also dropped considerably. The second algorithm reduced the duplicate rate almost to zero (0.06, to be more precise, a decrease of 0.82), but at the cost of a large efficiency loss (down to 0.6, a drop of 0.28). I think they are good as baselines.
Wrote a bash script that automates creating the desired datasets (for every simulation type). You can find it here once the merge has been done. You can also use it if you want; just make sure to adjust the paths so that they are correct for your machine.
Created the dataset using Pythia8. This is with npileup = 0 (similar results occur for npileup = 200). Removed the particles that have initial p_T smaller than 500 MeV. The xy view of the remaining particles is:
Running the algorithms on this dataset yields the following metrics (I tried many events; the results are similar):
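The p_T > 500 MeV selection above can be sketched as follows, assuming each particle record carries its initial momentum components px/py in GeV (the column names here are an assumption, not necessarily those of the actual CSV files):

```python
import math

def passes_pt_cut(px_gev, py_gev, pt_min_gev=0.5):
    """Keep particles whose initial transverse momentum exceeds the cut."""
    return math.hypot(px_gev, py_gev) > pt_min_gev

# Illustrative records only; real data comes from the Pythia8 output files.
particles = [
    {"particle_id": 1, "px": 0.3, "py": 0.1},  # pT ~ 0.32 GeV -> removed
    {"particle_id": 2, "px": 0.8, "py": 0.6},  # pT = 1.0 GeV -> kept
]
selected = [p for p in particles if passes_pt_cut(p["px"], p["py"])]
```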
The natural question arises: Which particles were not reconstructed? Let's take a look:
I will have to look up which particles those pdg values refer to, in order to get a better understanding of the situation.
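For the lookup, a small hand-rolled table of common PDG codes is enough (the PyPI `particle` package can resolve arbitrary ids, if available); the table below covers only a few frequent species:

```python
# Map a few common PDG Monte Carlo numbering codes to readable names.
PDG_NAMES = {
    11: "e-", -11: "e+",
    13: "mu-", -13: "mu+",
    22: "photon",
    211: "pi+", -211: "pi-",
    321: "K+", -321: "K-",
    2112: "neutron", 2212: "proton",
}

def pdg_name(pdg):
    """Readable name for a PDG id, falling back to the raw code."""
    return PDG_NAMES.get(pdg, f"unknown ({pdg})")
```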
List of datasets to be produced:
This needs a material map file odd-material-map.root. To activate it, one needs to specify the material input file in the Fatras simulation by adding the corresponding program options (--mat-input-type=file --mat-input-file=...).

This needs a magnetic field map odd-bfield.root. To use it, one needs to add the corresponding option to the Fatras configuration (--bf-map-file=...).

These two options can of course also be combined, which represents the worst case.
The files are part of a PR into the ODD detector (not merged yet):
https://github.com/acts-project/OpenDataDetector/pulls
Accessing them requires Git Large File Storage (git-lfs) support.

As we hadn't talked about particle type, we should do that as well:
Some actions:
If we want to see the effects of magnetic field / material on muons themselves: