acts-project / acts

Experiment-independent toolkit for (charged) particle track reconstruction in (high energy) physics experiments implemented in modern C++
https://acts.readthedocs.io
Mozilla Public License 2.0

Digitization is not thread-safe / reproducible #869

Closed by paulgessinger 3 years ago

paulgessinger commented 3 years ago

While working on #845, I noticed that the seeding algorithm produces a varying number of seeds from a fixed number of space points when run with multiple threads.

Single threaded, I see

14:00:45    SeedingAlgor   DEBUG     Created 30 track seeds from 40 space points
14:00:45    SeedingAlgor   DEBUG     Created 34 track seeds from 41 space points

every single time.

With just 2 threads, I already get:

14:01:22    SeedingAlgor   DEBUG     Created 30 track seeds from 40 space points
14:01:22    SeedingAlgor   DEBUG     Created 34 track seeds from 41 space points
---
14:01:23    SeedingAlgor   DEBUG     Created 32 track seeds from 40 space points
14:01:23    SeedingAlgor   DEBUG     Created 32 track seeds from 41 space points
---
14:01:25    SeedingAlgor   DEBUG     Created 34 track seeds from 41 space points
14:01:25    SeedingAlgor   DEBUG     Created 34 track seeds from 40 space points
---
14:01:27    SeedingAlgor   DEBUG     Created 30 track seeds from 40 space points
14:01:27    SeedingAlgor   DEBUG     Created 30 track seeds from 41 space points
---
14:01:28    SeedingAlgor   DEBUG     Created 33 track seeds from 40 space points
14:01:28    SeedingAlgor   DEBUG     Created 33 track seeds from 41 space points

Are we aware of this, and if so is this expected / understood?
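
For comparing runs like the ones above, a small script can help, because with multiple threads the two events may be logged in either order. The sketch below is only a hypothetical helper (the names `seed_counts` and `runs_agree` are mine, not part of ACTS). It assumes the log line format shown above and compares runs as multisets of (space points, seeds) pairs, so the ordering of the lines is ignored.

```python
import re
from collections import Counter

# Matches lines such as:
# "14:00:45    SeedingAlgor   DEBUG     Created 30 track seeds from 40 space points"
SEED_LINE = re.compile(r"Created (\d+) track seeds from (\d+) space points")


def seed_counts(log_text):
    """Return a multiset of (space_points, seeds) pairs found in the log."""
    return Counter(
        (int(m.group(2)), int(m.group(1))) for m in SEED_LINE.finditer(log_text)
    )


def runs_agree(log_a, log_b):
    """True if both logs report the same seed counts, regardless of line order."""
    return seed_counts(log_a) == seed_counts(log_b)
```

Two events with the same number of space points cannot be told apart this way, so this is a coarse check only.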

paulgessinger commented 3 years ago

@asalzburger @robertlangenberg @XiaocongAi

paulgessinger commented 3 years ago

Steps to reproduce on main (b14a419):

cd to your build directory (I'm assuming it's <repo_root>/build).

EDIT: The particle gun might actually not be needed.

bin/ActsExampleParticleGun \
    --events=100 \
    --output-dir=data/gen/four_muons \
    --output-csv \
    --gen-phi-degree=0:90 \
    --gen-eta=-2:2 \
    --gen-mom-gev=1:5 \
    --gen-pdg=13 \
    --gen-randomize-charge \
    --gen-nparticles=4
bin/ActsExampleFatrasGeneric \
    --output-dir=data/sim_generic/single_muon \
    --output-csv \
    --events=100 \
    --bf-constant-tesla=0:0:2
bin/ActsExampleSeedingGeneric \
     --input-dir=data/sim_generic/single_muon \
     --output-dir=output_generic_single_muon \
     --bf-constant-tesla=0:0:2 \
     --digi-smear \
     --digi-config-file ../Examples/Algorithms/Digitization/share/default-smearing-config-generic.json \
     --geo-selection-config-file ../Examples/Algorithms/TrackFinding/share/geoSelection-genericDetector.json \
     -n2 \
     -j2

To see it, you'll likely also have to pass -l0 for verbose logging. If I pipe the above example through |grep Created, I get this:

[pagessin@acts-dev-rd-et /s/p/a/build] (main *)$ bin/ActsExampleSeedingGeneric      --input-dir=data/sim_generic/single_muon/      --output-dir=output_generic_ttbar_pu200      --bf-constant-tesla=0:0:2      --digi-smear      --digi-config-file ../Examples/Algorithms/Digitization/share/default-smearing-config-generic.json      --geo-selection-config-file ../Examples/Algorithms/TrackFinding/share/geoSelection-genericDetector.json -n 2 -j2|grep Created
11:40:42    SeedingAlgor   DEBUG     Created 5 track seeds from 5 space points
11:40:42    SeedingAlgor   DEBUG     Created 2 track seeds from 4 space points
[pagessin@acts-dev-rd-et /s/p/a/build] (main *)$ bin/ActsExampleSeedingGeneric      --input-dir=data/sim_generic/single_muon/      --output-dir=output_generic_ttbar_pu200      --bf-constant-tesla=0:0:2      --digi-smear      --digi-config-file ../Examples/Algorithms/Digitization/share/default-smearing-config-generic.json      --geo-selection-config-file ../Examples/Algorithms/TrackFinding/share/geoSelection-genericDetector.json -n 2 -j2|grep Created
11:40:44    SeedingAlgor   DEBUG     Created 4 track seeds from 5 space points
11:40:44    SeedingAlgor   DEBUG     Created 2 track seeds from 4 space points
[pagessin@acts-dev-rd-et /s/p/a/build] (main *)$ bin/ActsExampleSeedingGeneric      --input-dir=data/sim_generic/single_muon/      --output-dir=output_generic_ttbar_pu200      --bf-constant-tesla=0:0:2      --digi-smear      --digi-config-file ../Examples/Algorithms/Digitization/share/default-smearing-config-generic.json      --geo-selection-config-file ../Examples/Algorithms/TrackFinding/share/geoSelection-genericDetector.json -n 2 -j2|grep Created
11:40:48    SeedingAlgor   DEBUG     Created 4 track seeds from 5 space points
11:40:48    SeedingAlgor   DEBUG     Created 2 track seeds from 4 space points
[pagessin@acts-dev-rd-et /s/p/a/build] (main *)$ bin/ActsExampleSeedingGeneric      --input-dir=data/sim_generic/single_muon/      --output-dir=output_generic_ttbar_pu200      --bf-constant-tesla=0:0:2      --digi-smear      --digi-config-file ../Examples/Algorithms/Digitization/share/default-smearing-config-generic.json      --geo-selection-config-file ../Examples/Algorithms/TrackFinding/share/geoSelection-genericDetector.json -n 2 -j2|grep Created
11:40:51    SeedingAlgor   DEBUG     Created 2 track seeds from 4 space points
11:40:51    SeedingAlgor   DEBUG     Created 5 track seeds from 5 space points

i.e. sometimes it creates 4 track seeds from 5 SPs, and sometimes 5 track seeds. With -j1, I have not once seen it create 5 seeds.
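
Re-running by hand gets tedious, so the repetition can be scripted. This is a rough sketch with hypothetical names (`distinct_outputs` is not an ACTS tool): it runs an arbitrary command several times, keeps only the lines containing a keyword, drops the leading timestamp, sorts the lines so that thread ordering does not matter, and tallies how many distinct results were seen.

```python
import subprocess
from collections import Counter


def distinct_outputs(cmd, runs=5, keep="Created"):
    """Run cmd `runs` times and count each distinct set of matching lines."""
    seen = Counter()
    for _ in range(runs):
        out = subprocess.run(
            cmd, capture_output=True, text=True, check=True
        ).stdout
        lines = []
        for line in out.splitlines():
            if keep in line:
                # Drop the first whitespace-separated token (the timestamp),
                # which changes from run to run.
                lines.append(line.strip().partition(" ")[2].strip())
        # Sort so that the order in which threads log does not matter.
        seen[tuple(sorted(lines))] += 1
    return seen
```

Here `cmd` would be the full bin/ActsExampleSeedingGeneric command from above, passed as a list of arguments. More than one key in the returned counter means the runs did not all agree.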

paulgessinger commented 3 years ago

After some digging, it seems like it's actually the digitization that is doing something weird here. The space points reaching the SeedingAlgorithm are sometimes slightly different, and this seems to result in the different number of seeds.

e.g.:

15:56:58    SpacePointMa   DEBUG     Created 4 space points
15:56:58    SpacePointMa   DEBUG     Created 5 space points
15:56:58    SeedingAlgor   VERBOSE   SP: -30.2041,-8.87064,3.06696
15:56:58    SeedingAlgor   VERBOSE   SP: -29.8558,10.4661,14.7205
15:56:58    SeedingAlgor   VERBOSE   SP: -68.8145,-20.4961,7.12947
15:56:58    SeedingAlgor   VERBOSE   SP: -30.8753,10.8855,15.1736
15:56:58    SeedingAlgor   VERBOSE   SP: -110.913,-33.5827,11.5894
15:56:58    SeedingAlgor   VERBOSE   SP: -68.1995,23.7873,33.6337
15:56:58    SeedingAlgor   VERBOSE   SP: -163.755,-50.4888,17.012
15:56:58    SeedingAlgor   VERBOSE   SP: -109.898,38.172,54.0709
15:56:58    SeedingAlgor   VERBOSE   SP: -162.206,56.0011,79.8112
15:56:58    SeedingAlgor   DEBUG     Created 2 track seeds from 4 space points
15:56:58    SeedingAlgor   DEBUG     Created 3 track seeds from 5 space points
---
15:57:00    SpacePointMa   DEBUG     Created 4 space points
15:57:00    SpacePointMa   DEBUG     Created 5 space points
15:57:00    SeedingAlgor   VERBOSE   SP: -30.2041,-8.87064,3.04369
15:57:00    SeedingAlgor   VERBOSE   SP: -29.8558,10.4661,14.7205
15:57:00    SeedingAlgor   VERBOSE   SP: -68.8266,-20.47,7.09174
15:57:00    SeedingAlgor   VERBOSE   SP: -30.8772,10.8512,15.2692
15:57:00    SeedingAlgor   VERBOSE   SP: -110.913,-33.5827,11.4758
15:57:00    SeedingAlgor   VERBOSE   SP: -68.198,23.7968,33.6337
15:57:00    SeedingAlgor   VERBOSE   SP: -163.672,-50.6722,17.0786
15:57:00    SeedingAlgor   VERBOSE   SP: -109.881,38.2764,53.9727
15:57:00    SeedingAlgor   VERBOSE   SP: -162.226,55.9094,79.8112
15:57:00    SeedingAlgor   DEBUG     Created 3 track seeds from 5 space points
15:57:00    SeedingAlgor   DEBUG     Created 2 track seeds from 4 space points
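
To see which space points differ between two such dumps, the SP lines can be compared mechanically. The sketch below uses hypothetical helper names and assumes the "SP: x,y,z" format printed above with plain decimal numbers (no scientific notation). It sorts the points in each run before pairing them, which only works while the differences are small compared to the spacing between points.

```python
import re

# Matches e.g. "SP: -30.2041,-8.87064,3.06696"
SP_LINE = re.compile(r"SP:\s*(-?[\d.]+),(-?[\d.]+),(-?[\d.]+)")


def space_points(log_text):
    """Return all space points in the log as a sorted list of (x, y, z)."""
    return sorted(
        tuple(float(v) for v in m.groups()) for m in SP_LINE.finditer(log_text)
    )


def differing_points(log_a, log_b, tol=1e-6):
    """Pair up sorted points from both runs and return the pairs that differ."""
    a, b = space_points(log_a), space_points(log_b)
    if len(a) != len(b):
        raise ValueError("runs contain a different number of space points")
    return [
        (p, q)
        for p, q in zip(a, b)
        if max(abs(x - y) for x, y in zip(p, q)) > tol
    ]
```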

This is with a specified random number seed, so I would think the digitization and hit smearing should be deterministic, shouldn't it?
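
To spell out what I mean by deterministic: if each event draws its smearing from a generator seeded only by the configured seed and the event number, the result cannot depend on which thread processes the event or in which order. The following is a toy sketch of that idea, not the ACTS digitization code. `smear_event`, `run_all`, `base_seed` and the seed-mixing formula are all assumptions made for illustration.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def smear_event(event_id, hits, base_seed=42, sigma=0.05):
    """Smear 1D hit positions with a generator private to this event."""
    # Assumption: the generator state depends only on (base_seed, event_id),
    # never on the thread or on what other events have drawn before.
    rng = random.Random(base_seed * 1_000_003 + event_id)
    return [h + rng.gauss(0.0, sigma) for h in hits]


def run_all(events, workers):
    """Process all events on a thread pool and return results keyed by event."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {
            eid: pool.submit(smear_event, eid, hits)
            for eid, hits in events.items()
        }
        return {eid: f.result() for eid, f in futures.items()}
```

With this scheme, `run_all(events, 1)` and `run_all(events, 2)` return identical values. A single generator shared by all threads would not give that guarantee, since the order of draws would then depend on scheduling.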

@asalzburger do you have an idea maybe?

robertlangenberg commented 3 years ago

Just to clarify: the space points were smeared differently when running multi-threaded; with a single thread they are stable, but only on ONE machine. Paul saw different smearing on the Threadripper/CentOS machine than I did on macOS, even in single-threaded mode.