impy-project / chromo

Hadronic Interaction Model interface in PYthon

Replace RMMARD with NumPy RNG #115

Closed HDembinski closed 1 year ago

HDembinski commented 1 year ago

Closes #102 Closes #104

Any NumPy PRNG can be used (not only PCG64); currently it is NumPy's default one, and there should be no need to use any other. Seeds are handled completely in Python by the base class MCRun; Fortran codes should not set seeds. The Fortran functions that perform any kind of PRNG initialization have been replaced with no-op stubs.
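To illustrate what "any NumPy PRNG" and "seeds handled in Python" could look like, here is a minimal sketch that uses only NumPy's public API. The helper `make_rng` is hypothetical and not part of chromo; how MCRun actually passes the generator to the Fortran codes is not shown.

```python
import numpy as np


def make_rng(seed, bit_generator=None):
    """Hypothetical helper: build a NumPy Generator from a seed.

    bit_generator is a BitGenerator class such as np.random.PCG64 or
    np.random.Philox; None selects NumPy's default.
    """
    if bit_generator is None:
        return np.random.default_rng(seed)
    return np.random.Generator(bit_generator(seed))


rng_default = make_rng(1234)
rng_philox = make_rng(1234, np.random.Philox)

# Same seed and same bit generator give the same stream.
a = make_rng(1234).random(3)
b = make_rng(1234).random(3)

# The state is a plain dict that can be saved and restored from Python.
state = rng_default.bit_generator.state
x = rng_default.random(5)
rng_default.bit_generator.state = state
y = rng_default.random(5)  # identical to x
```

Because seeding and state live entirely on the Python side, the Fortran code never needs its own seed routine.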

I performed some crude benchmarks to see whether this change impacts the speed of the generators. SIBYLL-2.1 is the fastest, it got maybe 5 % slower. EPOS is one of the slowest models, it got 13 % faster.

Benchmark: Runtime to generate 10000 pp events at 1 TeV and write as hepmc3.

Branch        EPOS-LHC   Sibyll-2.1
main          51 s       3.1 s
this branch   45 s       3.3 s
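The relative changes can be recomputed from the benchmark runtimes. This is a small arithmetic sketch; the function name `relative_change` is made up for illustration. It distinguishes the change in runtime from the change in throughput, which differ slightly.

```python
# Runtimes in seconds (main, this branch), taken from the benchmark numbers.
runtimes = {"EPOS-LHC": (51.0, 45.0), "Sibyll-2.1": (3.1, 3.3)}


def relative_change(before, after):
    """Return (runtime change, throughput change) as fractions."""
    return after / before - 1.0, before / after - 1.0


for name, (main, branch) in runtimes.items():
    runtime, throughput = relative_change(main, branch)
    print(f"{name}: runtime {runtime:+.1%}, throughput {throughput:+.1%}")
```

For EPOS-LHC the runtime drops by about 12 %, which corresponds to about 13 % higher throughput; for Sibyll-2.1 the runtime grows by about 6 %.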

test_rng_state fails for EPOS and UrQMD. EPOS used to pass. We can fix this in a separate PR. I did some investigation but could not find out why this is happening.

gasdev and spgasdev (used by the Sibylls) are now implemented in terms of NumPy's normal random generator. This allows us to get rid of the extra state that the old implementations held. If we want reproducible randomness, we need to get rid of random functions that cache some results.
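The following toy example shows why a cached deviate breaks state restoration. `CachedGauss` is a hypothetical Box-Muller generator that keeps a spare value, similar in spirit to a classic gasdev routine; it is not the code that was removed from the Sibylls.

```python
import math

import numpy as np


class CachedGauss:
    """Toy Box-Muller generator that caches the second deviate."""

    def __init__(self, rng):
        self.rng = rng
        self.spare = None  # hidden state that lives outside the RNG

    def __call__(self):
        if self.spare is not None:
            value, self.spare = self.spare, None
            return value
        u1 = 1.0 - self.rng.random()  # in (0, 1], avoids log(0)
        u2 = self.rng.random()
        r = math.sqrt(-2.0 * math.log(u1))
        self.spare = r * math.sin(2.0 * math.pi * u2)
        return r * math.cos(2.0 * math.pi * u2)


# With a cache: restoring the RNG state does not restore the spare value.
rng = np.random.default_rng(1)
gauss = CachedGauss(rng)
gauss()  # fills the cache
state = rng.bit_generator.state
first = [gauss() for _ in range(3)]
rng.bit_generator.state = state
second = [gauss() for _ in range(3)]
cached_reproducible = first == second

# Without a cache: the RNG state fully determines the sequence.
rng = np.random.default_rng(1)
rng.normal()
state = rng.bit_generator.state
first = [rng.normal() for _ in range(3)]
rng.bit_generator.state = state
second = [rng.normal() for _ in range(3)]
stateless_reproducible = first == second
```

In the first case the sequence after restoring is shifted by one value, because the spare deviate was consumed in the first pass and is missing in the second.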

Other changes

HDembinski commented 1 year ago

The failure on Windows is caused by a single case in test_generators.py: DpmjetIII306-He-air-cms2ft. Perhaps regenerating the reference fixes the issue, but I opt for deactivating test_generators on Windows altogether. It is enough to test this on macOS and Linux, and the Windows run generally takes much longer than the others. Skipping test_generators on Windows should fix that.
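One common way to skip a whole test module on Windows is a module-level pytest marker. This is a sketch of that technique, not necessarily how the PR implements it; the reason string and the placeholder test are made up for illustration.

```python
import sys

import pytest

# Module-level marker: pytest applies it to every test in this file.
pytestmark = pytest.mark.skipif(
    sys.platform == "win32",
    reason="reference comparison is only checked on macOS and Linux",
)


def test_placeholder():
    # Hypothetical test body; stands in for the real generator tests.
    assert True
```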

HDembinski commented 1 year ago

I understand the issue with EPOS now. EPOS computes some cached numbers when it generates the first event. If you save the state of the RNG before the first event is created, then generate some events and restore the state, the next event is not going to be the same again, since this time the cache is already filled and the random numbers used to fill it are not drawn again.

We can work around this particular issue by generating a dummy event at the end of initialization, but if EPOS uses this technique in other places (it seems so), then it is going to be impossible to support RNG state saving for EPOS.
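The effect and the proposed workaround can be reproduced with a toy model. `ToyModel` is a hypothetical stand-in that draws extra random numbers once, on its first event; it does not reflect EPOS internals.

```python
import numpy as np


class ToyModel:
    """Toy generator that lazily fills a cache on the first event."""

    def __init__(self, rng, warm_up=False):
        self.rng = rng
        self.cache = None
        if warm_up:
            self.event()  # dummy event at the end of initialization

    def event(self):
        if self.cache is None:
            # Consumes random numbers only once, on the first call.
            self.cache = self.rng.random(4).sum()
        return self.cache + self.rng.random()


def replay_matches(warm_up):
    """Save the RNG state, generate events, restore, and compare."""
    rng = np.random.default_rng(7)
    model = ToyModel(rng, warm_up=warm_up)
    state = rng.bit_generator.state  # saved before the first real event
    first = model.event()
    model.event()
    rng.bit_generator.state = state  # restore
    return bool(model.event() == first)
```

Here `replay_matches(False)` fails to reproduce the first event, while `replay_matches(True)` succeeds because the cache is filled before the state is saved. The workaround only helps if the first event is the sole place where such a cache is filled.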

HDembinski commented 1 year ago

> There is one issue remaining, namely the model.cfg broke parallel builds, at least on Windows. I guess this is by design. If you have many cores and no mold, building 2 models takes the same time as building the entire package. Could that be made optional?

I don't know how to fix this or how to make it optional. It should not make a difference for CMake whether we ask it to build the entire directory or only specific targets. Can you give Ninja on Windows a try? Perhaps it works better.

I think the benefits of the new system outweigh the drawbacks. Support for Windows is broken anyway, and putting a lot of effort into this seems like wasted time, since I don't think we have actual users on Windows.

HDembinski commented 1 year ago

@afedynitch The Windows runner needs 17 min to compile, the Linux runner needs 16 min. It does not look like building on Windows is slower?

afedynitch commented 1 year ago

> @afedynitch The Windows runner needs 17 min to compile, the Linux runner needs 16 min. It does not look like building on Windows is slower?

It doesn't make a big difference there because the CI builds on 2 threads, not 64. I'll try Ninja and hope it works with MinGW.

afedynitch commented 1 year ago

Ninja solves the problem. Builds in 82 seconds from scratch.