DataMedSci / pymchelper

Python toolkit for SHIELD-HIT12A and FLUKA
http://datamedsci.github.io/pymchelper/

faster MCPL writing using numpy #663

Closed grzanka closed 1 year ago

grzanka commented 1 year ago

It seems that for large files we need to use NumPy instead of pure Python. For 100 files with 10^7 particles, conversion to MCPL with NumPy takes about 2 minutes and produces a 3 GB phase-space file. Pure Python takes weeks.
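To illustrate where the difference comes from, here is a minimal sketch (not the pymchelper code) comparing per-value packing with the standard `struct` module against a single NumPy `tobytes()` call. The function names `pack_with_loop` and `pack_with_numpy` are hypothetical, and little-endian 32-bit floats are an assumption made only for this example.

```python
import struct

import numpy as np


def pack_with_loop(values):
    """Pack each value separately: one Python-level call per number."""
    return b"".join(struct.pack("<f", float(v)) for v in values)


def pack_with_numpy(values):
    """Convert the whole array to little-endian float32 and dump it in one call."""
    return np.asarray(values, dtype="<f4").tobytes()


# Values chosen to be exactly representable as float32
values = np.arange(1000, dtype=np.float64) * 0.125

# Both approaches produce the same bytes; only the cost per value differs
assert pack_with_loop(values) == pack_with_numpy(values)
```

Both functions return identical bytes. The loop version pays Python interpreter overhead for every value, which is what becomes prohibitive at 10^7 particles per file.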

reviewpad[bot] commented 1 year ago

AI-Generated Summary: This pull request reformats and refactors several files in the 'pymchelper' project.

For the 'input_output.py' file, a logging message is added to provide status information during file reading. Furthermore, the log levels are changed from 'info' to 'debug' for messages related to page concatenation and averaging.
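As a rough illustration of the log-level change described above, the following sketch uses the standard `logging` module. The logger name `demo_reader`, the function `read_pages`, and the message texts are hypothetical and are not taken from `input_output.py`.

```python
import logging

logger = logging.getLogger("demo_reader")  # hypothetical logger name


def read_pages(paths):
    """Pretend to read files, logging status at INFO and details at DEBUG."""
    pages = []
    for path in paths:
        # Status message shown at the default verbosity of the application
        logger.info("Reading file %s", path)
        pages.append(path)  # placeholder for the real reading step
    # Detail message, visible only when DEBUG logging is enabled
    logger.debug("Concatenating %d pages", len(pages))
    return pages
```

With the logger level set to `INFO`, only the per-file status lines appear; the concatenation message shows up once the level is lowered to `DEBUG`.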

Regarding the 'binary_spec.py' file, a new enum 'phase_space_population' is added to the 'SHBDOTagID' class. Empty tuples and redundant spaces are cleaned up in this file.

In the 'mcpl.py' file, the 'write_single_page' function is changed significantly. Instead of iterating over the particle data in a basic for loop and encoding each particle separately, it now uses numpy to improve efficiency. The arrangement of the particle data is modified to fit the Monte Carlo Particle List (MCPL) format, and the binary structures now hold more particle-related fields. These changes improve structure and performance, particularly for large arrays.
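One common way to write many fixed-size binary records with numpy is a structured dtype. The sketch below shows the general technique only: the record layout (`record_dtype`) and the function `pack_particles` are hypothetical simplifications, not the real MCPL particle layout and not the code in `mcpl.py`.

```python
import numpy as np

# Hypothetical, simplified record layout for illustration only;
# the real MCPL particle record is defined by the MCPL format itself.
record_dtype = np.dtype([
    ("position", "<f4", (3,)),   # x, y, z
    ("direction", "<f4", (3,)),  # direction cosines
    ("energy", "<f4"),
    ("weight", "<f4"),
    ("pdg_code", "<i4"),
])


def pack_particles(position, direction, energy, weight, pdg_code):
    """Fill one structured array column by column and serialise it at once."""
    records = np.empty(len(energy), dtype=record_dtype)
    records["position"] = position
    records["direction"] = direction
    records["energy"] = energy
    records["weight"] = weight
    records["pdg_code"] = pdg_code
    return records.tobytes()


n = 4
payload = pack_particles(
    position=np.zeros((n, 3)),
    direction=np.tile([0.0, 0.0, 1.0], (n, 1)),
    energy=np.full(n, 150.0),
    weight=np.ones(n),
    pdg_code=np.full(n, 2212),  # 2212 is the PDG code of the proton
)

# Round trip: the bytes can be reinterpreted with the same dtype
restored = np.frombuffer(payload, dtype=record_dtype)
assert len(payload) == n * record_dtype.itemsize
```

Each column is assigned as a whole array, so there is no per-particle Python loop, and `tobytes()` emits the records in the interleaved order the dtype describes.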

In total, these changes improve the readability, efficiency, and robustness of the code.

reviewpad[bot] commented 1 year ago

AI-Generated Summary: This pull request changes multiple files. In pymchelper/input_output.py, a log message for file reading has been added, and the debug messages for concatenating and averaging have been updated. A new class phase_space_population has been added in pymchelper/readers/shieldhit/binary_spec.py, along with minor cleanups such as removing unnecessary line breaks and rearranging code segments. In pymchelper/writers/mcpl.py, the writing process now uses numpy to handle large arrays and improve performance, and the structure of the written data has been adjusted to match the MCPL format. Finally, a new unit test that validates the particle's direction has been added in tests/test_mcpl.py.