threeML / hawc_hal

HAWC Accelerated Likelihood - python-only framework for HAWC data analysis
BSD 3-Clause "New" or "Revised" License

Multiprocessing the loading of ROOT files #91

Closed torresramiro350 closed 6 months ago

torresramiro350 commented 8 months ago

Summary of changes:

ROOT files are now read in parallel with multiprocessing.Pool, which makes loading faster.

Multiprocessing the reading of ROOT files

This pull request reduces the loading time of ROOT files by using the multiprocessing module. I use multiprocessing.Pool, which creates a pool of worker processes. Because each worker is a separate process with its own interpreter, the work is not serialized by Python's Global Interpreter Lock (GIL). The changes are implemented in from_root_file.py and response.py. Loading improves most noticeably for energy estimators, which used to take around 5-7 minutes.
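The pattern described above can be sketched as follows. This is a minimal illustration and not the code in from_root_file.py or response.py: read_bin and load_bins are hypothetical names, and read_bin only does placeholder CPU-bound work where a real reader would open the ROOT file and extract one analysis bin.

```python
import multiprocessing as mp


def read_bin(bin_id):
    """Hypothetical stand-in for reading one analysis bin from a ROOT file."""
    # A real reader would open the file and extract arrays for this bin;
    # here we do some CPU-bound work so the example runs anywhere.
    return bin_id, sum(i * i for i in range(1000 * (bin_id + 1)))


def load_bins(bin_ids, n_workers=1):
    """Read all bins, serially or with a pool of n_workers processes."""
    if n_workers <= 1:
        # Serial fallback: no pool is created.
        return dict(read_bin(b) for b in bin_ids)
    # Each worker is a separate process, so the GIL does not serialize them.
    # pool.map returns results in the same order as the input.
    with mp.Pool(processes=n_workers) as pool:
        return dict(pool.map(read_bin, bin_ids))


if __name__ == "__main__":
    serial = load_bins(range(8))
    parallel = load_bins(range(8), n_workers=2)
    print(serial == parallel)
```

The worker function is defined at module level so that it can be pickled and sent to the worker processes, and the entry point is guarded with `if __name__ == "__main__":` so the script also works with start methods that re-import the main module.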

How to enable this:

The changes are enabled by providing an n_workers parameter to HAL. For example,

hawc_like = HAL("HAWC", maptree, response, roi, n_workers=2)

The number of workers should not exceed the number of available cores.
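One way to follow the advice about not exceeding the available cores is to cap the requested value before passing it on. The helper below is a hypothetical sketch and not part of hawc_hal. It only uses os.cpu_count() from the standard library.

```python
import os


def clamp_workers(requested):
    """Hypothetical helper: cap a requested worker count at the core count."""
    if requested < 1:
        raise ValueError("n_workers must be at least 1")
    available = os.cpu_count() or 1  # cpu_count() can return None
    return min(requested, available)
```

The returned value could then be passed as n_workers, for example n_workers=clamp_workers(4).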

Further changes:

Major refactoring of the code for both the map tree and response file modules. Some of the changes require Python 3.10+, so using them will require updating to a newer environment. threeML is not yet ready for Python 3.11, but the current changes work as of Python 3.10.13.

Feel free to test the changes and let me know if there are any issues.