Closed rettigl closed 6 months ago
It appears the culprit is scipy. The issue occures for scipy >= 1.10.0, and only if momentum correction is performed. A major change in this version has been to interpolation code, which is used for momentum correction. I will further look into this.
The issue seems to be that the RegularGridInterpolator in >=1.10 does prevent efficient parallelization of dataframe computation: With 1.9.3: With 1.10.0:
https://stackoverflow.com/questions/75427538/regulargridinterpolator-excruciatingly-slow-compared-to-interp2d Related: https://github.com/scipy/scipy/issues/18010 https://github.com/scipy/scipy/issues/17356
It seems like this is a known issue, which has not been solved yet. Simplest solution is to limit version of scipy for now.
Test new interpolation method with scipy.ndimate.map_coordinates: scipy 1.9.3: scipy 1.10.0:
Considering that updating a package can have such a big impact on performance, might be a good idea if we make the pull request bot to also benchmark the binning, with all corrections and calibrations before updating. Not sure how feasible that is.
Considering that updating a package can have such a big impact on performance, might be a good idea if we make the pull request bot to also benchmark the binning, with all corrections and calibrations before updating. Not sure how feasible that is.
I also considered that, however, I am not sure how reproducible results on the github workers will be. I can still try to set something up, if even only for local tests.
Came here from the scipy issue. Would be useful to retry with scipy 1.13rc1 , as RGI should be significantly faster in scipy 1.13.
Describe the bug The binning is really slow with the current package versions in the poetry.lock file
To Reproduce Steps to reproduce the behavior:
Expected behavior Previously, binning took much less time:
Poetry environment:
Reference enviroment: