GeoStat-Framework / GSTools

GSTools - A geostatistical toolbox: random fields, variogram estimation, covariance models, kriging and much more
https://geostat-framework.org
GNU Lesser General Public License v3.0

Add a global variable to set number of parallel threads #336

Closed LSchueler closed 7 months ago

LSchueler commented 8 months ago

As pointed out by pavlovc2 in discussion #333, it would be helpful to set the number of parallel threads in GSTools directly rather than relying only on environment variables.

I didn't have a lot of time to put this together, but in this first draft you can set the number of threads with the global variable config.NUM_THREADS. At the moment, this only works with the Cython code, not with the Rust code. Also, for now, the default number of threads is 1.
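
A minimal sketch of how a module-level setting such as config.NUM_THREADS might be consulted by a compute routine. The `config` namespace and the `summed_squares` function are hypothetical stand-ins, not GSTools code, and the worker pool from `concurrent.futures` is used only to keep the example runnable in plain Python.

```python
from concurrent.futures import ThreadPoolExecutor
from types import SimpleNamespace

# Hypothetical stand-in for a package-level config module.
config = SimpleNamespace(NUM_THREADS=1)


def summed_squares(values, num_threads=None):
    """Sum of squares, split across `num_threads` workers.

    If `num_threads` is None, fall back to the global config value.
    """
    if num_threads is None:
        num_threads = config.NUM_THREADS
    num_threads = max(1, int(num_threads))
    # Split the input into one chunk per worker.
    chunks = [values[i::num_threads] for i in range(num_threads)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        partial = pool.map(lambda chunk: sum(v * v for v in chunk), chunks)
        return sum(partial)


data = list(range(1000))
result_default = summed_squares(data)  # uses config.NUM_THREADS == 1
config.NUM_THREADS = 4                 # change the global setting
result_global = summed_squares(data)   # now uses 4 workers
```

The point of the pattern is that callers who never pass `num_threads` still pick up a change to the global value, while an explicit argument overrides it for a single call.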

TODO

LSchueler commented 8 months ago

The Rust package GSTools-core is now ready for setting the number of parallel threads and I'm quite happy with the implementation.

For Cython, I used a bit of an ugly function to set defaults, without having to use Python variables.

I think the only thing left would be to update the changelog, if the review gets through ;-)

MuellerSeb commented 8 months ago

See https://github.com/GeoStat-Framework/GSTools/issues/337 for rtd issue.

MuellerSeb commented 8 months ago

This looks promising. I was also trying to wrap my head around the fact that prange uses None as its default value. Using a Python type here is idiotic. We should open an issue asking that prange accept 0 or -1 to mimic the None behavior. The function you used is ugly, but I see that it's the best option at the moment.
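
To illustrate the sentinel idea in plain Python: a minimal sketch of a helper that accepts only integers and treats 0 or -1 as "use the default". `normalize_num_threads` is a hypothetical name, not part of Cython or GSTools, and falling back to `os.cpu_count()` is an assumption made for this sketch about what the default should mean.

```python
import os


def normalize_num_threads(num_threads):
    """Map an integer thread setting to a concrete positive count.

    0 and -1 are treated as "use the default", here taken to be the
    number of logical CPUs (an assumption for this sketch).
    """
    if num_threads in (0, -1):
        return os.cpu_count() or 1  # cpu_count() may return None
    if num_threads < -1:
        raise ValueError(f"invalid thread count: {num_threads}")
    return num_threads
```

With an integer-only sentinel like this, the value can be carried in a typed C integer without involving a Python object.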

Is it possible to test how many cores are used during a function call? That would be a nice way to check that the setting is actually working.

LSchueler commented 8 months ago

I mean, we could get the number of cores in config.py, set it there, and use num_threads=1 as the default value in the Cython and Rust functions. But it feels cleaner to get and set the default values inside the Cython and Rust functions themselves.

Regarding the testing, I think it's difficult to accurately test the number of threads used if we don't simply want to check the num_threads variable we just set. I checked on my machine, and the compute times decrease nearly linearly with the number of threads used (for num_threads < 13 at least). So, locally it works. But I'm not sure how many threads we can use in GitHub Actions, or whether we really want to check the run times of different function calls.
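
A rough sketch of the kind of local check described above: time the same call for several thread counts and compare the results. `time_for_thread_counts`, `speedups`, and the `run` callable are hypothetical names; the caller supplies a function that takes the thread count and performs the work. Wall-clock timings are noisy, especially on shared CI runners, so this is probably better suited to manual inspection than to a strict assertion.

```python
import time


def time_for_thread_counts(run, thread_counts, repeats=3):
    """Return {num_threads: best wall-clock seconds} for `run(num_threads)`."""
    timings = {}
    for n in thread_counts:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            run(n)
            best = min(best, time.perf_counter() - start)
        timings[n] = best
    return timings


def speedups(timings):
    """Speedup of each entry relative to the smallest thread count."""
    base = timings[min(timings)]
    return {n: base / t for n, t in timings.items()}
```

In practice, `run` would set the thread count (for example through config.NUM_THREADS) and then call the routine under test; near-linear scaling would show up as speedups close to the thread count.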

MuellerSeb commented 7 months ago

Created an issue in cython for prange: https://github.com/cython/cython/issues/5952