Open griembauer opened 2 years ago
I agree that is a problem, which is partly the reason I usually just use standard Python multiprocessing.Pool methods (like map_async) with run_command. Just curious, do you prefer ParallelModuleQueue for some specific reason?
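For readers unfamiliar with that pattern, here is a minimal sketch of it. It is not taken from GRASS itself: run_job and run_all are hypothetical names, the job is a time.sleep stand-in rather than a real module call, and it uses multiprocessing.pool.ThreadPool (the thread-backed pool with the same map_async method) instead of multiprocessing.Pool, assuming that each real job would only launch an external process.

```python
import time
from multiprocessing.pool import ThreadPool


def run_job(job):
    """Hypothetical stand-in for one module call.

    In a GRASS session the body could call grass.script.run_command(...)
    instead of sleeping (assumption, not shown here).
    """
    name, seconds = job
    time.sleep(seconds)
    return name


def run_all(jobs, nprocs):
    """Run jobs with at most nprocs active at once.

    A worker that finishes picks up the next pending job right away,
    so there are no fixed blocks of jobs.
    """
    with ThreadPool(processes=nprocs) as pool:
        # chunksize=1 hands jobs out one at a time instead of in batches.
        async_result = pool.map_async(run_job, jobs, chunksize=1)
        # get() blocks until all jobs are done; results keep input order.
        return async_result.get()


# Hypothetical job list: one slow job and three fast ones.
jobs = [("slow", 0.2), ("fast1", 0.05), ("fast2", 0.05), ("fast3", 0.05)]
finished = run_all(jobs, nprocs=2)
print(finished)
```

With two workers, the three fast jobs should all pass through the second worker while the slow job is still running on the first.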
No, not at all, I am just used to using it since it is the pygrass way ;)
Also, some GRASS modules from the temporal framework use ParallelModuleQueue, e.g. for aggregation:
https://github.com/OSGeo/grass/blob/1961472afeb7633c9b744b0a60c923fb9b1d4411/python/grass/temporal/aggregation.py#L267
The option to run GRASS modules in parallel (in Python) is implemented via the ParallelModuleQueue class. The standard way (?) is to define a processing queue via an nprocs parameter, add GRASS modules to be executed in parallel via the put() method, and finally start the parallel processing using the wait() method. The way it is implemented now, the queue seems to run a number of processes defined by nprocs and waits for all of them to finish before starting the next "block" of processes. This means that the longest process determines the duration of an entire processing "block". Ideally, free slots could be filled directly with pending processes from the queue instead.
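To make the difference concrete, here is a small sketch that only simulates the two scheduling strategies. It does not call GRASS or ParallelModuleQueue; the function names and the runtimes are made up for illustration.

```python
import heapq


def blockwise_makespan(durations, nprocs):
    """Total time when jobs run in blocks of nprocs and each block
    waits for its slowest job before the next block starts."""
    total = 0.0
    for start in range(0, len(durations), nprocs):
        total += max(durations[start:start + nprocs])
    return total


def slot_filling_makespan(durations, nprocs):
    """Total time when a pending job starts as soon as any slot is free."""
    if not durations:
        return 0.0
    # Min-heap holding the time at which each slot becomes free.
    slots = [0.0] * min(nprocs, len(durations))
    heapq.heapify(slots)
    for duration in durations:
        free_at = heapq.heappop(slots)  # earliest available slot
        heapq.heappush(slots, free_at + duration)
    return max(slots)


# Hypothetical runtimes in seconds, in queue order.
durations = [10, 1, 1, 1, 10, 1]
block_time = blockwise_makespan(durations, nprocs=2)
fill_time = slot_filling_makespan(durations, nprocs=2)
print(block_time, fill_time)  # 21.0 13.0
```

In this made-up case the block-wise strategy pays for a long job in both the first and the last block (10 + 1 + 10), while slot filling lets the short jobs run alongside the first long one.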