Falconus opened this issue 1 week ago
Thanks for your report @Falconus. If you only use commands, i.e., setting the nprocs parameter in t.rast.aggregate instead of via the GUI, does it also fail? Would you mind creating a command-line reproducible example with the North Carolina dataset?
Using the time series dataset from the Data page:
#Note: set the default number of threads for parallel computing to >1 (Settings → Preferences) before testing
t.info input=LST_Day_monthly@modis_lst
#Set region to maximum extent of LST_Day_monthly
g.region n=760180.124115 s=-415819.875885 e=1550934.464115 w=-448265.535885 -pa
#Probably not relevant, but I did this anyway
t.rast.series -n input=LST_Day_monthly@modis_lst method=count output=intersection
#Set region to subset
g.region n=550997 s=156914 e=626823 w=-56871
#Export region to polygon
v.in.region output=mask_area
#Reset region back to max extent
g.region n=760180.124115 s=-415819.875885 e=1550934.464115 w=-448265.535885 -pa
#Set mask from polygon
r.mask vector=mask_area@modis_lst
#The following two tasks fail. Note the granularity is set to 2 months, since 1 month doesn't do anything (nothing to aggregate)
t.rast.aggregate input=LST_Day_monthly output=LST_aggr basename=LST_ suffix=time granularity="2 months" method=average
t.rast.aggregate input=LST_Day_monthly output=LST_aggr basename=LST_ suffix=time granularity="2 months" method=average nprocs=1 --overwrite
#Remove mask
r.mask -r
#The following task succeeds without mask
t.rast.aggregate input=LST_Day_monthly output=LST_aggr basename=LST_ suffix=time granularity="2 months" method=average --overwrite
I think this is the same issue as #4297. There may be other tools impacted.
Yes, I guess e.g. t.rast.series is affected in the same way. In fact, virtually every Python module that uses OpenMP-parallelized modules under the hood would be affected.
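For example, a quick (untested) way to check whether t.rast.series shows the same behavior would be to reuse the mask and region from the reproducible example above, with the default thread count still set to >1 (the output name LST_series_test is just a placeholder):
#Re-apply the mask from the example above, then aggregate across the whole series
r.mask vector=mask_area
t.rast.series -n input=LST_Day_monthly@modis_lst method=average output=LST_series_test
r.mask -r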
Ideally, this is fixed in the Python modules, I guess. We could probably create a library function that:
Not sure if case b could be written to find an optimal balance between inner (OpenMP) and outer (worker process) parallelization.
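As an illustration only (a hypothetical helper, not an existing GRASS API), such a balance could look roughly like this:
def split_nprocs(nprocs_total, n_jobs):
    """Split a requested process count between outer Python workers
    and inner OpenMP threads, so that outer * inner <= nprocs_total."""
    # Never start more outer workers than there are jobs (e.g. granules to aggregate).
    outer = max(1, min(nprocs_total, n_jobs))
    # Hand the remaining cores to each worker for OpenMP-parallelized modules.
    inner = max(1, nprocs_total // outer)
    return outer, inner
For instance, split_nprocs(8, 3) would return (3, 2): three worker processes, each allowed two OpenMP threads.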
We use a very simplistic approach for something like this here: https://github.com/NVE/actinia_modules_nve/blob/762f55bac991c1b5424e87d04340d435800c0b0c/src/temporal/t.pytorch.predict/t.pytorch.predict.py#L695
I need to think this through more, but it seems to me there are two separate issues: one needs to be fixed in the C tools (#4297), and the other is how to deal with the nprocs parameter in the Python temporal tools (and there is also the environment variable to consider).
Description
The t.rast.aggregate tool fails when a mask is set and the number of threads for parallel computing is set to >1 in the GUI settings (Settings → Preferences). When the setting is reset to 1 and saved, the tool works as expected with no errors. When the mask is removed, it also works with no errors.
To reproduce
t.rast.aggregate --overwrite input=uas_dsm@assignment5b output=uas_dsm_aggr basename=uas_dsm_aggr suffix=time granularity="1 months" nprocs=1
The nprocs parameter had no effect, regardless of whether it was set to 1 or 16.
Expected behavior
The tool should not fail due to the mask when the default nprocs setting is >1.
Workaround
Set the default nprocs to 1, or remove the mask before running the t.rast.aggregate step.
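For scripted workflows, a minimal sketch of the "remove the mask" variant of this workaround could look like the following (using the standard grass.script API; the mask_area vector and dataset names are taken from the reproducible example above):
import grass.script as gs

# Check whether a raster MASK is active in the current mapset.
mask_active = bool(gs.find_file("MASK", element="cell")["name"])

# Temporarily remove the mask for the aggregation step.
if mask_active:
    gs.run_command("r.mask", flags="r")

gs.run_command(
    "t.rast.aggregate",
    input="LST_Day_monthly@modis_lst",
    output="LST_aggr",
    basename="LST_",
    suffix="time",
    granularity="2 months",
    method="average",
    overwrite=True,
)

# Re-apply the mask afterwards (assuming it was created from the mask_area vector).
if mask_active:
    gs.run_command("r.mask", vector="mask_area")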