Open cyliang368 opened 1 week ago
This PR failed on four tests that ran t.rast.what in parallel. Here is a part of its detail:
In the test, the args in SimpleModule still go through the parser I modified in this PR. Although OpenMP is not supported, this Python module (t.rast.what) can still be parallelized by subprocess in Python. Without OpenMP, nprocs=4 is passed to this Python module before this PR, but nprocs is changed to 1 in this PR before it is passed to this Python module. That's why the tests failed.
Original: t.rast.what nprocs=4 -> parser (nothing happens) -> nprocs=4 main function in Python
This PR: t.rast.what nprocs=4 -> parser (nprocs is set to 1) -> nprocs=1 main function in Python
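The difference between the two paths can be modeled in a few lines of Python. This is only a sketch: `parse_before`, `parse_in_pr`, `python_tool_workers`, and the `openmp_available` flag are hypothetical names, not the GRASS parser API, and the sketch assumes the parser in this PR forces nprocs to 1 whenever OpenMP is unavailable, as described above.

```python
def parse_before(options):
    """Model of the original parser: nprocs is passed through untouched."""
    return dict(options)


def parse_in_pr(options, openmp_available):
    """Model of the parser in this PR: without OpenMP, nprocs is forced to 1."""
    parsed = dict(options)
    if not openmp_available and "nprocs" in parsed:
        parsed["nprocs"] = "1"
    return parsed


def python_tool_workers(parsed):
    """A Python tool such as t.rast.what sizes its subprocess pool from nprocs."""
    return int(parsed["nprocs"])


request = {"nprocs": "4"}
workers_before = python_tool_workers(parse_before(request))
workers_in_pr = python_tool_workers(parse_in_pr(request, openmp_available=False))
print(workers_before, workers_in_pr)
```

In this model the Python tool never uses OpenMP, yet it loses its parallelism because the value was rewritten before the tool saw it.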
My questions are:
subprocess when OpenMP is not supported?
@HuidaeCho @wenzeslaus @marisn Do you have any suggestions?
Perhaps the function should be called in the particular C tools, that would avoid the problem with Python tools.
Also, there is the NPROCS variable, the function should take into account: https://github.com/OSGeo/grass/blob/main/lib/gis/parser_standard_options.c#L765
Perhaps the function should be called in the particular C tools, that would avoid the problem with Python tools.
I checked and found that every C module parallelized by OpenMP has the keyword "parallel", which I think it should. Thus, I use this keyword to distinguish whether a multithread/multiprocess is spawned by C or Python.
Also, there is the NPROCS variable, the function should take into account: https://github.com/OSGeo/grass/blob/main/lib/gis/parser_standard_options.c#L765
The parser runs after options are defined, so the function can also handle an environment variable.
Perhaps the function should be called in the particular C tools, that would avoid the problem with Python tools.
I checked and found that every C module parallelized by OpenMP has the keyword "parallel", which I think it should. Thus, I use this keyword to distinguish whether a multithread/multiprocess is spawned by C or Python.
That's probably not a good strategy. The parallel keyword was meant for any parallelization, not just OpenMP. We would have to have a new keyword, but I am not sure using a keyword is a good idea anyway.
Also, there is the NPROCS variable, the function should take into account: https://github.com/OSGeo/grass/blob/main/lib/gis/parser_standard_options.c#L765
The parser runs after options are defined, so the function can also handle an environment variable.
Perhaps the function should be called in the particular C tools...
That I think is a good approach. Other values require this approach too. RGB colors are one example, but even the current string provided to nprocs needs to be converted to an integer.
I left the parser unchanged and just added a helper function instead. A C module can call this function if it is needed.
That's probably not a good strategy. The parallel keyword was meant for any parallelization, not just OpenMP. We would have to have a new keyword, but I am not sure using a keyword is a good idea anyway.
Though this PR does not use the keyword now, I think new keywords should be considered. As you said, parallelizations could be done by different libraries. New keywords to indicate which libraries/methods are used can help with documentation and maintenance.
New keywords to indicate which libraries/methods are used can help with documentation and maintenance.
Agreed. I think this will work well when we add a description for each keyword, an extended version of what we have for topics (topic keywords).
Should the function just return number of threads as an integer?
You are right. It returns the value now, and it won't change the answer of the nprocs option.
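A rough Python model of that contract, for illustration only: the real helper is a C function, and `Option`, `resolve_threads`, and `openmp_available` here are hypothetical stand-ins. The sketch assumes the helper falls back to one thread when OpenMP is not available. The point is that it returns an integer and leaves the option's answer string as the user gave it.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """Hypothetical stand-in for a parsed option; answer is a string."""
    key: str
    answer: str


def resolve_threads(opt, openmp_available):
    """Return the thread count to use without modifying opt.answer."""
    requested = int(opt.answer)  # the parser hands over a string
    if not openmp_available:
        return 1  # assumption: no OpenMP means single-threaded C code
    return max(1, requested)


nprocs = Option(key="nprocs", answer="4")
threads_without_openmp = resolve_threads(nprocs, openmp_available=False)
threads_with_openmp = resolve_threads(nprocs, openmp_available=True)
```

Because the answer is untouched, a Python tool reading the same option would still see the value the user requested.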
Can you please provide a breakdown of how this will behave with the default value for nprocs and the possibility to change that using g.gisenv or settings in the GUI? (Previously mentioned by @petrasovaa)
Let's take r.texture as an example.
The option objects are defined here in main.c under r.texture.
https://github.com/OSGeo/grass/blob/b5bfcd576541c1b4d43187c4d66ff0024167bab1/raster/r.texture/main.c#L107-L113
The G_define_standard_option function is called.
https://github.com/OSGeo/grass/blob/b5bfcd576541c1b4d43187c4d66ff0024167bab1/lib/gis/parser_standard_options.c#L139
The environment variable is fetched here by G_getenv_nofatal.
https://github.com/OSGeo/grass/blob/b5bfcd576541c1b4d43187c4d66ff0024167bab1/lib/gis/parser_standard_options.c#L757-L770
Then, the G_parser is called after all option and flag are defined.
https://github.com/OSGeo/grass/blob/b5bfcd576541c1b4d43187c4d66ff0024167bab1/raster/r.texture/main.c#L166-L167
I put the helper function in G_parser in the previous commits. Even if the environment variable changes, the parser will get it and handle it. However, we want particular C modules instead of the parser to call it, so it does not matter now. That's my understanding. I hope this is what you need.
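The order of precedence in that walk-through can be sketched in Python. The names `define_nprocs_option` and `parse` are hypothetical, the `gisenv` dict stands in for the variables managed by g.gisenv, and the built-in default of "1" is an assumption rather than something stated in this thread.

```python
def define_nprocs_option(gisenv):
    """Model of defining the standard option: NPROCS overrides the default."""
    answer = "1"  # assumed built-in default
    stored = gisenv.get("NPROCS")  # plays the role of G_getenv_nofatal
    if stored:
        answer = stored
    return {"key": "nprocs", "answer": answer}


def parse(option, argv):
    """Model of the parser: an explicit nprocs=N on the command line wins."""
    for arg in argv:
        key, _, value = arg.partition("=")
        if key == option["key"] and value:
            option["answer"] = value
    return option


default_only = parse(define_nprocs_option({}), [])
from_gisenv = parse(define_nprocs_option({"NPROCS": "8"}), [])
from_cli = parse(define_nprocs_option({"NPROCS": "8"}), ["nprocs=2"])
```

In this model the variable is read when the option is defined, so a changed NPROCS value would be seen the next time the tool runs, and a helper called after parsing only ever sees the final answer.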
Based on the discussion in https://github.com/OSGeo/grass/pull/3917, this PR adds a function to make the parser determine the number of threads, mainly for the standard option G_OPT_M_NPROCS.