OSGeo / grass

GRASS GIS - free and open-source geospatial processing engine
https://grass.osgeo.org
Other
791 stars 288 forks source link

lib/gis: Add a helper function to determine the number of threads for OpenMP #3929

Open cyliang368 opened 1 week ago

cyliang368 commented 1 week ago

Based on the discussion in https://github.com/OSGeo/grass/pull/3917, this PR adds a function to make the parser determine the number of threads, mainly for the standard option G_OPT_M_NPROCS.

  • nprocs > 0 for min(nprocs, max_threads_on_device)
  • nprocs <= 0 for max(max_threads_on_device + nprocs, 1) (e.g., nprocs=0 for max_threads_on_device, nprocs=-1 for max_threads_on_device - 1, and nprocs=-10000 with 24 threads for 1, etc.)
cyliang368 commented 6 days ago

This PR failed on four tests that ran t.rast.what in parallel. Here is a part of its detail:

``` ====================================================================== FAIL: test_row_stdout_where_parallel_cat (__main__.TestRasterWhat) ---------------------------------------------------------------------- Traceback (most recent call last): File "temporal/t.rast.what/testsuite/test_what.py", line 273, in test_row_stdout_where_parallel_cat self.assertLooksLike(text, str(t_rast_what.outputs.stdout)) File "etc/python/grass/gunittest/case.py", line 202, in assertLooksLike self.fail(self._formatMessage(msg, standardMsg)) AssertionError: "cat|x|y|start|end|value 1|[115](https://github.com/OSGeo/grass/actions/runs/9669690114/job/26676756189?pr=3929#step:14:116).0043586274|36.3593955783|2001-04-01 00:00:00|2001-07-01 00:00:00|200 2|79.6816763826|45.2391522853|2001-04-01 00:00:00|2001-07-01 00:00:00|200 3|97.4892579600|79.2347263950|2001-04-01 00:00:00|2001-07-01 00:00:00|200 1|115.0043586274|36.3593955783|2001-07-01 00:00:00|2001-10-01 00:00:00|300 2|79.6816763826|45.2391522853|2001-07-01 00:00:00|2001-10-01 00:00:00|300 3|97.4892579600|79.2347263950|2001-07-01 00:00:00|2001-10-01 00:00:00|300 1|115.0043586274|36.3593955783|2001-10-01 00:00:00|2002-01-01 00:00:00|400 2|79.6816763826|45.2391522853|2001-10-01 00:00:00|2002-01-01 00:00:00|400 3|97.4892579600|79.2347263950|2001-10-01 00:00:00|2002-01-01 00:00:00|400 " does not correspond with "cat|x|y|start|end|value 1|115.0043586274|36.3593955783|2001-04-01 00:00:00|2001-07-01 00:00:00|200 1|115.0043586274|36.3593955783|2001-07-01 00:00:00|2001-10-01 00:00:00|300 1|115.0043586274|36.3593955783|2001-10-01 00:00:00|2002-01-01 00:00:00|400 2|79.6816763826|45.2391522853|2001-04-01 00:00:00|2001-07-01 00:00:00|200 2|79.6816763826|45.2391522853|2001-07-01 00:00:00|2001-10-01 00:00:00|300 2|79.6816763826|45.2391522853|2001-10-01 00:00:00|2002-01-01 00:00:00|400 3|97.4892579600|79.2347263950|2001-04-01 00:00:00|2001-07-01 00:00:00|200 3|97.4892579600|79.2347263950|2001-07-01 00:00:00|2001-10-01 00:00:00|300 3|97.4892579600|79.2347263950|2001-10-01 00:00:00|2002-01-01 00:00:00|400 " ---------------------------------------------------------------------- Ran 17 tests in 38.755s FAILED (failures=4) ======================================================================== FAILED ./temporal/t.rast.what/testsuite/test_what.py (4 tests failed) ```

In the test, the args in SimpleModule still go through the parser I modified in this PR. Although OpenMP is not supported, this Python module (t.rast.what) can still be parallelized by subprocess in Python. Without OpenMP, nprocs=4 is passed to this Python module before this PR, but nprocs is changed to 1 in this PR before it is passed to this Python module. That's why the tests failed.

Original: t.rast.what nprocs=4 -> parser (nothing happens) -> nprocs=4 main function in Python This PR: t.rast.what nprocs=4 -> parser (nprocs is set to 1) -> nprocs=1 main function in Python

https://github.com/OSGeo/grass/blob/093895168e8e67595eceb00a747d837157b0085a/temporal/t.rast.what/testsuite/test_what.py#L275-L301

My questions are:

  1. Should we let Python modules run in parallel by subprocess when OpenMP is not supported?
  2. If we want to do so, how can I avoid the situation I mentioned above?

@HuidaeCho @wenzeslaus @marisn Do you have any suggestions?

petrasovaa commented 5 days ago

Perhaps the function should be called in the particular C tools, that would avoid the problem with Python tools.

Also, there is the NPROCS variable, the function should take into account: https://github.com/OSGeo/grass/blob/main/lib/gis/parser_standard_options.c#L765

cyliang368 commented 5 days ago

Perhaps the function should be called in the particular C tools, that would avoid the problem with Python tools.

I checked and found that every C module parallelized by OpenMP has the keyword "parallel", which I think it should. Thus, I use this keyword to distinguish whether a multithread/multiprocess is spawned by C or Python.

Also, there is the NPROCS variable, the function should take into account: https://github.com/OSGeo/grass/blob/main/lib/gis/parser_standard_options.c#L765

The parser runs after options are defined, so the function can also handle an environment variable.

petrasovaa commented 5 days ago

Perhaps the function should be called in the particular C tools, that would avoid the problem with Python tools.

I checked and found that every C module parallelized by OpenMP has the keyword "parallel", which I think it should. Thus, I use this keyword to distinguish whether a multithread/multiprocess is spawned by C or Python.

That's probably not a good strategy, the parallel keyword was meant for any parallelization, not just openmp, we would have to have a new keyword, but I am not sure using a keyword is good idea anyway.

Also, there is the NPROCS variable, the function should take into account: https://github.com/OSGeo/grass/blob/main/lib/gis/parser_standard_options.c#L765

The parser runs after options are defined, so the function can also handle an environment variable.

wenzeslaus commented 5 days ago

Perhaps the function should be called in the particular C tools...

That I think is a good approach. Other values require this approach too. RGB colors are one example, but even the current string provided to nprocs needs to be converted to integer.

cyliang368 commented 4 days ago

I left the parser unchanged and just added a helper function instead. A C module can call this function if it is needed.

That's probably not a good strategy, the parallel keyword was meant for any parallelization, not just openmp, we would have to have a new keyword, but I am not sure using a keyword is good idea anyway.

Though this PR does not use the keyword now, I think new keywords should be considered. As you said, parallelizations could be done by different libraries. New keywords to indicate which libraries/methods are used can help with documentation and maintenance.

wenzeslaus commented 3 days ago

New keywords to indicate which libraries/methods are used can help with documentation and maintenance.

Agreed. I think this will work well when we add description for each keyword, an extended version of what we have for topics (topic keywords).

cyliang368 commented 3 days ago

Should the function just return number of threads as an integer?

You are right. I make it return the value now, and it won't change the answer of the nprocs option.

Can you please provide a breakdown how this will behave with the default value for nprocs and the possibility to change that using g.gisenv or settings in GUI? (Previously mentioned by @petrasovaa)

Let's take r.texture as an example.

  1. The option objects are defined here in main.c under r.texture. https://github.com/OSGeo/grass/blob/b5bfcd576541c1b4d43187c4d66ff0024167bab1/raster/r.texture/main.c#L107-L113

  2. The G_define_standard_option function is called. https://github.com/OSGeo/grass/blob/b5bfcd576541c1b4d43187c4d66ff0024167bab1/lib/gis/parser_standard_options.c#L139

  3. The environment variable is fetched here by G_getenv_nofatal. https://github.com/OSGeo/grass/blob/b5bfcd576541c1b4d43187c4d66ff0024167bab1/lib/gis/parser_standard_options.c#L757-L770

  4. Then, the G_parser is called after all option and flag are defined. https://github.com/OSGeo/grass/blob/b5bfcd576541c1b4d43187c4d66ff0024167bab1/raster/r.texture/main.c#L166-L167

I put the helper function in G_parser in the previous commits. Even if the environment variable changes, the parser will get it and handle it. However, we want particular C modules instead of the parser to call it, so it does not matter now. That's my understanding. I hope this is what you need.