wdeconinck closed this pull request 1 year ago.
Please review @odlomax @ytremolet
Patch coverage: 88.80% and project coverage change: +0.04% :tada:
Comparison is base (9a2def8) 78.18% compared to head (a61d667) 78.22%.
I am getting build failures when building against oops (JCSDA) with this:
```
In file included from /home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/atlas/src/atlas/array/ArrayView.h:18:0,
                 from /home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/atlas/src/atlas/functionspace/CubedSphereColumns.h:13,
                 from /home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/atlas/src/atlas/functionspace.h:16,
                 from /home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/oops/src/oops/base/GeometryData.h:15,
                 from /home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/oops/src/oops/base/Geometry.h:23,
                 from /home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/oops/src/oops/assimilation/ControlVariable.h:20,
                 from /home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/oops/src/oops/assimilation/ControlIncrement.h:21,
                 from /home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/oops/src/oops/assimilation/MinimizerUtils.h:16,
                 from /home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/oops/src/oops/assimilation/MinimizerUtils.cc:8:
/home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/atlas/src/atlas/array/native/NativeArrayView.h:138:66: error: 'is_convertible_v' is not a member of 'std'
 template <typename ValueTp, typename = std::enable_if_t<std::is_convertible_v<ValueTp,value_type>>>
                                                              ^~~~
/home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/atlas/src/atlas/array/native/NativeArrayView.h:138:66: note: suggested alternative: 'is_convertible'
 template <typename ValueTp, typename = std::enable_if_t<std::is_convertible_v<ValueTp,value_type>>>
                                                              ^~~~
                                                              is_convertible
/home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/atlas/src/atlas/array/native/NativeArrayView.h:138:66: error: 'is_convertible_v' is not a member of 'std'
/home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/atlas/src/atlas/array/native/NativeArrayView.h:138:66: note: suggested alternative: 'is_convertible'
 template <typename ValueTp, typename = std::enable_if_t<std::is_convertible_v<ValueTp,value_type>>>
                                                              ^~~~
                                                              is_convertible
/home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/atlas/src/atlas/array/native/NativeArrayView.h:138:102: error: template argument 1 is invalid
 template <typename ValueTp, typename = std::enable_if_t<std::is_convertible_v<ValueTp,value_type>>>
                                                                                                     ^
/home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/atlas/src/atlas/array/native/NativeArrayView.h:138:105: error: expected unqualified-id before '>' token
 template <typename ValueTp, typename = std::enable_if_t<std::is_convertible_v<ValueTp,value_type>>>
                                                                                                        ^
gmake[2]: *** [oops/src/CMakeFiles/oops.dir/oops/generic/UnstructuredInterpolator.cc.o] Error 1
gmake[2]: *** [oops/src/CMakeFiles/oops.dir/oops/assimilation/MinimizerUtils.cc.o] Error 1
```
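For context, `std::is_convertible_v` is a C++17 variable template; pre-C++17 standard libraries only provide the `std::is_convertible<...>::value` form that the compiler suggests. A minimal illustration:

```cpp
#include <type_traits>

// Available from C++11 onwards:
static_assert(std::is_convertible<int, double>::value, "int converts to double");

// The _v shorthand used in NativeArrayView.h:138 requires -std=c++17:
#if __cplusplus >= 201703L
static_assert(std::is_convertible_v<int, double>, "int converts to double");
#endif

int main() { return 0; }
```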
@DJDavies2 Thanks for verifying compilation. You have observed this in multiple PRs now, so could you see if it is triggered in the develop branch as well? If so, create a GitHub issue specifically for the develop branch, and please mention the compiler, its version, and the full compile line including include directories and flags.
I have added the public propagation of the -std=c++17 flag to downstream projects to the develop branch, as I believe that is what is missing. A reintegration of this branch on develop should then hopefully fix this.
@ytremolet when you have time could you comment? Otherwise I'll just merge this in soon.
I have tested against the head of develop and it seems to work now; there have been changes updating to -std=c++17 in some downstream repositories, so that has probably fixed this issue.
I have added performance tests with OMP_NUM_THREADS=4, with the following output:
```
timing with for loops seq = 0.00333232    (triple nested for-loop, sequential)
timing with execution::seq = 0.0790008    (field::for_each_value, sequential)
timing with for loops par = 0.00102567    (triple nested for-loop, OpenMP on the outermost loop)
timing with execution::par = 0.0215601    (field::for_each_value, parallel)
```
The difference between plain for-loops and `field::for_each` is pretty dramatic (`field::for_each` is a factor 20-24 slower)! I don't know why the ArrayForEach tests showed no difference while we see such a large one here.
This should, however, not stop us from using this capability. The benefit is that optimisations can and WILL be applied in follow-up developments.
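For reference, a minimal sketch of the two patterns being timed (the field's rank, value type, and the `for_each` header location are assumptions):

```cpp
#include "atlas/array.h"
#include "atlas/field.h"
#include "atlas/field/for_each.h"  // assumed header location

// Doubles every value of a rank-3 field in the two ways compared above.
void double_values(atlas::Field& field) {
    auto view = atlas::array::make_view<double, 3>(field);
    const atlas::idx_t ni = view.shape(0);
    const atlas::idx_t nj = view.shape(1);
    const atlas::idx_t nk = view.shape(2);

    // "for loops par": triple nested loop, OpenMP on the outermost loop.
#pragma omp parallel for
    for (atlas::idx_t i = 0; i < ni; ++i) {
        for (atlas::idx_t j = 0; j < nj; ++j) {
            for (atlas::idx_t k = 0; k < nk; ++k) {
                view(i, j, k) *= 2.;
            }
        }
    }

    // "execution::par": the equivalent field::for_each_value call.
    atlas::field::for_each_value(atlas::execution::par, field,
                                 [](double& value) { value *= 2.; });
}
```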
The new test fails when I run it:
```
Running case 2: test field::for_each_value_masked; horizontal_dimension {0,2} mask_rank1 ...
[0] Test "test field::for_each_value_masked; horizontal_dimension {0,2} mask_rank1" failed with unhandled eckit::Exception: Assertion failed: (std::is_invocable_v<Mask,MaskArgs...>) in apply, line 265 of /home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/atlas/src/atlas/array/helpers/ArrayForEach.h @
[0] Stack trace: backtrace [1] stack has 38 addresses
[0] (/home/d03/frwd/cylc-run/atlas-139/share/build-mo-cray_gnu_debug/lib/libeckit.so+eckit::BackTrace::dump[abi:cxx11]())0x91
[0] (/home/d03/frwd/cylc-run/atlas-139/share/build-mo-cray_gnu_debug/lib/libeckit.so+eckit::Exception::Exception())0xb6
[0] (/home/d03/frwd/cylc-run/atlas-139/share/build-mo-cray_gnu_debug/lib/libeckit.so+eckit::AssertionFailed::AssertionFailed(std::__cxx11::basic_string<char, std::char_traits
```
and similar for rank2.
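For reference, the assertion that fails here merely checks that the supplied mask is callable with the loop indices; a minimal illustration (the index and mask types are assumptions):

```cpp
#include <type_traits>

int main() {
    // Hypothetical mask for two horizontal dimensions:
    auto mask = [](int i, int j) { return false; };

    // Essentially the check performed in ArrayForEach.h; note that
    // std::is_invocable_v is itself a C++17 library feature:
    static_assert(std::is_invocable_v<decltype(mask), int, int>,
                  "mask must be callable with the loop indices");
    return 0;
}
```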
@DJDavies2 Is this with gnu 7.3? Which build-type? (You could print the output of `atlas --info`.)
This failure should not have anything to do with the new test. Not sure how this occurs. It seems to pass on so many other compilers and build-types, including gnu 7.5.
@odlomax could you perhaps try to investigate or debug this on this particular configuration?
Yes, this is gnu 7.3.
I'm just having a look. Will get back to you later today.
I might have found a bug. If I can confirm it, shall I PR a solution into this branch?
I think I've fixed it: https://github.com/ecmwf/atlas/pull/145
tl;dr: GCC 7.3 is old and rubbish.
With the new changes from d97f973, the `field::for_each_value` timings did not seem to improve:

```
timing with for loops seq = 0.0035155
timing with for loops par = 0.00106079
timing with execution::seq = 0.092391
timing with execution::par = 0.0240693
```
Strange. I got the following when I ran the test in release:
```
37: timing with for loops seq = 0.000203149
37: timing with for loops par = 5.73367e-05
37: timing with execution::seq = 0.000196719
37: timing with execution::par = 5.94674e-05
```
Admittedly, there's a fair bit of variation between runs.
Apologies @odlomax, I indeed do recover the raw for-loop performance when optimisations are on. I had hacked the compile flags in my build and forgot to reset them.
@DJDavies2 Please let me know if you still have issues with the updated branch, and thanks once more for spotting this!
The test works now, thanks.
Great success!
Finally merged. Thanks for all your help!
Develop `field::for_each` functionality. For instance:
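(A minimal sketch of the usage; the header location and the field construction here are assumptions, not the PR's exact example.)

```cpp
#include "atlas/array.h"
#include "atlas/field.h"
#include "atlas/field/for_each.h"  // assumed header location

int main() {
    // (atlas library initialisation omitted for brevity)
    atlas::Field field("example", atlas::array::make_datatype<double>(),
                       atlas::array::make_shape(100, 70, 3));

    // Apply a function to every value of the field, parallelised with OpenMP:
    atlas::field::for_each_value(atlas::execution::par, field,
                                 [](double& value) { value = 0.; });
}
```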
We can also loop over `Field::horizontal_dimension()` only, hence extracting column-views. Note that `Field::horizontal_dimension()` returns a `std::vector<idx_t>` to cater for multiple horizontal dimensions in a field, which do not need to follow each other (e.g. BlockStructuredColumns fields of dimensions `{nblock,nlev,nproma}`, where `nblock` and `nproma` are the horizontal dimensions). In case there are extra dimensions (e.g. a "variable" dimension), the column views could be multidimensional:
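(A hypothetical sketch, reusing `field` from the sketch above; the name `field::for_each_column` and the callback signature are assumptions.)

```cpp
// Visit each column; 'column' is a view spanning the non-horizontal
// dimensions: rank 1 here (levels only), rank 2 if the field also had
// an extra "variable" dimension.
atlas::field::for_each_column(atlas::execution::par, field, [](auto&& column) {
    for (atlas::idx_t jlev = 0; jlev < column.shape(0); ++jlev) {
        column(jlev) = 0.;
    }
});
```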
There are further overloads that accept a masking function or a masking field (such as a ghost field). The ghost field is typically a one-dimensional field corresponding to the horizontal dimension. If there are multiple horizontal dimensions, then the ghost field can either be multi-dimensional, or be a one-dimensional view of the horizontal dimensions with indexing like `(ni*j + i)`.
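A sketch of the ghost-field variant (the name `for_each_value_masked` appears in the test output above; the exact argument order is an assumption, and `ghost_field` is a hypothetical ghost field):

```cpp
// Points where the ghost field is non-zero are skipped:
atlas::field::for_each_value_masked(atlas::execution::par, ghost_field, field,
                                    [](double& value) { value *= 2.; });
```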
If a masking function is supplied instead of a ghost field, and only the first field dimension is to be masked, then it can be defined as:
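(A sketch; the convention that the mask receives the first index and that non-zero means "skip" is an assumption.)

```cpp
// Hypothetical masking function: receives the index along the first field
// dimension; any further indices are ignored via the parameter pack.
auto mask = [](atlas::idx_t i, auto&&...) { return i < 2; };  // skip first two rows

atlas::field::for_each_value_masked(atlas::execution::par, mask, field,
                                    [](double& value) { value *= 2.; });
```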