ecmwf / atlas

A library for numerical weather prediction and climate modelling
https://sites.ecmwf.int/docs/atlas
Apache License 2.0

The test "atlas_fctest_field_host" fails with "ENABLE_CUDA" build option #216

Closed l90lpa closed 2 months ago

l90lpa commented 2 months ago

What happened?

When Atlas is built with the CMake option "ENABLE_CUDA", the test "atlas_fctest_field_host" fails at the following check: https://github.com/ecmwf/atlas/blob/51e389e24ad85855f3f61f8cbb1757b1c5777cc3/src/tests/field/fctest_field_host.F90#L52

I believe the cause is that, under the build option "ENABLE_CUDA", the DataStore variable device_updated_ is initialized to false here: https://github.com/ecmwf/atlas/blob/51e389e24ad85855f3f61f8cbb1757b1c5777cc3/src/atlas/array/native/NativeDataStore.h#L106

The same happens for WrappedDataStore. Because the flag starts out false, a freshly created field reports that its device copy needs an update, which the check at line 52 does not expect. I don't believe anything is broken in Atlas itself, only the test.

For reference the failing test output is:

109: Test command: /home/azureuser/projects/jedi/jedi-bundle/atlas/tools/atlas-run "/home/azureuser/projects/jedi/build-relwithdebinfo/atlas/src/tests/field/atlas_fctest_field_host"
109: Environment variables: 
109:  ATLAS_RUN_NGPUS=1
109:  OMP_NUM_THREADS=1
109: Test timeout computed to be: 1500
109: + export ECKIT_MPI_FORCE=serial
109: + /home/azureuser/projects/jedi/build-relwithdebinfo/atlas/src/tests/field/atlas_fctest_field_host
109: /home/azureuser/projects/jedi/jedi-bundle/atlas/src/tests/field/fctest_field_host.F90:52: warning: FCTEST_CHECK( .not. field%device_needs_update() )
109: STOP 1
1/1 Test #109: atlas_fctest_field_host ..........***Failed    0.48 sec
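
The suspected cause can be illustrated with a minimal sketch. This is a toy model, not Atlas's actual code: the class ToyDataStore and the helpers hostOnlyCheckPasses and demo are hypothetical names introduced here, and the only details taken from the report are the flag name device_updated_ and the failing check `.not. field%device_needs_update()`. It assumes that "needs update" is simply the negation of the flag.

```cpp
#include <cassert>

// Toy stand-in for a data store. NOT Atlas's real NativeDataStore;
// it only models the single flag discussed in this issue.
class ToyDataStore {
public:
    // device_updated_initial models the value the flag is initialized with.
    // Per the report, an ENABLE_CUDA build starts with false.
    explicit ToyDataStore(bool device_updated_initial)
        : device_updated_(device_updated_initial) {}

    // Assumption: the device copy needs an update whenever the flag is false.
    bool deviceNeedsUpdate() const { return !device_updated_; }

    // Models a host-to-device synchronisation (no real GPU work here).
    void updateDevice() { device_updated_ = true; }

private:
    bool device_updated_;
};

// Mirrors the Fortran check: FCTEST_CHECK( .not. field%device_needs_update() )
inline bool hostOnlyCheckPasses(const ToyDataStore& store) {
    return !store.deviceNeedsUpdate();
}

// Walks through the two initial states and the effect of a sync.
inline void demo() {
    ToyDataStore starts_false(false);           // as reported for ENABLE_CUDA
    assert(!hostOnlyCheckPasses(starts_false)); // the check fails
    starts_false.updateDevice();
    assert(hostOnlyCheckPasses(starts_false));  // passes after a sync

    ToyDataStore starts_true(true);             // hypothetical alternative
    assert(hostOnlyCheckPasses(starts_true));   // the check passes immediately
}
```

In this model the check only fails when the flag starts as false and no device update has happened yet, which is consistent with the view that the test's expectation, rather than the library, is at fault under "ENABLE_CUDA".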

What are the steps to reproduce the bug?

  1. Build Atlas 0.38.1 with the following CMake options:
    set(ENABLE_MPI ON)
    set(ENABLE_OMP ON)
    set(ENABLE_OMP_Fortran ON)
    set(ENABLE_FCKIT ON)
    set(ENABLE_ECTRANS ON)
    set(ENABLE_TESSELATION ON)
    set(ENABLE_FFTW ON)
    set(ENABLE_CUDA ON)
    set(ENABLE_GRIDTOOLS_STORAGE OFF)
  2. Run the test atlas_fctest_field_host:
    ctest -R atlas_fctest_field_host
  3. Observe the failing test.

Version

v0.38.1

Platform (OS and architecture)

Ubuntu 22.04

Organisation

JCSDA

l90lpa commented 2 months ago

Hi @fmahebert, @odlomax, and @yaswant, just pinging you so that you're aware.

wdeconinck commented 2 months ago

Thanks, @l90lpa, for reporting. Indeed, only the test was broken. This is now fixed in develop.