ComputationalRadiationPhysics / picongpu

Performance-Portable Particle-in-Cell Simulations for the Exascale Era :sparkles:
https://picongpu.readthedocs.io

ADIOS error #4552

Closed cbontoiu closed 1 year ago

cbontoiu commented 1 year ago

I just installed picongpu-dev and everything works (compilation + runtime), but when trying to read the output I get this error:

[AbstractIOHandlerImpl] IO Task OPEN_FILE failed with exception. Clearing IO queue and passing on the exception.
RuntimeError                              Traceback (most recent call last)
Cell In[8], line 11
      9 exec(open(sPeff + name + "/particles").read(), globals());
     10 exec(open(sPeff + name + "/densityQ").read(), globals());
---> 11 ts_fieldE    = OpenPMDTimeSeries(DIR_fieldE    + "/simOutput/openPMD/");  
     12 ts_particles = OpenPMDTimeSeries(DIR_particles + "/simOutput/openPMD/");
     13 ts_densityQ = OpenPMDTimeSeries(DIR_densityQ  + "/simOutput/openPMD/");
File ~/anaconda3/lib/python3.10/site-packages/openpmd_viewer/openpmd_timeseries/main.py:73, in OpenPMDTimeSeries.__init__(self, path_to_dir, check_all_files, backend)
     70 self.data_reader = DataReader(backend)
     72 # Extract the iterations available in this timeseries
---> 73 self.iterations = self.data_reader.list_iterations(path_to_dir)
     75 # Check that there are files in this directory
     76 if len(self.iterations) == 0:
File ~/anaconda3/lib/python3.10/site-packages/openpmd_viewer/openpmd_timeseries/data_reader/data_reader.py:116, in DataReader.list_iterations(self, path_to_dir)
    113         file_path = re.sub(r'(\d+)(\.(?!\d).+$)', r'%T\2', first_file_name)
    114         series_name = os.path.join( path_to_dir, file_path)
--> 116     self.series = io.Series(
    117         series_name,
    118         io.Access.read_only )
    119     iterations = np.array( self.series.iterations )
    121 return iterations

RuntimeError: FATAL CODING ERROR: ADIOS Index file /media/quasar/RawDataDisk2/OUT_LASER-EFFDENS-CNT-3D/LASER-EFFDENS-CNT-3D_XYZ[um]_5.0_20.0_5.0_WL[nm]_300_I[Wcm-2]_1.0e+21_A0_8.1_Dt[fs]_3.00_w0[um]_0.6_f0[um]_2.5_ed_CONST_01_a_fieldE/simOutput/openPMD/simData_000000.bp is assumed to always contain n*64 byte-length records. The file size now is 162 bytes.

Here is the input model: input.zip

cbontoiu commented 1 year ago

I can add that for post-processing I just installed the new release of openPMD-viewer, and this is my configuration now:

openpmd-api               0.15.1                   pypi_0    pypi
openpmd-viewer            1.7.0                    pypi_0    pypi

while PIConGPU runs with the latest dev version of openPMD-api from here: https://github.com/openPMD/openPMD-api

I am not sure if the issue happens when producing the data or when reading it. My precision is set to 32 bit in precision.param.

PrometheusPi commented 1 year ago

@cbontoiu thanks for the detailed report. Before investigating all your input files, could you please provide:

cbontoiu commented 1 year ago

Hi @PrometheusPi thank you for your reply. It seems all files are affected. Please find this one attached. It should contain the 3D electric field only at time = 0 and it corresponds to the error message shown above.

https://drive.google.com/drive/folders/1aF_vLrcfsfzkg7M8E0Ku1Xmnuxcre8_v?usp=share_link

For me the command returns:

bpls data.0 
Failed to open with BPFile engine: [Fri May  5 09:49:08 2023] [ADIOS2 EXCEPTION] <Toolkit> <format::bp::BP3Deserializer> <ParseMinifooter> : ADIOS2 only supports bp format version 3 and above, found 0 version

Failed to open with HDF5 engine: [Fri May  5 09:49:08 2023] [ADIOS2 EXCEPTION] <Engine> <HDF5ReaderP> <HDF5ReaderP> : Invalid HDF5 file found

psychocoderHPC commented 1 year ago

@franzpoeschel Could you help here, please?

franzpoeschel commented 1 year ago

To me this looks like you are trying to read the wrong path. By default, PIConGPU writes one ADIOS dataset per output time step, e.g. simData_000000.bp for time step 0. This is what you should use with bpls:

> bpls simData_000000.bp
  float     /data/0/fields/E/x                          {200, 800, 200}
  float     /data/0/fields/E/y                          {200, 800, 200}
  float     /data/0/fields/E/z                          {200, 800, 200}
  uint64_t  /data/0/fields/picongpu_idProvider/nextId   {1, 2, 1}
  uint64_t  /data/0/fields/picongpu_idProvider/startId  {1, 2, 1}

In your log above, it looks like you are pointing the openPMD Series only to the folder containing those files:

---> 11 ts_fieldE    = OpenPMDTimeSeries(DIR_fieldE    + "/simOutput/openPMD/");  

In order to properly refer to the Series of ADIOS files, you would need to specify the file name pattern:

Series(DIR_fieldE + "/simOutput/openPMD/simData_%T.bp", Access::READ_ONLY)

(This behavior is different from the openPMD-viewer which has a different workflow because it was first written before the openPMD-api was in a usable state)
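
As a sketch of how such a file name pattern can be derived from one of the files in the folder, the snippet below reuses the regular expression that is visible in the openPMD-viewer traceback above (data_reader.py, line 113). The function name series_pattern is made up for illustration and is not part of either library.

```python
import re


def series_pattern(first_file_name):
    """Replace the iteration number in a file name with the '%T' placeholder.

    Hypothetical helper; the substitution itself is copied from the
    openPMD-viewer traceback quoted earlier in this thread.
    """
    return re.sub(r'(\d+)(\.(?!\d).+$)', r'%T\2', first_file_name)


pattern = series_pattern("simData_000000.bp")
print(pattern)  # simData_%T.bp
```

The resulting pattern is appended to the output folder and passed to the openPMD-api Series together with read-only access, in the same way as io.Series(series_name, io.Access.read_only) in the traceback.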

PrometheusPi commented 1 year ago

Thanks for uploading an example file @cbontoiu. As the error message already (partially) says, the file md.idx does not fulfill the ADIOS2 standard of being a multiple of 64 bytes. I would say something went wrong after writing the version (my uneducated impression: the byte code after the version string looks different).
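
The size rule from the error message can be checked by hand. The following is a minimal sketch that only reproduces the "multiple of 64 bytes" test; it does not explain why a file has the size it has. The helper name index_size_report is hypothetical, and the demonstration uses a throwaway 162-byte file rather than real simulation output.

```python
import os
import tempfile

RECORD_SIZE = 64  # record length quoted in the ADIOS2 error message


def index_size_report(index_path, record_size=RECORD_SIZE):
    """Return (size_in_bytes, leftover_bytes) for an index file."""
    size = os.path.getsize(index_path)
    return size, size % record_size


# Demonstration with a dummy file of 162 bytes (the size from the error)
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "md.idx")
    with open(demo_path, "wb") as handle:
        handle.write(b"\0" * 162)
    size, leftover = index_size_report(demo_path)

print(size, leftover)  # 162 34
```

A leftover of 0 would mean the file consists of whole 64-byte records; here 34 bytes remain after two full records.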

We did not use ADIOS2.9 a lot (at all?) so far. Do you get the same problem when using an older ADIOS? (like 2.7.1)

cbontoiu commented 1 year ago

I get Segmentation fault as in the image attached. Observations:

image

franzpoeschel commented 1 year ago

> We did not use ADIOS2.9 a lot (at all?) so far. Do you get the same problem when using an older ADIOS? (like 2.7.1)

Good catch, the uploaded file is a BP5 file, which implies to me that @cbontoiu used ADIOS2 v2.9 for writing, but an older version of openPMD-api (otherwise, the file extension would be either .bp4 or .bp5).

For reading a BP5 file, you need an ADIOS2 version >= v2.9.0 (you can check with bpls --version), and for proper support of BP5 you need at least openPMD-api 0.15 (the latest subreleases of openPMD-api 0.14 have preliminary support for BP5).
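
The version requirement can be written down as a small comparison. This is only a sketch: the function names are hypothetical, and it expects a bare version string such as 2.8.3, so extracting that string from the bpls --version output is left to the reader.

```python
def parse_version(text, width=3):
    """Turn '2.8.3', 'v2.9.0' or '2.9' into a tuple of integers like (2, 9, 0)."""
    parts = [int(part) for part in text.strip().lstrip("v").split(".")]
    parts += [0] * (width - len(parts))  # pad '2.9' to (2, 9, 0)
    return tuple(parts)


def can_read_bp5(adios2_version, minimum="2.9.0"):
    """True if the given ADIOS2 version is at least the stated minimum."""
    return parse_version(adios2_version) >= parse_version(minimum)


print(can_read_bp5("2.8.3"))  # False
print(can_read_bp5("2.9.0"))  # True
```

Comparing integer tuples instead of strings keeps a version like 2.10.0 ordered after 2.9.0.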

Unfortunately, ADIOS2 v2.9 writes BP5 files by default, so it seems that you were upgraded to a newer version without intending to. Two solutions:

  1. Upgrade ADIOS2 to v2.9 and openPMD-api to 0.15 and keep using BP5. BP5 is generally easier to use and has a number of optimizations which make the switch worth it.
  2. If you want to keep using BP4, specify --openPMD.json '{"adios2":{"engine":{"type": "bp4"}}}' at write time.
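
For option 2, the JSON text has to reach the program as a single, correctly quoted argument. A minimal sketch of building it programmatically, assuming nothing beyond the option string shown above (the variable names are illustrative):

```python
import json
import shlex

# Engine selection from option 2, written as a Python dictionary
bp4_config = {"adios2": {"engine": {"type": "bp4"}}}

# Compact JSON text, shell-quoted so that it survives as one argument
json_text = json.dumps(bp4_config, separators=(",", ":"))
cli_argument = "--openPMD.json " + shlex.quote(json_text)

print(cli_argument)
# --openPMD.json '{"adios2":{"engine":{"type":"bp4"}}}'
```

Generating the string this way avoids hand-typed brace or quote mismatches when the configuration grows.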

cbontoiu commented 1 year ago

Thank you for your help. For me, bpls --version returns 2.8.3, so I will go with option 1.

I don't know how to check the version of openPMD-api, but I installed the dev from here https://github.com/openPMD/openPMD-api so it should be the latest available.
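
One way to look up the version of the Python packages is the standard library's importlib.metadata. The sketch below uses the distribution names from the package listing earlier in this thread; the helper name installed_version is made up. Note that it only reports what the current Python interpreter sees, which may differ from the C++ openPMD-api library that PIConGPU is linked against.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string, or None if the package is unknown."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


for name in ("openpmd-api", "openpmd-viewer"):
    print(name, installed_version(name))
```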

cbontoiu commented 1 year ago

I now have ADIOS2 2.9 and openPMD-api 0.15. Here is my new error, which is triggered as the compilation completes and runtime starts:

   rm -r .build/ && pic-build &> build_log.txt && tbg -s bash -c etc/picongpu/runConfiguration.cfg -t etc/picongpu/bash/mpiexec.tpl /media/...
Running program...
tbg/submit.start: line 35: /home/quasar/picongpu.profile: No such file or directory
/media/.../input/bin/picongpu: error while loading shared libraries: libadios2_cxx11_mpi.so.2.9: cannot open shared object file: No such file or directory
mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:  Process name: [[31653,1],0]  Exit code:    127

cbontoiu commented 1 year ago

And here is the log file:

build directory: .build
cmake command: cmake  -DCMAKE_INSTALL_PREFIX=/home/quasar/LASER3DEFFDENSCNT/input -DPIC_EXTENSION_PATH=/home/quasar/LASER3DEFFDENSCNT/input   -Dalpaka_ACC_GPU_CUDA_ENABLE=ON -Dalpaka_ACC_GPU_CUDA_ONLY_MODE=ON -Dalpaka_CUDA_EXPT_EXTENDED_LAMBDA=ON -DCMAKE_CUDA_ARCHITECTURES="75" /home/quasar/src/picongpu-dev/include/picongpu
-- The C compiler identification is GNU 11.1.0
-- The CXX compiler identification is GNU 11.1.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for a CUDA compiler
-- Looking for a CUDA compiler - /usr/local/cuda-11.8/bin/nvcc
-- The CUDA compiler identification is NVIDIA 11.8.89
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda-11.8/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - yes
-- Found Threads: TRUE  
-- Found Boost: /home/quasar/lib/boost/lib/cmake/Boost-1.74.0/BoostConfig.cmake (found suitable version "1.74.0", minimum required is "1.74.0") found components: atomic 
-- C++20 math constants not found. Falling back to non-standard constants.
-- Found CUDAToolkit: /usr/local/cuda-11.8/include (found version "11.8.89") 
-- nvcc is used as CUDA compiler
-- alpaka_ACC_GPU_CUDA_ONLY_MODE
-- alpaka_ACC_GPU_CUDA_ENABLED

List of compiler flags added by alpaka
device compiler:
    $<$<COMPILE_LANGUAGE:CUDA>:--extended-lambda>;$<$<COMPILE_LANGUAGE:CUDA>:--expt-relaxed-constexpr>;$<$<COMPILE_LANGUAGE:CUDA>:-Xcudafe=--display_error_number>;$<$<COMPILE_LANGUAGE:CUDA>:-Xcudafe=--diag_suppress=esa_on_defaulted_function_ignored>

-- Looking for std::filesystem::path::preferred_separator
-- Looking for std::filesystem::path::preferred_separator - found
-- Found MPI_C: /usr/local/openmpi/lib/libmpi.so (found version "3.1") 
-- Found MPI_CXX: /usr/local/openmpi/lib/libmpi.so (found version "3.1") 
-- Found MPI: TRUE (found version "3.1")  
-- Found Boost: /home/quasar/lib/boost/lib/cmake/Boost-1.74.0/BoostConfig.cmake (found suitable version "1.74.0", minimum required is "1.74") found components: program_options 
-- Boost: deactivate std::auto_ptr
-- Found OpenMP_C: -fopenmp (found version "4.5") 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- Found Boost: /home/quasar/lib/boost/lib/cmake/Boost-1.74.0/BoostConfig.cmake (found suitable version "1.74.0", minimum required is "1.65.1")  
-- Using mallocMC from thirdParty/ directory
-- Found mallocMC: /home/quasar/src/picongpu-dev/thirdParty/mallocMC/src (found suitable version "2.6.0", minimum required is "2.6.0")  
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Boost: /home/quasar/lib/boost/lib/cmake/Boost-1.74.0/BoostConfig.cmake (found suitable version "1.74.0", minimum required is "1.74.0") found components: program_options 
-- Could NOT find NVML (missing: NVML_INCLUDE_DIR) 
-- nvml found
-- Found Boost: /home/quasar/lib/boost/lib/cmake/Boost-1.74.0/BoostConfig.cmake (found suitable version "1.74.0", minimum required is "1.66.0") found components: program_options 
-- Found ADIOS2: /usr/local/adios2/lib/cmake/adios2/adios2-config.cmake (found version "2.9.0") found components: C CXX MPI 
-- Found openPMD: /usr/local/lib/cmake/openPMD
-- Using the single-header code from /home/quasar/src/picongpu-dev/thirdParty/nlohmann_json/single_include/
-- Implicit conversions are disabled
-- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.11")  
-- Found PNG: /usr/lib/x86_64-linux-gnu/libpng.so (found version "1.6.37") 
-- Found PNGwriter: /home/quasar/lib/pngwriter/lib/cmake/PNGwriter
-- Could NOT find ISAAC - set ISAAC_DIR or check your CMAKE_PREFIX_PATH

Optional Dependencies:
  openPMD: ON
  PNGwriter: ON
  ISAAC: OFF

-- Configuring done (4.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/quasar/LASER3DEFFDENSCNT/input/.build
[  4%] Building CUDA object cupla/CMakeFiles/cupla.dir/src/manager/Driver.cpp.o
[  4%] Building CUDA object cupla/CMakeFiles/cupla.dir/src/common.cpp.o
[  8%] Building CXX object CMakeFiles/picongpu-hostonly.dir/ArgsParser.cpp.o
[  8%] Building CUDA object cupla/CMakeFiles/cupla.dir/src/device.cpp.o
[ 10%] Building CUDA object build_cuda_memtest/CMakeFiles/cuda_memtest.dir/tests.cpp.o
[ 12%] Building CUDA object cupla/CMakeFiles/cupla.dir/src/event.cpp.o
[ 17%] Building CXX object CMakeFiles/picongpu-hostonly.dir/initialization/ParserGridDistribution.cpp.o
[ 17%] Building CUDA object build_cuda_memtest/CMakeFiles/cuda_memtest.dir/misc.cpp.o
[ 23%] Building CUDA object cupla/CMakeFiles/cupla.dir/src/stream.cpp.o
[ 23%] Building CXX object CMakeFiles/picongpu-hostonly.dir/plugins/common/MPIHelpers.cpp.o
[ 25%] Building CUDA object cupla/CMakeFiles/cupla.dir/src/memory.cpp.o
[ 25%] Building CUDA object build_cuda_memtest/CMakeFiles/cuda_memtest.dir/cuda_memtest.cpp.o
[ 27%] Building CXX object CMakeFiles/picongpu-hostonly.dir/plugins/common/stringHelpers.cpp.o
[ 29%] Building CXX object CMakeFiles/picongpu-hostonly.dir/plugins/misc/ComponentNames.cpp.o
[ 31%] Building CXX object build_mpiInfo/CMakeFiles/mpiInfo.dir/mpiInfo.cpp.o
[ 34%] Building CXX object CMakeFiles/picongpu-hostonly.dir/plugins/misc/removeSpaces.cpp.o
[ 36%] Building CXX object CMakeFiles/picongpu-hostonly.dir/plugins/misc/splitString.cpp.o
[ 38%] Building CXX object CMakeFiles/picongpu-hostonly.dir/plugins/openPMD/Json.cpp.o
[ 40%] Building CXX object CMakeFiles/picongpu-hostonly.dir/plugins/openPMD/openPMDWriter.cpp.o
[ 42%] Building CXX object CMakeFiles/picongpu-hostonly.dir/plugins/openPMD/toml.cpp.o
[ 44%] Building CXX object CMakeFiles/picongpu-hostonly.dir/random/seed/Seed.cpp.o
[ 46%] Linking CXX executable mpiInfo
[ 46%] Built target mpiInfo
[ 48%] Linking CUDA executable cuda_memtest
[ 48%] Built target cuda_memtest
[ 51%] Linking CXX static library libcupla.a
[ 51%] Built target cupla
[ 59%] Building CUDA object CMakeFiles/pmacc.dir/home/quasar/src/picongpu-dev/include/pmacc/eventSystem/eventSystem.cpp.o
[ 59%] Building CUDA object CMakeFiles/pmacc.dir/home/quasar/src/picongpu-dev/include/pmacc/dataManagement/DataConnector.cpp.o
[ 59%] Building CUDA object CMakeFiles/pmacc.dir/home/quasar/src/picongpu-dev/include/pmacc/eventSystem/Manager.cpp.o
[ 59%] Building CUDA object CMakeFiles/pmacc.dir/home/quasar/src/picongpu-dev/include/pmacc/communication/CommunicatorMPI.cpp.o
[ 61%] Building CUDA object CMakeFiles/pmacc.dir/home/quasar/src/picongpu-dev/include/pmacc/eventSystem/events/CudaEvent.cpp.o
[ 63%] Building CUDA object CMakeFiles/pmacc.dir/home/quasar/src/picongpu-dev/include/pmacc/eventSystem/events/CudaEventHandle.cpp.o
[ 65%] Building CUDA object CMakeFiles/pmacc.dir/home/quasar/src/picongpu-dev/include/pmacc/eventSystem/events/EventNotify.cpp.o
[ 68%] Building CUDA object CMakeFiles/pmacc.dir/home/quasar/src/picongpu-dev/include/pmacc/eventSystem/events/EventTask.cpp.o
[ 70%] Building CUDA object CMakeFiles/pmacc.dir/home/quasar/src/picongpu-dev/include/pmacc/eventSystem/streams/EventStream.cpp.o
[ 72%] Building CUDA object CMakeFiles/pmacc.dir/home/quasar/src/picongpu-dev/include/pmacc/eventSystem/tasks/StreamTask.cpp.o
[ 74%] Building CUDA object CMakeFiles/pmacc.dir/home/quasar/src/picongpu-dev/include/pmacc/eventSystem/tasks/TaskKernel.cpp.o
[ 76%] Building CUDA object CMakeFiles/pmacc.dir/home/quasar/src/picongpu-dev/include/pmacc/eventSystem/transactions/Transaction.cpp.o
[ 78%] Building CUDA object CMakeFiles/pmacc.dir/home/quasar/src/picongpu-dev/include/pmacc/eventSystem/transactions/TransactionManager.cpp.o
[ 80%] Building CUDA object CMakeFiles/pmacc.dir/home/quasar/src/picongpu-dev/include/pmacc/mappings/simulation/Filesystem.cpp.o
[ 82%] Building CUDA object CMakeFiles/pmacc.dir/home/quasar/src/picongpu-dev/include/pmacc/misc/splitString.cpp.o
[ 85%] Building CUDA object CMakeFiles/pmacc.dir/home/quasar/src/picongpu-dev/include/pmacc/pluginSystem/PluginConnector.cpp.o
[ 87%] Building CUDA object CMakeFiles/pmacc.dir/home/quasar/src/picongpu-dev/include/pmacc/simulationControl/SimulationHelper.cpp.o
[ 89%] Building CUDA object CMakeFiles/pmacc.dir/home/quasar/src/picongpu-dev/include/pmacc/simulationControl/signal.cpp.o
[ 91%] Linking CXX static library libpicongpu-hostonly.a
[ 91%] Built target picongpu-hostonly
[ 93%] Linking CXX static library libpmacc.a
[ 93%] Built target pmacc
[ 95%] Building CUDA object CMakeFiles/picongpu.dir/main.cpp.o
[ 97%] Building CUDA object CMakeFiles/picongpu.dir/versionFormat.cpp.o
/home/quasar/src/picongpu-dev/include/picongpu/../picongpu/plugins/openPMD/openPMDWriter.hpp: In member function ‘void picongpu::openPMD::openPMDWriter::writeRngStates(picongpu::openPMD::ThreadParams*)’:
/home/quasar/src/picongpu-dev/include/picongpu/../picongpu/plugins/openPMD/openPMDWriter.hpp:743:35: warning: ‘std::shared_ptr<_Tp> openPMD::shareRaw(T*) [with T = char]’ is deprecated: For storing/loading data via raw pointers use storeChunkRaw<>()/loadChunkRaw<>() [-Wdeprecated-declarations]
  743 |                 mrc.storeChunk(
      |                 ~~~~~~~~~~~~~~~   ^       
/usr/local/include/openPMD/auxiliary/ShareRaw.hpp:49:1: note: declared here
   49 | shareRaw(T *x)
      | ^~~~~~~~
/home/quasar/src/picongpu-dev/include/picongpu/../picongpu/plugins/openPMD/openPMDWriter.hpp: In member function ‘void picongpu::openPMD::openPMDWriter::loadRngStatesImpl(picongpu::openPMD::ThreadParams*)’:
/home/quasar/src/picongpu-dev/include/picongpu/../picongpu/plugins/openPMD/openPMDWriter.hpp:792:34: warning: ‘std::shared_ptr<_Tp> openPMD::shareRaw(T*) [with T = char]’ is deprecated: For storing/loading data via raw pointers use storeChunkRaw<>()/loadChunkRaw<>() [-Wdeprecated-declarations]
  792 |                 mrc.loadChunk(
      |                 ~~~~~~~~~~~~~~   ^       
/usr/local/include/openPMD/auxiliary/ShareRaw.hpp:49:1: note: declared here
   49 | shareRaw(T *x)
      | ^~~~~~~~
[100%] Linking CXX executable picongpu
/usr/bin/ld: warning: libadios2_cxx11_mpi.so.2.9, needed by /usr/local/lib/libopenPMD.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libadios2_cxx11.so.2.9, needed by /usr/local/lib/libopenPMD.so, not found (try using -rpath or -rpath-link)
[100%] Built target picongpu
Install the project...
-- Install configuration: "Release"
-- Installing: /home/quasar/LASER3DEFFDENSCNT/input/bin/cuda_memtest
-- Installing: /home/quasar/LASER3DEFFDENSCNT/input/bin/mpiInfo
-- Set runtime path of "/home/quasar/LASER3DEFFDENSCNT/input/bin/mpiInfo" to ""
-- Installing: /home/quasar/LASER3DEFFDENSCNT/input/bin/picongpu
-- Set runtime path of "/home/quasar/LASER3DEFFDENSCNT/input/bin/picongpu" to "$ORIGIN:/usr/local/openmpi/lib:/usr/local/lib:/usr/local/adios2/lib:/usr/local/cuda-11.8/targets/x86_64-linux/lib:/home/quasar/lib/boost/lib"
-- Up-to-date: /home/quasar/LASER3DEFFDENSCNT/input/bin
-- Up-to-date: /home/quasar/LASER3DEFFDENSCNT/input/bin/egetopt
-- Up-to-date: /home/quasar/LASER3DEFFDENSCNT/input/bin/pic-build
-- Up-to-date: /home/quasar/LASER3DEFFDENSCNT/input/bin/cuda_memtest.sh
-- Up-to-date: /home/quasar/LASER3DEFFDENSCNT/input/bin/picongpu-completion.bash
-- Up-to-date: /home/quasar/LASER3DEFFDENSCNT/input/bin/pic-create
-- Up-to-date: /home/quasar/LASER3DEFFDENSCNT/input/bin/pic-edit
-- Up-to-date: /home/quasar/LASER3DEFFDENSCNT/input/bin/tbg
-- Up-to-date: /home/quasar/LASER3DEFFDENSCNT/input/bin/pic-compile
-- Up-to-date: /home/quasar/LASER3DEFFDENSCNT/input/bin/pic-configure
/home/quasar/LASER3DEFFDENSCNT/input

franzpoeschel commented 1 year ago

> -- Found ADIOS2: /usr/local/adios2/lib/cmake/adios2/adios2-config.cmake (found version "2.9.0") found components: C CXX MPI 

It seems that you installed ADIOS2 to a custom directory /usr/local/adios2 instead of the default /usr/local and the linker does not consider that path. For a quick fix, you should be able to do export LD_LIBRARY_PATH=/usr/local/adios2/lib:$LD_LIBRARY_PATH (maybe also lib64). For a more permanent fix, I'd recommend installing ADIOS2 to the same directory that you also installed openPMD-api to.

cbontoiu commented 1 year ago

@franzpoeschel thank you. My bashrc file contains

#  adios 2--------------------------------------------------
export PATH="/usr/local/adios2/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/adios2/lib:$LD_LIBRARY_PATH"

I noticed a possible mismatch of names. PIConGPU looks for libadios2_cxx11_mpi.so.2.9 but I have libadios2_cxx11_mpi.so.2.9.0

image

franzpoeschel commented 1 year ago

libadios2_cxx11.so.2.9.0 and libadios2_cxx11_mpi.so.2.9.0 are two distinct shared libraries, both present in your lib folder. Apart from that, this is an issue of your local environment which is hard to narrow down remotely. The only things that I can recommend at this point are to

  1. explicitly set LD_LIBRARY_PATH before launching PIConGPU and inspect its value
  2. or install ADIOS2 to the normal install location
  3. or do a clean reinstall
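
To inspect which entries of LD_LIBRARY_PATH actually contain a given library file, something like the following sketch may help. The helper name find_library is hypothetical, and the check covers LD_LIBRARY_PATH only; the dynamic loader also looks in other places, for example the runtime path recorded in the binary.

```python
import os


def find_library(file_name, search_path):
    """Return the directories of a colon-separated path that contain file_name."""
    hits = []
    for directory in search_path.split(":"):
        # Skip empty entries; os.path.exists is False for dangling symlinks
        if directory and os.path.exists(os.path.join(directory, file_name)):
            hits.append(directory)
    return hits


wanted = "libadios2_cxx11_mpi.so.2.9"  # name taken from the loader error above
print(find_library(wanted, os.environ.get("LD_LIBRARY_PATH", "")))
```

An empty list means that no directory in the variable provides a file with exactly that name.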

cbontoiu commented 1 year ago

The first option didn't help.

image

cbontoiu commented 1 year ago

$LD_LIBRARY_PATH

bash: /usr/local/adios2/lib:/usr/local/adios2/lib:/home/quasar/lib/pngwriter//lib:/usr/local/adios2/lib:/usr/local/sz/lib:/home/quasar/lib/openPMD-api-0.15.1/lib:/home/quasar/lib/boost/lib::/usr/local/c-blosc2/lib:/usr/local/cuda-11.8/lib64:/usr/local/openmpi/lib: No such file or directory
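
Typing $LD_LIBRARY_PATH on its own makes bash try to execute the value as a command, which is where the "No such file or directory" suffix comes from; echo "$LD_LIBRARY_PATH" prints it instead. For a more readable view, the entries can be split and de-duplicated, as in this sketch (the function name path_entries is made up, and the example string is a shortened version of the value above):

```python
def path_entries(search_path):
    """Split a colon-separated path, dropping empty and repeated entries."""
    seen = []
    for entry in search_path.split(":"):
        if entry and entry not in seen:
            seen.append(entry)
    return seen


example = ("/usr/local/adios2/lib:/usr/local/adios2/lib:"
           "/home/quasar/lib/boost/lib::/usr/local/openmpi/lib")
for entry in path_entries(example):
    print(entry)
```

This prints the three distinct directories, one per line, in their original order.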

cbontoiu commented 1 year ago

@franzpoeschel and @PrometheusPi I reinstalled ADIOS2 and openPMD-api in the same folder, as shown in the attached image. The error is the same. Could you please check whether picongpu-dev is looking for the correct ADIOS2 library name, given the similarity between 2.9.0 and 2.9 shown above? Thank you.

image

franzpoeschel commented 1 year ago

I regularly use PIConGPU with openPMD-api 0.15 and ADIOS2 v2.9 without any issues:

$ ldd `which picongpu`                                                                                                                                                                                                                                                             
        linux-vdso.so.1 (0x00007ffcc0d71000)                                                                                                                                                                                                                                                                                  
        libmpi.so.40 => /nix/store/zidndx02ksdqv2szkwgxymb42s5gimfj-openmpi-4.1.4/lib/libmpi.so.40 (0x00007fcc8274a000)                                                                                                                                                                                                       
        libm.so.6 => /nix/store/9xfad3b5z4y00mzmk2wnn4900q0qmxns-glibc-2.35-224/lib/libm.so.6 (0x00007fcc8266a000)                                                                                                                                                                                                            
        libboost_serialization.so.1.79.0 => /nix/store/havsqnin5jb5mbkaqw04azl7yqv8mx8y-boost-1.79.0/lib/libboost_serialization.so.1.79.0 (0x00007fcc82624000)                                                                                                                                                                
        libopenPMD.so => /nix/store/54b8irxgcfbfh7zsh2spsxv6g8bylr37-python3.10-openPMD-api-0.15.1/lib/libopenPMD.so (0x00007fcc820d3000)                                                                                                                                                                                     
        libadios2_cxx11_mpi.so.2 => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_cxx11_mpi.so.2 (0x00007fcc820ca000)                                                                                                                                                                               
        libadios2_cxx11.so.2 => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_cxx11.so.2 (0x00007fcc81f3c000)                                                                                                                                                                                       
        librt.so.1 => /nix/store/9xfad3b5z4y00mzmk2wnn4900q0qmxns-glibc-2.35-224/lib/librt.so.1 (0x00007fcc81f37000)                                                                                                                                                                                                          
        libcudart.so.11.0 => /nix/store/cfwcn5kvvcg2j13hvf9cv7siwvkjgvni-cudatoolkit-11.7.0-lib/lib/libcudart.so.11.0 (0x00007fcc81c00000)                                                                                                                                                                                    
        libboost_program_options.so.1.79.0 => /nix/store/havsqnin5jb5mbkaqw04azl7yqv8mx8y-boost-1.79.0/lib/libboost_program_options.so.1.79.0 (0x00007fcc81ec8000)                                                                                                                                                            
        libboost_filesystem.so.1.79.0 => /nix/store/havsqnin5jb5mbkaqw04azl7yqv8mx8y-boost-1.79.0/lib/libboost_filesystem.so.1.79.0 (0x00007fcc81bdb000)                                                                                                                                                                      
        libboost_atomic.so.1.79.0 => /nix/store/havsqnin5jb5mbkaqw04azl7yqv8mx8y-boost-1.79.0/lib/libboost_atomic.so.1.79.0 (0x00007fcc81ebe000)                                                                                                                                                                              
        libboost_system.so.1.79.0 => /nix/store/havsqnin5jb5mbkaqw04azl7yqv8mx8y-boost-1.79.0/lib/libboost_system.so.1.79.0 (0x00007fcc81eb9000)                                                                                                                                                                              
        libboost_math_tr1.so.1.79.0 => /nix/store/havsqnin5jb5mbkaqw04azl7yqv8mx8y-boost-1.79.0/lib/libboost_math_tr1.so.1.79.0 (0x00007fcc81b7b000)                                                                                                                                                                          
        libpthread.so.0 => /nix/store/9xfad3b5z4y00mzmk2wnn4900q0qmxns-glibc-2.35-224/lib/libpthread.so.0 (0x00007fcc81eb2000)                                                                                                                                                                                                
        libdl.so.2 => /nix/store/9xfad3b5z4y00mzmk2wnn4900q0qmxns-glibc-2.35-224/lib/libdl.so.2 (0x00007fcc81ead000)                                                                                                                                                                                                          
        libstdc++.so.6 => /nix/store/b13h86pg7lbf6vpc1vwzw6akmakyw1bs-gcc-11.3.0-lib/lib/libstdc++.so.6 (0x00007fcc81965000)                                                                                                                                                                                                  
        libgomp.so.1 => /nix/store/b13h86pg7lbf6vpc1vwzw6akmakyw1bs-gcc-11.3.0-lib/lib/libgomp.so.1 (0x00007fcc81924000)                                                                                                                                                                                                      
        libgcc_s.so.1 => /nix/store/9xfad3b5z4y00mzmk2wnn4900q0qmxns-glibc-2.35-224/lib/libgcc_s.so.1 (0x00007fcc8190a000)                                                                                                                                                                                                    
        libc.so.6 => /nix/store/9xfad3b5z4y00mzmk2wnn4900q0qmxns-glibc-2.35-224/lib/libc.so.6 (0x00007fcc81701000)                                                                                                                                                                                                            
        /nix/store/9xfad3b5z4y00mzmk2wnn4900q0qmxns-glibc-2.35-224/lib/ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2 (0x00007fcc82a7f000)                                                                                                                                                                               
        libpsm2.so.2 => /nix/store/9hj5fhj0fpfxcsiyyh36c1jz2bh6ab2p-libpsm2-11.2.229/lib/libpsm2.so.2 (0x00007fcc81697000)                                                                                                                                                                                                    
        libopen-rte.so.40 => /nix/store/zidndx02ksdqv2szkwgxymb42s5gimfj-openmpi-4.1.4/lib/libopen-rte.so.40 (0x00007fcc8156f000)                                                                                                                                                                                             
        libopen-pal.so.40 => /nix/store/zidndx02ksdqv2szkwgxymb42s5gimfj-openmpi-4.1.4/lib/libopen-pal.so.40 (0x00007fcc8142b000)                                                                                                                                                                                             
        libucp.so.0 => /nix/store/mzfrxasizd3i38w02sa6i7xd8gd5r2i4-ucx-1.13.1/lib/libucp.so.0 (0x00007fcc812ec000)                                                                                                                                                                                                            
        libuct.so.0 => /nix/store/mzfrxasizd3i38w02sa6i7xd8gd5r2i4-ucx-1.13.1/lib/libuct.so.0 (0x00007fcc812a3000)                                                                                                                                                                                                            
        libucm.so.0 => /nix/store/mzfrxasizd3i38w02sa6i7xd8gd5r2i4-ucx-1.13.1/lib/libucm.so.0 (0x00007fcc81285000)                                                                                                                                                                                                            
        libucs.so.0 => /nix/store/mzfrxasizd3i38w02sa6i7xd8gd5r2i4-ucx-1.13.1/lib/libucs.so.0 (0x00007fcc81218000)                                                                                                                                                                                                            
        libfabric.so.1 => /nix/store/jv6kda0z8m9kw5kvs8inhdgxwasp431f-libfabric-1.15.1/lib/libfabric.so.1 (0x00007fcc81106000)                                                                                                                                                                                                
        librdmacm.so.1 => /nix/store/bl6qfz0vqf4l9zd3hx0y29v7rvym6b8p-rdma-core-43.0/lib/librdmacm.so.1 (0x00007fcc810e6000)                                                                                                                                                                                                  
        libibverbs.so.1 => /nix/store/bl6qfz0vqf4l9zd3hx0y29v7rvym6b8p-rdma-core-43.0/lib/libibverbs.so.1 (0x00007fcc810c3000)                                                                                                                                                                                                
        libpmix.so.2 => /nix/store/f80qm7xlg6q4rh9hd35rxll6vhxk3qvb-pmix-3.2.3/lib/libpmix.so.2 (0x00007fcc80fcf000)                                                                                                                                                                                                          
        libnl-3.so.200 => /nix/store/i5k5d396psw59zvgmy9r6qzmsckgz2vh-libnl-3.7.0/lib/libnl-3.so.200 (0x00007fcc80fab000)                                                                                                                                                                                                     
        libnl-route-3.so.200 => /nix/store/i5k5d396psw59zvgmy9r6qzmsckgz2vh-libnl-3.7.0/lib/libnl-route-3.so.200 (0x00007fcc80f18000)                                                                                                                                                                                         
        libz.so.1 => /nix/store/fblaj5ywkgphzpp5kx41av32kls9256y-zlib-1.2.13/lib/libz.so.1 (0x00007fcc80efa000)                                                                                                                                                                                                               
        libhwloc.so.15 => /nix/store/jwbh8kj703ns9p7cdcsxg2kl1ggaw7va-hwloc-2.8.0-lib/lib/libhwloc.so.15 (0x00007fcc80e9a000)                                  
        libevent_core-2.1.so.7 => /nix/store/icmm0jx9al1dhr60fh4mmvi5sqxl6wh9-libevent-2.1.12/lib/libevent_core-2.1.so.7 (0x00007fcc80e63000)
        libevent_pthreads-2.1.so.7 => /nix/store/icmm0jx9al1dhr60fh4mmvi5sqxl6wh9-libevent-2.1.12/lib/libevent_pthreads-2.1.so.7 (0x00007fcc80e5e000)
        libhdf5.so.100.1.0 => /nix/store/skqp7rnc98qyslxg8231s8yhg4p8483w-hdf5-1.10.1/lib/libhdf5.so.100.1.0 (0x00007fcc80a87000)
        libadios2_core_mpi.so.2 => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_core_mpi.so.2 (0x00007fcc80a05000)
        libadios2_core.so.2 => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_core.so.2 (0x00007fcc80254000)
        libnuma.so.1 => /nix/store/94kqdwqz1qdlcv5y07hsrs0z1a5dgqpd-numactl-2.0.16/lib/libnuma.so.1 (0x00007fcc80246000)                                       
        libbfd-2.39.so => /nix/store/7c8vx9wngib658cfx5pnnfi370a37ppm-libbfd-2.39/lib/libbfd-2.39.so (0x00007fcc7fb69000)                                      
        libatomic.so.1 => /nix/store/b13h86pg7lbf6vpc1vwzw6akmakyw1bs-gcc-11.3.0-lib/lib/libatomic.so.1 (0x00007fcc7fb5e000)                                   
        libadios2_perfstubs.so => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_perfstubs.so (0x00007fcc7fb58000)
        libblosc2.so.2 => /nix/store/nagq9kg0b6m2yrxn30v15pz5sa44w3f1-blosc2-v2.4.3/lib/libblosc2.so.2 (0x00007fcc7f9b1000)                                    
        libbz2.so.1 => /nix/store/61rpfcaxhyqfmnk5qp4z7hf20wh9zgrk-bzip2-1.0.8/lib/libbz2.so.1 (0x00007fcc7f99e000)                                            
        libadios2_evpath.so => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_evpath.so (0x00007fcc7f924000)
        libadios2_ffs.so.2 => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_ffs.so.2 (0x00007fcc7f8bb000)
        libadios2_atl.so.2 => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_atl.so.2 (0x00007fcc7f8aa000)
        libadios2_dill.so.2 => /nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib//../../../..//nix/store/4jpc9p41sca0l244bhq83icgjwyjd964-adios2-v2.9.0/lib/libadios2_dill.so.2 (0x00007fcc7f857000)

This is very likely an issue with your environment and not with PIConGPU. Please try a clean rebuild of PIConGPU (i.e. delete the build folder and build anew). You should have a look at the configuration output to verify that the correct installations of openPMD-api and ADIOS2 are picked up.

Can you execute openpmd-ls and bpls without issues? What are the outputs of openpmd-ls --version and bpls --version?

Also, the output of ldd on your PIConGPU binary (as I did above), on the full path of either openpmd-ls or libopenPMD.so and on the full path of bpls might help figuring out if there are broken packages somewhere on your system that PIConGPU tries linking to.
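
To make long ldd listings like the one above easier to scan, the text can be summarized with a few lines of Python. This is only a minimal sketch: the helper name `summarize_ldd` is hypothetical, it assumes the usual `name => path (address)` line layout printed by ldd, and it normalizes paths lexically, so symlinked directories are not resolved.

```python
import os
import re

# Matches "libz.so.1 => /usr/lib/libz.so.1 (0x...)" and "libfoo.so => not found".
LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(not found|\S+)")


def summarize_ldd(ldd_text):
    """Return (missing, adios2_dirs) extracted from text printed by ldd.

    missing     -- library names that ldd reported as "not found"
    adios2_dirs -- sorted, distinct directories that libadios2* resolve to
    """
    missing = []
    adios2_dirs = set()
    for line in ldd_text.splitlines():
        match = LDD_LINE.match(line)
        if match is None:
            continue  # lines without "=>", e.g. linux-vdso or the loader
        name, target = match.groups()
        if target == "not found":
            missing.append(name)
        elif name.startswith("libadios2"):
            # Collapse "lib//../../../.." detours before comparing directories.
            adios2_dirs.add(os.path.dirname(os.path.normpath(target)))
    return missing, sorted(adios2_dirs)
```

Feeding it the captured output of `ldd` for the PIConGPU binary, `openpmd-ls` and `bpls` gives one summary per binary. Any entry in `missing`, or more than one directory in `adios2_dirs`, would hint at a mixed or broken ADIOS2 installation.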

cbontoiu commented 1 year ago

@franzpoeschel thanks for the suggestion. Indeed this is a problem with my openPMD-api installation

openpmd-ls --version
openpmd-ls: error while loading shared libraries: libadios2_cxx11_mpi.so.2.9: cannot open shared object file: No such file or directory
bpls --version
2.9.0

I will try to reinstall it. Maybe it doesn't match with my ADIOS2 installation.

cbontoiu commented 1 year ago

@franzpoeschel I solved the issue by installing the dev version of openPMD-api. PIConGPU runs now. Thank you so much for your help. I will close this ticket.

quasar@ubuntu:~$ openpmd-ls --version
openpmd-ls (openPMD-api) 0.16.0-dev
Copyright 2017-2021 openPMD contributors
Authors: Axel Huebl et al.
License: LGPLv3+
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

cbontoiu commented 1 year ago

@PrometheusPi and @franzpoeschel

Unfortunately, although PIConGPU now runs, my data is not OK. What confuses me is that I installed ADIOS2 2.9.0 with BP5 support and openPMD-api 0.16, and yet:

bpls simData_000000.bp gives me Segmentation fault (core dumped), and when reading data through openPMD-viewer in JupyterLab with openPMD-api 0.15.1 I get, as before, FATAL CODING ERROR: ADIOS Index file simData_000000.bp is assumed to always contain n*64 byte-length records. The file size now is 162 bytes.

In addition, bpls --version returns 2.8.3. Any help is more than welcome. Thank you.

franzpoeschel commented 1 year ago

If you installed ADIOS2 v2.9, but bpls --version returns 2.8.3, then there is still something wrong with your environment. You have possibly installed ADIOS2 twice, and one or both of the installations may be broken.
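
One way to check for duplicate installations is to list every copy of a tool on the search path instead of only the first one the shell picks. The following is a minimal sketch using only the Python standard library; the helper name `find_all_on_path` is hypothetical.

```python
import os


def find_all_on_path(name, path=None):
    """List every executable called `name` on the search path, in lookup order.

    `path` defaults to the PATH environment variable of the current process.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            if candidate not in hits:
                hits.append(candidate)
    return hits
```

Calling `find_all_on_path("bpls")` and `find_all_on_path("openpmd-ls")` from the same environment that launches PIConGPU or JupyterLab shows whether several copies exist. The first entry is the one the shell would run, and running each listed copy with `--version` shows which ADIOS2 release it belongs to.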

There is an issue with BP5 and Blosc2 compression that creates unreadable files, but the error message would be a different one.

when reading data through the openPMDviewer in JupyterLab with openPMDapi-0.15.1 I get as before, FATAL CODING ERROR: ADIOS Index file simData_000000.bp is assumed to always contain n*64 byte-length records. The file size now is 162 bytes.

Unfortunately, I believe that the ADIOS2 version in the pip package is still v2.7, which does not support BP5 yet.

cbontoiu commented 1 year ago

@franzpoeschel Thank you for the clarifications. For me the openPMD-viewer is essential, so if for the time being I cannot use it to read BP5 data, I need to return to BP4. Which is the right combination? Will this work?

You mentioned: "If you want to keep using BP4, specify --openPMD.json '{"adios2":{"engine":{"type": "bp4"}}}' at write time." Where should I type this? In my .cfg file? At the terminal, after the tbg command?

Ideally I will reinstall the right combination to make the whole process (picongpu run + data processing through openPMDviewer) work again.

franzpoeschel commented 1 year ago

If you use openPMD-api 0.15, then you can select BP4 by specifying the filename extension .bp4 (--openPMD.ext bp4). There is no need to reinstall older versions, as BP4 remains available; it is just no longer the default since ADIOS2 v2.9. The JSON parameter above is only necessary if you use an old installation of openPMD-api (0.14) in combination with a new installation of ADIOS2 (v2.9), which you no longer do. (The easiest way to apply this configuration would be to write the JSON string to a file and pass --openPMD.json @path/to/file.json as a command line argument of PIConGPU, e.g. right next to --openPMD.period 100 or so.)
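
For the file-based variant, the JSON string quoted earlier in this thread can be written to disk with a short script. This is a minimal sketch: the helper name `write_openpmd_config` and any file name passed to it are placeholders, and only the JSON content and the `@path` form of `--openPMD.json` come from the comments above.

```python
import json


def write_openpmd_config(path, engine="bp4"):
    """Write the ADIOS2 engine selection to `path`.

    Returns the "@path" string to pass as the value of --openPMD.json.
    """
    # Same structure as the JSON string quoted in this thread.
    config = {"adios2": {"engine": {"type": engine}}}
    with open(path, "w") as handle:
        json.dump(config, handle, indent=2)
    return "@" + path
```

For example, `write_openpmd_config("adios2_bp4.json")` creates the file and returns `@adios2_bp4.json`, which would then go after `--openPMD.json` among the other `--openPMD.*` options.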

Note that I will be on vacation starting tomorrow, so someone else will need to continue here.

cbontoiu commented 1 year ago

All good now! Part of the problem was my Conda installation with a different version of ADIOS 2. Problem solved. I am grateful for your help!