HDFGroup / hdf5

Official HDF5® Library Repository
https://www.hdfgroup.org/

Serial HDF5 I/O problem in MPI program #4601

Closed HiSPEET closed 3 months ago

HiSPEET commented 3 months ago

In my MPI program, each process reads and writes data independently, from and into different files, using serial HDF5. With one process everything works fine. With two processes it fails in HDF5 with the following error messages:

HDF5-DIAG: Error detected in HDF5 (1.14.4-2) thread 0:
  #000: H5D.c line 403 in H5Dopen2(): unable to synchronously open dataset
    major: Dataset
HDF5-DIAG: Error detected in HDF5 (1.14.4-2) thread 0:
  #000: H5D.c line 403 in H5Dopen2(): unable to synchronously open dataset
    major: Dataset
    minor: Can't open object
  #001: H5D.c line 359 in H5D__open_api_common(): can't set object access arguments
    major: Dataset
    minor: Can't set value
  #002: H5VLint.c line 2602 in H5VL_setup_acc_args(): invalid location identifier
    major: Invalid arguments to routine
    minor: Inappropriate type
  #003: H5VLint.c line 1741 in H5VL_vol_object(): invalid identifier type to function
    minor: Can't open object
    major: Invalid arguments to routine
    minor: Inappropriate type
  #001: H5D.c line 359 in H5D__open_api_common(): can't set object access arguments
    major: Dataset
    minor: Can't set value
  #002: H5VLint.c line 2602 in H5VL_setup_acc_args(): invalid location identifier
    major: Invalid arguments to routine
    minor: Inappropriate type
  #003: H5VLint.c line 1741 in H5VL_vol_object(): invalid identifier type to function
    major: Invalid arguments to routine
    minor: Inappropriate type
  ...

The two processes read from different files, test-hagen_poiseuille-inbuilt-restart_run_1_mesh_0.h5 and test-hagen_poiseuille-inbuilt-restart_run_1_mesh_1.h5. The sequence of calls is as follows:

  call h5open_f(err)
  call H5Fopen_f(file_pr, H5F_ACC_RDWR_F, file_id, err)  ! works
  call H5Gopen_f(file_id, '/mesh', group_id, err)        ! works
  call H5Dopen_f(group_id, 'part', data_id, err)         ! fails

I am not sure whether this is a bug. Both files look fine when I view them with h5dump or HDFView. Maybe I should use parallel HDF5?

So far I tested the code with

In the meantime I performed additional tests under Linux with

The output is similar:

HDF5-DIAG: Error detected in HDF5 (1.14.0) MPI-process 1:
  #000: H5D.c line 402 in H5Dopen2(): unable to synchronously open dataset
    major: Dataset
HDF5-DIAG: Error detected in HDF5 (1.14.0) MPI-process 0:
  #000: H5D.c line 402 in H5Dopen2(): unable to synchronously open dataset
    major: Dataset
    minor: Can't open object
  #001: H5D.c line 358 in H5D__open_api_common(): can't set object access arguments
    major: Dataset
    minor: Can't set value
  #002: H5VLint.c line 2669 in H5VL_setup_acc_args(): invalid location identifier
    major: Invalid arguments to routine
    minor: Inappropriate type
    minor: Can't open object
  #001: H5D.c line 358 in H5D__open_api_common(): can't set object access arguments
    major: Dataset
    minor: Can't set value
  #002: H5VLint.c line 2669 in H5VL_setup_acc_args(): invalid location identifier
    major: Invalid arguments to routine
    minor: Inappropriate type

HiSPEET commented 3 months ago

The error was caused by an incorrect input file name. After fixing it, the example worked fine.
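
For anyone who hits the same trace: "invalid location identifier" in H5Dopen2 means that the identifier passed to it was not valid, which suggests that an earlier call had already failed without being noticed. Below is a minimal sketch of the same call sequence with the status checked after every call. It assumes the wrong name pointed to a file that does not exist (the issue does not say). The file name and the check helper are hypothetical and not taken from the original program.

```fortran
program open_checked
  use hdf5
  implicit none

  ! Hypothetical file name; in the issue each MPI rank opens its own file.
  character(len=*), parameter :: file_pr = 'restart_mesh_0.h5'
  integer(hid_t) :: file_id, group_id, data_id
  integer :: err
  logical :: exists

  ! Catch a missing file before any HDF5 call is made.
  inquire(file=file_pr, exist=exists)
  if (.not. exists) then
    write(*, '(2a)') 'input file not found: ', file_pr
    error stop 1
  end if

  call h5open_f(err)
  call check(err, 'h5open_f')
  call h5fopen_f(file_pr, H5F_ACC_RDWR_F, file_id, err)
  call check(err, 'h5fopen_f '//file_pr)
  call h5gopen_f(file_id, '/mesh', group_id, err)
  call check(err, 'h5gopen_f /mesh')
  call h5dopen_f(group_id, 'part', data_id, err)
  call check(err, 'h5dopen_f part')

  ! ... read or write the dataset here ...

  call h5dclose_f(data_id, err)
  call h5gclose_f(group_id, err)
  call h5fclose_f(file_id, err)
  call h5close_f(err)

contains

  ! Hypothetical helper: stop at the first failing call.
  ! The HDF5 Fortran routines return 0 on success and -1 on failure.
  subroutine check(status, what)
    integer, intent(in) :: status
    character(len=*), intent(in) :: what
    if (status < 0) then
      write(*, '(2a)') 'HDF5 call failed: ', what
      error stop 1
    end if
  end subroutine check

end program open_checked
```

In an MPI program it is probably better to replace error stop with a call to MPI_Abort and to include the rank in the message, so that it is clear which process was given the wrong file name.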