lanl / vpic

Vector Particle-In-Cell (VPIC) Project

Error in HDF5 writing of attribute field #112

Closed brtnfld closed 4 years ago

brtnfld commented 4 years ago

HDF5 attributes written to the file can be incorrect, depending on which process writes last. Take, for example, the field attribute VPIC-ArrayUDF-GEO.

VPIC does (by all the processes):

    float attr_data[2][3];
    attr_data[0][0] = grid->x0;
    attr_data[0][1] = grid->y0;
    attr_data[0][2] = grid->z0;
    attr_data[1][0] = grid->dx;
    attr_data[1][1] = grid->dy;
    attr_data[1][2] = grid->dz;
    hsize_t dims[2];
    dims[0] = 2;
    dims[1] = 3;
    hid_t va_geo_dataspace_id = H5Screate_simple(2, dims, NULL);
    hid_t va_geo_attribute_id = H5Acreate2(file_id, "VPIC-ArrayUDF-GEO", H5T_IEEE_F32BE, va_geo_dataspace_id, H5P_DEFAULT, H5P_DEFAULT);
    H5Awrite(va_geo_attribute_id, H5T_NATIVE_FLOAT, attr_data);

but for the HarrisHDF5 case, grid->y0 can have different values between processes. Hyperslab selection cannot be used with attributes, so every process writes its entire array. Hence, the value stored in the attribute is that of whichever process writes last.
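To make the failure mode concrete, here is a minimal sketch in plain C that models the behavior described above without HDF5 or MPI. All names (`GEO_LEN`, `geo_attr_is_rank_invariant`, `last_writer_wins`) are hypothetical and not part of VPIC or HDF5. Each "rank" is just a row in an array, and the write order is passed in explicitly as an assumption standing in for real process timing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* x0, y0, z0, dx, dy, dz: the 2x3 attribute flattened (hypothetical name). */
#define GEO_LEN 6

/* Returns true when every simulated rank holds the same values as rank 0.
 * geo[r] stands for the array that rank r would hand to H5Awrite. */
bool geo_attr_is_rank_invariant(const float geo[][GEO_LEN], size_t nranks)
{
    assert(nranks > 0);
    for (size_t r = 1; r < nranks; ++r)
        for (size_t i = 0; i < GEO_LEN; ++i)
            if (geo[r][i] != geo[0][i])
                return false;
    return true;
}

/* Models "every rank writes the whole attribute": each write overwrites
 * all of stored[], so only the last rank in write_order is visible.
 * write_order lists rank indices in the order their writes land. */
void last_writer_wins(const float geo[][GEO_LEN], const size_t *write_order,
                      size_t nranks, float stored[GEO_LEN])
{
    assert(nranks > 0);
    for (size_t k = 0; k < nranks; ++k)
        for (size_t i = 0; i < GEO_LEN; ++i)
            stored[i] = geo[write_order[k]][i];
}
```

If two ranks differ only in y0, `geo_attr_is_rank_invariant` returns false, and `last_writer_wins` leaves a different y0 in `stored` depending on the order given. In an actual MPI run, the same consistency check could be done with a reduction (for example `MPI_Allreduce` with `MPI_MIN` and `MPI_MAX`) before deciding whether the attribute is safe to write.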

rfbird commented 4 years ago

@petschge, did we ever decide what the best thing to do about this was?

petschge commented 4 years ago

I think the general agreement was to just remove the write altogether.

brtnfld commented 4 years ago

Any update if this is going to be merged? Also, should the hdf5_rebase branch be used instead of hdf5 for hdf5/DAOS VOL work?

rfbird commented 4 years ago

I made an attempt to address this in #122. Can you look at it please, @petschge?

brtnfld commented 4 years ago

FYI, with this branch, harrisHDF5 has the same output with the DAOS VOL when compared to native HDF5 output.

rfbird commented 4 years ago

Two questions:

1) "This branch" means the one with the potential fix from #122, right?
2) Does "the same output" mean "it fixed the issue and did not break anything else", or "it did not fix the issue"?

Thanks!

brtnfld commented 4 years ago

(1) Yes, the hdf5_remove_attritue branch. (2) It fixed the issue of the output being different between the two VOLs. This is what I do to check:

... first I run VPIC (harrisHDF5) using the HDF5 native VOL connector and then run h5dump on all the output files. This gives the reference files to compare against. Then VPIC is run using the H5DAOS VOL, and h5dump is used to generate the H5DAOS VOL files. Finally, the h5dump results are compared for all the output dumps.

I ran ctest and the only test that failed was test_collision_script, but I have not checked if this fails only for this branch.

rfbird commented 4 years ago

OK, thanks. I'll merge it then.