European-XFEL / DAMNIT

Data And Metadata iNspection Interactive Thing
https://damnit.rtfd.io
BSD 3-Clause "New" or "Revised" License

HDF5 file locking #325

Open takluyver opened 2 months ago

takluyver commented 2 months ago

Manage write access to HDF5 files using POSIX lockf file locking (equivalent to fcntl on Linux), which works over GPFS. I hope this will let us write incrementally to HDF5 files as variables are computed.
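As a rough illustration of the locking primitive described above, here is a minimal sketch using Python's standard `fcntl.lockf`. The `locked` helper and its arguments are hypothetical names chosen for this example, not code from the PR.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def locked(path):
    """Hold an exclusive POSIX lock on `path` for the duration of the block.

    Hypothetical helper for illustration; not part of DAMNIT.
    """
    # Open (creating if necessary) purely to obtain a descriptor to lock on.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Blocks until no other process holds a conflicting lock.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        yield fd
    finally:
        # Unlocking is harmless even if the lock was never acquired.
        fcntl.lockf(fd, fcntl.LOCK_UN)
        os.close(fd)
```

One thing to keep in mind with this kind of lock: POSIX record locks belong to the process, and closing any descriptor the process holds for the locked file drops the lock. Locking a separate sidecar file is one way to sidestep that; whether the PR locks the HDF5 file itself or something else is not stated here.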

The writer thread waits to be sent some data, then acquires the lock and opens the HDF5 file. While it has the file open, it tries to do multiple writes if possible, but if nothing comes within 0.2 seconds (arbitrary choice), it closes the file and releases the lock, so another writer can take a turn.
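The loop described above might look roughly like the sketch below. It is not the PR's `WriterThread`: the class name `BatchingWriter`, the sidecar `.lock` file, and the `open_file` / `write` callables are all assumptions made so the example runs with the standard library alone (a plain text file stands in for the HDF5 file).

```python
import fcntl
import queue
import threading

IDLE_TIMEOUT = 0.2  # seconds to wait for more data before closing the file


class BatchingWriter(threading.Thread):
    """Hypothetical sketch of a lock-then-batch writer thread."""

    def __init__(self, path, open_file=None, write=None):
        super().__init__(daemon=True)
        self.path = path
        # Stand-ins: with h5py these could be lambda p: h5py.File(p, "a")
        # and a function that stores `item` in the open file.
        self.open_file = open_file or (lambda p: open(p, "a"))
        self.write = write or (lambda f, item: f.write(item))
        self.queue = queue.Queue()
        self.n_opens = 0  # how many times the file was (re)opened

    def send(self, item):
        self.queue.put(item)

    def stop(self):
        self.queue.put(None)  # sentinel: finish and exit

    def run(self):
        while True:
            item = self.queue.get()  # wait for data with the file closed
            if item is None:
                return
            # Assumption: lock a sidecar file; closing it releases the lock.
            with open(self.path + ".lock", "w") as lock_f:
                fcntl.lockf(lock_f, fcntl.LOCK_EX)
                with self.open_file(self.path) as f:
                    self.n_opens += 1
                    while item is not None:
                        self.write(f, item)
                        try:
                            # Keep the file open while data keeps arriving.
                            item = self.queue.get(timeout=IDLE_TIMEOUT)
                        except queue.Empty:
                            break  # idle: close file, release lock
            if item is None:  # stop sentinel arrived while the file was open
                return
```

Items sent in quick succession share one open/close cycle, while a pause longer than `IDLE_TIMEOUT` gives other writers a chance to take the lock.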

I believe we need to reopen the file whenever another process may have modified it, because HDF5 can cache some data from open files, and that cache may become invalid if we keep the file open.

I made a demo of processes on two different hosts writing to the same file with the WriterThread class here, slowed down with sleep() calls to illustrate what's going on.

Screencast from 2024-08-30 14-23-55

codecov[bot] commented 2 months ago

Codecov Report

Attention: Patch coverage is 91.45729% with 17 lines in your changes missing coverage. Please review.

Project coverage is 75.30%. Comparing base (c295f8d) to head (34f8e02).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| damnit/ctxsupport/damnit_h5write.py | 87.09% | 16 Missing :warning: |
| damnit/ctxsupport/ctxrunner.py | 98.66% | 1 Missing :warning: |
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #325      +/-   ##
==========================================
+ Coverage   74.81%   75.30%   +0.48%
==========================================
  Files          32       33       +1
  Lines        4892     5066     +174
==========================================
+ Hits         3660     3815     +155
- Misses       1232     1251      +19
```
