joaander commented 1 year ago

Description

Allow gsd to provide high bandwidth writes while ensuring file integrity.

Proposed solution

Split gsd_end_frame() into gsd_end_frame() and gsd_commit().
Call gsd_commit() in gsd_close().

The new gsd_end_frame() will only increment the frame counter: https://github.com/glotzerlab/gsd/blob/aad91d2596f5f837921baf3346a612f85300423b/gsd/gsd.c#L1875-L1887

The new gsd_commit() will flush the buffers and sync: https://github.com/glotzerlab/gsd/blob/aad91d2596f5f837921baf3346a612f85300423b/gsd/gsd.c#L1889-L1954

Additional context

With this API, the caller can push many frames into the in-memory buffer and flush them all at an appropriate time (e.g. after buffering a full batch). Otherwise, the remaining buffer will be written when the file is closed.

#232 introduced a per-frame `fsync` call to ensure data integrity in the file. This lowers performance on high-latency filesystems. On Frontier's Orion, realistic uses in HOOMD-blue are limited to ~10 frames per second written to the file. Commenting out the `fsync` gives a ~10x performance improvement (depending on system size), because the writes are then bandwidth limited. The proposed API lets the caller regain that bandwidth efficiency by batching many frames in memory and then writing the buffer with `gsd_commit()`, which ensures data integrity for the batch.
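To make the trade-off concrete, here is a minimal Python sketch of the batching idea. It is a toy model, not gsd's implementation: `BufferedFrameWriter`, `write_frames`, and the 64-byte frame size are hypothetical names and values chosen for illustration. Only the split between ending a frame (in memory) and committing (write plus one `fsync`) mirrors the proposal.

```python
import os


class BufferedFrameWriter:
    """Toy model of the proposed end-frame/commit split (not gsd's code)."""

    def __init__(self, path):
        self._fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        self._buffer = bytearray()  # completed frames waiting to be written
        self.frames_ended = 0
        self.sync_calls = 0

    def end_frame(self, data):
        """Keep the finished frame in memory and advance the frame counter."""
        self._buffer += data
        self.frames_ended += 1

    def commit(self):
        """Write all buffered frames, then fsync once for the whole batch."""
        if not self._buffer:
            return
        data = bytes(self._buffer)
        self._buffer.clear()
        offset = 0
        while offset < len(data):  # os.write may write fewer bytes than asked
            offset += os.write(self._fd, data[offset:])
        os.fsync(self._fd)
        self.sync_calls += 1

    def close(self):
        """Commit whatever is still buffered, then close the file."""
        self.commit()
        os.close(self._fd)


def write_frames(path, n_frames, batch_size, frame_bytes=64):
    """Write n_frames, committing every batch_size frames; return fsync count."""
    writer = BufferedFrameWriter(path)
    for i in range(n_frames):
        writer.end_frame(b"\0" * frame_bytes)
        if (i + 1) % batch_size == 0:
            writer.commit()
    writer.close()
    return writer.sync_calls
```

With `batch_size=1` this model syncs once per frame, which corresponds to the current behavior. Larger batches reduce the number of `fsync` calls proportionally, and `close()` picks up any frames left over from an incomplete batch.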

As an API breaking change, this should be introduced in gsd 3.0.

joaander commented 1 year ago

@b-butler any thoughts or suggestions?

joaander commented 1 year ago

On second thought, this does not need to be API breaking. Make the new methods gsd_buffer_frame() and gsd_commit(). Then:

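int gsd_end_frame(struct gsd_handle* handle)
    {
    /* Sketch: assumes both new functions take the handle and return a gsd error code. */
    int retval = gsd_buffer_frame(handle);
    return (retval == GSD_SUCCESS) ? gsd_commit(handle) : retval;
    }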

remains with the current behavior, but tools can opt-in to the new API.

b-butler commented 1 year ago

I like the proposed option of creating two new functions that gsd_end_frame calls, except perhaps that gsd_buffer_frame then also increments the frame number, which prevents new data from being written to that frame.

joaander commented 1 year ago

Yes, the buffer frame method would increment the frame counter. Maybe better names are gsd_buffer_end_frame() / gsd_buffer_commit().

joaander commented 1 year ago

Unfortunately, the implementation is not as simple as I wrote in the description. The commit call should not write out partial frame data that has not been ended. Otherwise, a killed job might leave an incomplete frame at the end of the file. Therefore, we will need another index buffer to track the buffered index writes of ended frames separately from those buffered for the current (partial) frame.
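The two-buffer bookkeeping described above can be sketched as follows. This is a simplified Python model under stated assumptions, not the gsd data structures: `FrameIndexBuffer`, its attribute names, and the use of plain lists in place of the index and the file are all hypothetical.

```python
class FrameIndexBuffer:
    """Toy model: commit only index entries that belong to ended frames."""

    def __init__(self):
        self.partial = []    # entries of the current frame, not yet ended
        self.completed = []  # entries of ended frames, awaiting commit
        self.on_disk = []    # stands in for what has reached the file
        self.frame = 0

    def write_chunk(self, name):
        """Record an index entry for the current (partial) frame."""
        self.partial.append((self.frame, name))

    def end_frame(self):
        """Promote the current frame's entries and advance the frame counter."""
        self.completed.extend(self.partial)
        self.partial.clear()
        self.frame += 1

    def commit(self):
        """Write out completed frames only; the partial frame stays in memory."""
        self.on_disk.extend(self.completed)
        self.completed.clear()
```

For example, after `write_chunk("position")`, `end_frame()`, `write_chunk("velocity")`, and `commit()`, only the frame 0 entry is in `on_disk`. The frame 1 entry remains in `partial` until that frame is ended, so a job killed at this point would not leave an incomplete frame in the file.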

On second thought, I would like to provide high performance for all users. GSD versions 2.0 - 2.8.0 buffered output (at the OS's discretion), so we could release 2.9 (or call it 3.0 if concerned) that buffers output internally and requires either a gsd_flush or a gsd_close to write the final buffered data. I'm working on this now. If possible, I will include an explicit API that dupin can use to attain higher performance. At a minimum, I will make the buffer size configurable by the caller.

One challenge with this is that Python treats SIGTERM as an instant kill (https://stackoverflow.com/questions/9930576/python-what-is-the-default-handling-of-sigterm). SLURM (and I presume other job queues) sends SIGTERM first, then SIGKILL after a timeout (SLURM's default is 30s). For users to obtain partial output from a cancelled job, they will need to either flush manually or set a SIGTERM handler that allows a graceful shutdown (for example, by implicitly calling gsd_close() in a destructor). We could aid users by installing such a handler by default when importing gsd and/or HOOMD.

The HOOMD tutorials that write a gsd file and read in the same notebook will need to call flush.
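The reason is the same as with any buffered writer: data that has not been flushed is not visible to a reader of the file. The sketch below illustrates this with Python's built-in file buffering as an analogy. It does not use gsd, and the function name and file name are hypothetical.

```python
import os
import tempfile


def visible_size_before_and_after_flush(payload):
    """Return the on-disk file size before and after flushing a buffered writer."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "trajectory.bin")
        with open(path, "wb") as writer:
            writer.write(payload)  # small payloads stay in the user-space buffer
            before = os.path.getsize(path)
            writer.flush()  # hand the buffered bytes to the OS
            after = os.path.getsize(path)
    return before, after
```

For a payload smaller than the writer's buffer, the file appears empty until `flush()` is called, which is the situation a notebook would hit when reading a file it has just written through a buffering layer.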

joaander commented 1 year ago

Completed in #237.

Provide high bandwidth performance for bulk frame writes. #236
