pmodels/mpich

Official MPICH Repository
http://www.mpich.org

romio: Add basic GPU-awareness #7108

Closed: raffenet closed this pull request 3 weeks ago

raffenet commented 1 month ago

Pull Request Description

Allocate and use host buffers to perform I/O, if device buffers are detected. Fixes pmodels/mpich#7044.

For read APIs, the pattern is:

1. allocate a temporary host buffer
2. read from the file into the host buffer
3. copy from the host buffer to the device buffer
4. free the host buffer

For write APIs, the pattern is the mirror image:

1. allocate a temporary host buffer
2. copy from the device buffer to the host buffer
3. write from the host buffer to the file
4. free the host buffer
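To make the read-side pattern concrete, here is a minimal sketch against a plain POSIX file descriptor; `is_device_ptr()` and `copy_to_device()` are hypothetical stand-ins for the MPL/MPIR GPU query and copy that the actual PR reaches through `MPIR_gpu_host_alloc` and friends:

```c
/* Sketch only: bounce-buffer pattern for reading into a possibly-device
 * buffer. The two extern helpers are hypothetical, not MPICH API. */
#include <stdlib.h>
#include <unistd.h>

extern int is_device_ptr(const void *buf);                          /* hypothetical */
extern void copy_to_device(void *dst, const void *src, size_t n);   /* hypothetical */

ssize_t read_gpu_aware(int fd, void *buf, size_t count)
{
    if (!is_device_ptr(buf))
        return read(fd, buf, count);        /* host memory: read directly */

    void *host_buf = malloc(count);         /* 1. allocate temporary host buffer */
    if (host_buf == NULL)
        return -1;
    ssize_t n = read(fd, host_buf, count);  /* 2. read into the host buffer */
    if (n > 0)
        copy_to_device(buf, host_buf, (size_t) n);  /* 3. copy host -> device */
    free(host_buf);                         /* 4. free the host buffer */
    return n;
}
```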


raffenet commented 1 month ago

test:mpich/ch4/most test:mpich/ch3/most

raffenet commented 1 month ago

test:mpich/ch4/most test:mpich/ch3/most

wkliao commented 1 month ago

May I suggest making this a configurable option, if it is not already? This would allow future development in ROMIO to incorporate GPU-to-disk direct I/O.

raffenet commented 1 month ago

> May I suggest making this a configurable option, if it is not already? This would allow future development in ROMIO to incorporate GPU-to-disk direct I/O.

At the moment, this can be enabled/disabled at runtime with the MPIR_CVAR_ENABLE_GPU environment variable from MPICH. We can extend the configurability with a ROMIO-specific setting to facilitate GPUDirect Storage or other GPU-aware development strategies, e.g. pipelined copying.
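For illustration, the runtime gate could look roughly like this; `MPIR_CVAR_ENABLE_GPU` is the real MPICH CVAR named above, while `romio_gpu_bounce_enabled` and `is_device_ptr()` are invented placeholders for a possible ROMIO-specific setting and the device-memory query:

```c
/* Hypothetical gating sketch: take the host bounce-buffer path only when
 * MPICH's GPU support is on and the user buffer actually lives on a device. */
extern int MPIR_CVAR_ENABLE_GPU;            /* real MPICH CVAR (declared by MPICH) */
static int romio_gpu_bounce_enabled = 1;    /* invented ROMIO-level override */
extern int is_device_ptr(const void *buf);  /* hypothetical device-memory query */

static int use_host_bounce_buffer(const void *buf)
{
    if (!MPIR_CVAR_ENABLE_GPU || !romio_gpu_bounce_enabled)
        return 0;
    return is_device_ptr(buf);
}
```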

roblatham00 commented 1 month ago

Will this code always allocate a host region? I was expecting to see something call MPL's "is this device memory?" routine.

raffenet commented 1 month ago

> Will this code always allocate a host region? I was expecting to see something call MPL's "is this device memory?" routine.

The pointer query happens inside MPIR_gpu_host_alloc, which is exposed through the MPIR_Ext interface: https://github.com/pmodels/mpich/blob/deb8fa9f5790475657da697b43c36a8a58ed5d7d/src/include/mpir_gpu_util.h#L36-L49
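Paraphrasing the linked helper: when the incoming pointer is device memory, a host staging buffer is allocated and returned; otherwise the original pointer passes through unchanged. A rough approximation (not the verbatim MPICH source; the real `MPIR_gpu_host_alloc` may differ in signature and error handling):

```c
#include <stdlib.h>

extern int is_device_ptr(const void *buf);  /* hypothetical query; the real
                                               helper asks MPL about the pointer */

/* Approximate shape of the allocate-if-device helper. */
void *gpu_host_alloc_sketch(const void *user_buf, size_t size)
{
    if (!is_device_ptr(user_buf))
        return (void *) user_buf;   /* host memory: no bounce buffer needed */
    return malloc(size);            /* device memory: allocate host staging buffer */
}

/* Matching release: only free if a bounce buffer was actually allocated. */
void gpu_host_free_sketch(void *host_buf, const void *user_buf)
{
    if (host_buf != user_buf)
        free(host_buf);
}
```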

raffenet commented 1 month ago

I did it that way since MPICH uses the MPIR_CVAR_ENABLE_GPU CVAR to control GPU-awareness. Reimplementing the same logic in ROMIO using MPL directly would be extra work, but it would help enable standalone builds in the long run.
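For a standalone build, the device-memory query could call MPL directly, along these lines (a hedged sketch; the exact type and enum names should be checked against MPICH's mpl_gpu.h):

```c
/* Hedged sketch of a direct MPL device-memory query for standalone ROMIO;
 * names follow MPICH's MPL headers but should be verified before use. */
#include "mpl.h"

static int romio_is_device_ptr(const void *buf)
{
    MPL_pointer_attr_t attr;
    if (MPL_gpu_query_pointer_attr(buf, &attr) != MPL_SUCCESS)
        return 0;                   /* on query failure, assume host memory */
    return attr.type == MPL_GPU_POINTER_DEV;
}
```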

raffenet commented 3 weeks ago

@roblatham00 thanks! I will merge this as-is. We can create an issue to track work on potential optimizations. The ones that come to mind are:

  1. skip the host buffer swap in the collective buffering case
  2. pipelined copy-to-host/write-to-file and read-from-file/copy-to-device for large buffers (a rough sketch of this one follows)
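For the second item, a pipelined write could stage a large device buffer through a fixed-size host buffer chunk by chunk, roughly as below. This is a sketch with a hypothetical `copy_from_device()` helper, not the planned implementation; a real pipeline would also overlap the copy of chunk i+1 with the write of chunk i, e.g. via asynchronous copies:

```c
/* Sketch of optimization 2 on the write path: stage a device buffer through
 * a fixed-size host buffer. copy_from_device() is a hypothetical stand-in. */
#include <stdlib.h>
#include <unistd.h>

extern void copy_from_device(void *dst, const void *src, size_t n);  /* hypothetical */

#define CHUNK_SIZE (4 * 1024 * 1024)    /* 4 MiB staging buffer (illustrative) */

ssize_t write_pipelined(int fd, const void *dev_buf, size_t count)
{
    char *host_buf = malloc(CHUNK_SIZE);
    if (host_buf == NULL)
        return -1;
    size_t done = 0;
    while (done < count) {
        size_t n = count - done < CHUNK_SIZE ? count - done : CHUNK_SIZE;
        copy_from_device(host_buf, (const char *) dev_buf + done, n);
        ssize_t w = write(fd, host_buf, n);
        if (w < 0) {
            free(host_buf);
            return -1;
        }
        done += (size_t) w;
    }
    free(host_buf);
    return (ssize_t) done;
}
```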