ofiwg / libfabric

Open Fabric Interfaces
http://libfabric.org/

libfabric-2.0: separate max_msg_size for RMA #9968

Closed j-xiong closed 2 weeks ago

j-xiong commented 5 months ago

struct fi_ep_attr defines max_msg_size, which applies to both send/recv and RMA. Some transports may want a different size limit for RMA to better align with hardware / driver features.

j-xiong commented 5 months ago

We could allow the RMA max_msg_size to be 0, meaning it takes its value from the send/recv max_msg_size.
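A minimal sketch of that fallback rule, for illustration only. The struct `ep_limits` and the field name `max_rma_size` are hypothetical stand-ins, not taken from this thread or from the libfabric headers; only `max_msg_size` is named in the issue.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the size limits carried in struct fi_ep_attr.
 * max_rma_size is an assumed field name used only for this sketch. */
struct ep_limits {
    size_t max_msg_size;  /* limit for send/recv */
    size_t max_rma_size;  /* limit for RMA; 0 means "use max_msg_size" */
};

/* Return the RMA limit an application should honor: the RMA-specific
 * value when the provider set one, otherwise the general message limit. */
static size_t effective_rma_size(const struct ep_limits *lim)
{
    assert(lim != NULL);
    return lim->max_rma_size ? lim->max_rma_size : lim->max_msg_size;
}
```

With this rule, a provider that does not care about the distinction leaves the RMA field at 0 and existing applications see no change in behavior.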

j-xiong commented 5 months ago

OFIWG meeting notes: Sounds good.

shefty commented 5 months ago

Note, 'msg' in max_msg_size refers to the definition of a 'transport message', not 'msg' as in FI_MSG. The limit for a specific transport operation may actually be lower than max_msg_size, as is usually the case for atomics.

If you separate the size of RMA operations out from max_msg_size, you're potentially redefining what max_msg_size means, so that it no longer describes the transport message but an API capability. This is especially true if the RMA size will be larger than the message size.

FWIW, this change is being driven by the hardware up, rather than application down, and is pushing the burden of dealing with the differences into every application rather than isolating the change in the provider.

j-xiong commented 5 months ago

We could argue that the transport msg sizes can have two different limits since the send/recv and RMA protocols may be very different.

shefty commented 5 months ago

The same is true for atomics and collectives. Even memory registration may have a separate size limit. Collectives and atomics define separate query operations for more precise limits. Tagged and untagged transfers typically use different protocols as well.

One of the points behind 2.0 was to remove differences between providers and simplify apps. This is going in the opposite direction. If we're arguing that RMA needs its own size, then split tagged and untagged as well. Then every capability has its own size.

Or, keep things simple for apps and make the providers deal with their own HW implementation nonsense.
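As a rough illustration of what "dealing with it in the provider" could look like, the sketch below splits one large transfer into fragments no bigger than a hardware limit. Everything here is hypothetical: `rma_segment`, `post_fragment_fn`, and `count_bytes` are invented names for this example, and no real provider is claimed to work this way.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: posts one fragment of the transfer.
 * Returns 0 on success, nonzero on failure. */
typedef int (*post_fragment_fn)(size_t offset, size_t len, void *ctx);

/* Split a transfer of total_len bytes into fragments of at most hw_max
 * bytes each. Returns the number of fragments posted, or -1 if the
 * callback reports an error. */
static long rma_segment(size_t total_len, size_t hw_max,
                        post_fragment_fn post, void *ctx)
{
    size_t offset = 0;
    long count = 0;

    assert(hw_max > 0 && post != NULL);
    while (offset < total_len) {
        size_t len = total_len - offset;
        if (len > hw_max)
            len = hw_max;          /* clamp to the hardware limit */
        if (post(offset, len, ctx) != 0)
            return -1;
        offset += len;
        count++;
    }
    return count;
}

/* Example callback for the sketch: only accumulates the byte count. */
static int count_bytes(size_t offset, size_t len, void *ctx)
{
    (void)offset;
    *(size_t *)ctx += len;
    return 0;
}
```

The application would then see a single size limit, while the fragmentation cost (and any completion tracking it implies) stays inside the provider.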

j-xiong commented 2 weeks ago

Implemented in #10070