Open shefty opened 3 years ago
Would it help to create a new provider, prov\tcp2, which is strictly not compatible with prov\tcp? Eventually you can deprecate the old one and free yourself of problems with version management. Or call it tcpng and keep it unstable until you are happy with the protocol.
I've considered that, but copy-paste-modify would almost certainly result in increased maintenance costs. The code overlap is too high. Plus, the existing protocol will likely need to be supported for years, similar to how we've never been able to completely rid ourselves of the socket provider.
My intent is that the code be updated such that only the newest protocol is handled, and that the conversion to the v3 protocol is done only when reading or writing to the socket. That would result in a slight overhead when using v3, but also means that optimizations made to the code would be usable with either version. For many apps, switching to only the latest protocol would be trivial. But there are some client-server apps (DAOS, DDN) where it would be more challenging.
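A minimal sketch of that boundary conversion, to make the idea concrete. Everything here is an assumption for illustration: the field layouts, the 16-byte v3 and 8-byte v4 header lengths, and the names `msg_hdr`, `hdr_to_wire()` and `hdr_from_wire()` are hypothetical and do not reflect the actual tcp provider headers. The rest of the code would only ever see `struct msg_hdr`; the wire version matters only in these two functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define V3_HDR_LEN 16	/* hypothetical v3 wire header length */
#define V4_HDR_LEN 8	/* hypothetical v4 wire header length */

/* Internal, version-independent header (hypothetical fields). */
struct msg_hdr {
	uint8_t  op;
	uint16_t flags;
	uint64_t size;
};

/* Store the low 'len' bytes of val in big-endian (network) order. */
static void put_be(uint8_t *buf, uint64_t val, size_t len)
{
	for (size_t i = 0; i < len; i++)
		buf[i] = (uint8_t)(val >> (8 * (len - 1 - i)));
}

/* Load a big-endian integer of 'len' bytes. */
static uint64_t get_be(const uint8_t *buf, size_t len)
{
	uint64_t val = 0;

	for (size_t i = 0; i < len; i++)
		val = (val << 8) | buf[i];
	return val;
}

/* Encode hdr for a peer speaking 'version'.
 * Returns bytes written, or 0 if the header cannot be encoded. */
size_t hdr_to_wire(const struct msg_hdr *hdr, int version, uint8_t *buf)
{
	assert(hdr && buf);
	if (version != 3 && version != 4)
		return 0;
	/* Assumed: the compact v4 size field is only 32 bits wide. */
	if (version == 4 && hdr->size > UINT32_MAX)
		return 0;

	buf[0] = (uint8_t)version;
	buf[1] = hdr->op;
	put_be(&buf[2], hdr->flags, 2);
	if (version == 4) {
		put_be(&buf[4], hdr->size, 4);
		return V4_HDR_LEN;
	}
	put_be(&buf[4], 0, 4);		/* reserved bytes in v3 */
	put_be(&buf[8], hdr->size, 8);
	return V3_HDR_LEN;
}

/* Decode a header of either version into the internal form.
 * Returns bytes consumed, or 0 on a short buffer or unknown version. */
size_t hdr_from_wire(const uint8_t *buf, size_t len, struct msg_hdr *hdr)
{
	size_t need;

	assert(buf && hdr);
	if (len < 1)
		return 0;
	if (buf[0] == 4)
		need = V4_HDR_LEN;
	else if (buf[0] == 3)
		need = V3_HDR_LEN;
	else
		return 0;
	if (len < need)
		return 0;

	hdr->op = buf[1];
	hdr->flags = (uint16_t)get_be(&buf[2], 2);
	hdr->size = (buf[0] == 4) ? get_be(&buf[4], 4) : get_be(&buf[8], 8);
	return need;
}
```

Serializing byte by byte keeps the sketch independent of host endianness and struct padding. The extra copy on the v3 path is the "slight overhead" mentioned above.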
My biggest concern is that the original CM protocol was not forward looking, so there's no way I can see to keep the transition hidden from the app in all situations.
The tcp protocol overhead, particularly when mixed with rxm, can be reduced to increase the message rate for small messages. The tcp provider would need to support the existing protocol for compatibility, so some conversion function would be needed. For example, replace the current bswap_hdr() call with a new convert_hdr() call that handles both byte swapping and converting from v4 to v3. Unfortunately, the existing CM protocol fails requests for unknown versions, so there can't be an easy fallback mechanism. That is, we can't ask for v4 and have the peer reply that it only knows v3, so that we can fall back gracefully. It's unlikely apps are checking the protocol field in the ep_attr, so an environment variable may be needed to set whether v3 or v4 is preferred.
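As a rough illustration of the environment variable idea, the sketch below picks a preferred protocol version once, up front, since the version cannot be negotiated through the CM exchange. The variable name `FI_TCP_PROTO_VERSION`, the accepted values, and the choice of v3 as the default are all assumptions made for this example, not existing provider behavior. Plain `getenv()` is used only to keep the sketch self-contained.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PROTO_V3 3
#define PROTO_V4 4

/* Hypothetical variable name, chosen for illustration only. */
#define PROTO_ENV_VAR "FI_TCP_PROTO_VERSION"

/* Map a user-supplied string to a protocol version.
 * Unset or unrecognized values fall back to dflt. */
int parse_proto_pref(const char *value, int dflt)
{
	assert(dflt == PROTO_V3 || dflt == PROTO_V4);
	if (!value)
		return dflt;
	if (!strcmp(value, "3") || !strcmp(value, "v3"))
		return PROTO_V3;
	if (!strcmp(value, "4") || !strcmp(value, "v4"))
		return PROTO_V4;
	return dflt;
}

/* Read the preference from the environment. Defaulting to v3 is an
 * assumption: it keeps existing peers working unless the user opts in. */
int preferred_proto(void)
{
	return parse_proto_pref(getenv(PROTO_ENV_VAR), PROTO_V3);
}
```

With this shape, a deployment where every peer is known to understand v4 could set the variable to `4`, while mixed client-server installations would leave it unset.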
These are proposed changes for an updated protocol (from a patch that is in progress, but far from ready):
The base header sizes are 4, 8, and 16 bytes.