facebookincubator / gloo

Collective communications library with various primitives for multi-machine training.
Other
1.23k stars 303 forks source link

Fix wrong struct layout in getInterfaceSpeedGLinkSettings #382

Open Flamefire opened 8 months ago

Flamefire commented 8 months ago

The SIOCETHTOOL IOCTL expects a pointer to an instance of ethtool_link_settings. E.g. it will read the cmd member to determine what to do. However #346 reordered the memory layout of the pointer such that actually and array of __32 values (zeroed out) is passed. Hence the IOCTL will either fail because an invalid command (cmd=0) is passed or the values read later by e.g. ecmd.req.link_mode_masks_nwords are something completely different.

So ethtool_link_settings has to come before the (stack) memory used in the flexible array at the end of this struct. To avoid GCC warnigns/errors (see #345) an union is used that provides the current struct (i.e. with wrong order) and an access the actually used struct ethtool_link_settings at the top.

Alternative approach to https://github.com/facebookincubator/gloo/pull/348 likely avoiding any undefined behavior (union based type punning is usually supported by compilers, so this should work to get the required memory)

Flamefire commented 8 months ago

@kumpera Would you be able to review and merge this? CI seems to be broken/unusable right now.

@cdluminate If you want/can take a look if this works for you too?