Closed William-Mou closed 1 year ago
Here is my environment information:
I did not encounter your error, I checked the rdma_getaddrinfo
function of my header file <rdmalib/rdmalib.hpp>
, its first parameter is const char *node
, not the char* node
shown in your error , I think this may be because your rdma version is too low, you can try to upgrade it.
@Origin-yy you are right!
Since I am using a Mellanox CX3 NIC, I can only use MLNX_OFED
version 4.7 or earlier. (actually, the version is 41mlnx1-OFED.4.7.3.0.6.49417
)
I have read the git blame
record, and the function declaration before version 12 is different from the current version.
Perhaps we can add a parameter in the CMakeList to allow users to customize the version.
Hi @mcopik, Is it a good idea? Thank you!
@William-Mou @Origin-yy I think we need to support both variants - as @William-Mou already noticed, users can be stuck to a specific version of this library. Furthermore, we target HPC clusters and updating dependencies there might take a lot of work. Updating spdlog
is simple; updating librdmacm
is not :(
Since we are interfacing with C API that could not accept const char*
and will not modify the data, this could be the perfect example of using the forgotten const_cast
. What do you think?
You can also parameterize the solution depending on the library version - up to you :-)
Hi @mcopik,
Thanks for your response! As you mentioned, upgrading librdmacm
on an HPC cluster is a challenging task.
To make the code work properly on different versions, I proposed three ideas:
const_cast
, which is the approach implemented in this PR.return const_cast<char *>(static_cast<char *>(this.ip);
.
method 3 is referenced to "Item 3, Use const whenever possible," on p. 24 of Effective C++, 3d ed by Scott Meyers.
For method 2: I attempted to parameterize the solution using rdmacm_VERSION
in CMake to determine the librdmacm
version. However, I encountered an issue: on server A, the header file is declared as non-const
for version 1.3.41.0, while on server B, it is declared as const
for version 1.2.28.0 (Shown in the image below). This discrepancy may be due to server A using MLNX_OFED installation, whereas server B is using APT installation.
If we cannot assert the declaration based on the version, method 2 may not be feasible. And, I believe that the third method is unnecessary due to readability issues :D
All things considered, I have reopened this PR, and the current code should work on all versions of librdmacm
.
@William-Mou LGTM, thank you for your help!
Introduction
This commit cast the ip address and port number to
char *
in the call tordma_getaddrinfo
function. This is to avoidinvalid conversion
error and to ensure that the correct arguments are passed to it.Error Message:
Here is my environment information:
When I followed the readme instructions for the compilation, I encounter following error.