sctb512 closed this issue 3 years ago
There's a build option for mlx4 devices in shenango/shared.mk.
Modify it as below and re-build shenango:
CONFIG_MLX5=n
CONFIG_MLX4=y
I'm not sure it works, but you can try.
Thanks for your reply. I had already modified these options in shenango/shared.mk, which is how I was able to build successfully. Before modifying them, I got the following error:
iokernel/mlx.h:5:10: fatal error: mlx5_custom.h: No such file or directory
Sorry for the late reply. The error is caused by intermediate mlx5 files left over from your first compilation with CONFIG_MLX5=y. You can simply clone a fresh copy of the repo, set CONFIG_MLX5=n, CONFIG_MLX4=y, and CONFIG_DIRECTPATH=n, and recompile.
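The option changes described above could be scripted roughly as follows. This is a minimal sketch, assuming shared.mk stores each option as KEY=value at the start of a line; the helper name set_shenango_option is hypothetical and not part of Shenango.

```shell
#!/bin/sh
# Hypothetical helper (not part of Shenango): set KEY=VALUE in a
# make-style config file. Assumes the option already appears as
# "KEY=..." (optionally "KEY = ...") at the start of a line.
set_shenango_option() {
    cfg_file="$1"
    cfg_key="$2"
    cfg_value="$3"
    # Write to a temporary file and move it into place; this avoids
    # relying on the non-portable "sed -i" flag.
    sed "s/^${cfg_key}[[:space:]]*=.*/${cfg_key}=${cfg_value}/" "$cfg_file" \
        > "$cfg_file.tmp" && mv "$cfg_file.tmp" "$cfg_file"
}

# Example usage, run inside a fresh clone so that no stale mlx5
# build artifacts remain:
#   set_shenango_option shenango/shared.mk CONFIG_MLX5 n
#   set_shenango_option shenango/shared.mk CONFIG_MLX4 y
#   set_shenango_option shenango/shared.mk CONFIG_DIRECTPATH n
```

Editing the three lines by hand is equivalent; the script only helps when the same change has to be repeated on several nodes.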
In addition, our project is mostly implemented for mlx5 NICs and has only limited support for mlx4, so you may observe reduced performance with mlx4. Before running the program, you have to delete the line enable_directpath 1 from all config files in AIFM/aifm/configs/. Please let me know if you have any further questions; I'm happy to answer.
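Removing the enable_directpath 1 line from every config file can also be done in one pass. The following is a sketch under the assumption that the setting sits on a line of its own; strip_directpath is a hypothetical helper name, not something shipped with AIFM.

```shell
#!/bin/sh
# Hypothetical helper (not part of AIFM): delete lines that consist of
# "enable_directpath 1" from every regular file in the given directory.
strip_directpath() {
    cfg_dir="$1"
    for cfg in "$cfg_dir"/*; do
        [ -f "$cfg" ] || continue
        # Filter into a temp file, then copy the result back so the
        # original file keeps its permissions.
        sed '/^[[:space:]]*enable_directpath[[:space:]][[:space:]]*1[[:space:]]*$/d' \
            "$cfg" > "$cfg.tmp" \
            && cat "$cfg.tmp" > "$cfg" \
            && rm -f "$cfg.tmp"
    done
}

# Example usage from the repository root:
#   strip_directpath AIFM/aifm/configs
```

All other lines in the config files are left untouched, so the change is easy to review with git diff afterwards.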
Hello, I tried to run this project on my nodes and got the following error:
I found that the failure happens in the file ./shenango/runtime/net/directpath/mlx5/mlx5_init.c. The value of dev_list[0] is NULL, so it looks like I can't get the device list.
Question 1: Why is dev_list[0] NULL? Is there any way to solve this problem?
Then I found that there is only an mlx5 directory in ./shenango/runtime/net/directpath/, but my nodes use mlx4:
If I change CONFIG_DIRECTPATH=y to CONFIG_DIRECTPATH=n in shared.mk, the runtime does not work.
Question 2: Is there only an mlx5 implementation? If I want to run this project on ConnectX-3 devices, can you give me some advice? (I was not able to get a CloudLab account.)
Thanks!
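For anyone hitting the same mlx4 versus mlx5 question, one way to see which RDMA devices the kernel has registered is to list /sys/class/infiniband on Linux. ConnectX-3 devices typically show up with an mlx4_ prefix and ConnectX-4 or later with an mlx5_ prefix, although udev renaming rules can change the names. The sketch below assumes that sysfs layout; list_rdma_devices is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the RDMA device names registered with the
# kernel, one per line. The directory argument is optional and defaults
# to the usual Linux sysfs location.
list_rdma_devices() {
    sysfs_dir="${1:-/sys/class/infiniband}"
    # No such directory usually means no RDMA driver is loaded.
    [ -d "$sysfs_dir" ] || return 1
    for entry in "$sysfs_dir"/*; do
        # Skip the unexpanded glob when the directory is empty.
        [ -e "$entry" ] || continue
        basename "$entry"
    done
}

# Example usage:
#   list_rdma_devices
```

If the function prints nothing or returns a non-zero status, the verbs library probably has no device to enumerate either, which would be consistent with an empty device list in the runtime. That is only a hint, not a confirmed diagnosis of the error above.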