agnesnatasya opened this issue 2 years ago
I think this is likely due to Assise not finding the proper interface. Can you change rdma_intf at rpc_interface.h#L24 to your RDMA network interface name and rebuild? I presume in your case that should be enp65s0f0.
Hi Waleed,
Thank you very much for your help. I set rdma_intf = enp65s0f0 on both nodes, and I changed utils/rdma_setup.sh from ib0 to enp65s0f0 too, but it still segfaults.
The error message is a little bit different. On the 10.10.1.3 node, it says
initialize file system
dev-dax engine is initialized: dev_path /dev/dax0.0 size 8192 MB
Reading root inode with inum: 1fetching node's IP address..
Process pid is 19046
ip address on interface 'enp65s0f0' is 10.10.1.3
cluster settings:
--- node 0 - ip:10.10.1.2
--- node 1 - ip:10.10.1.3
./run.sh: line 15: 19046 Segmentation fault LD_LIBRARY_PATH=../build:../../libfs/lib/nvml/src/nondebug/ LD_PRELOAD=../../libfs/lib/jemalloc-4.5.0/lib/libjemalloc.so.2 MLFS_PROFILE=1 numactl -N0 -m0 $@
On the 10.10.1.2 node it says
initialize file system
dev-dax engine is initialized: dev_path /dev/dax0.0 size 8192 MB
Reading root inode with inum: 1fetching node's IP address..
Process pid is 9886
ip address on interface 'enp65s0f0' is 10.10.1.2
cluster settings:
--- node 0 - ip:10.10.1.2
--- node 1 - ip:10.10.1.3
Connecting to KernFS instance 1 [ip: 10.10.1.3]
./run.sh: line 15: 9886 Segmentation fault LD_LIBRARY_PATH=../build:../../libfs/lib/nvml/src/nondebug/ LD_PRELOAD=../../libfs/lib/jemalloc-4.5.0/lib/libjemalloc.so.2 MLFS_PROFILE=1 numactl -N0 -m0 $@
There is an additional line: Connecting to KernFS instance 1 [ip: 10.10.1.3].
Through GDB, it also looks like the rdma_cm_id struct is still NULL when rdma_bind_addr or rdma_resolve_addr is called. The values of the other variables in scope are as follows:
add_connection (ip=0x7ffff5335124 "10.10.1.3", port=0x7ffff521f010 "12345", app_type=0, pid=0, ch_type=<optimized out>, polling_loop=1)
and
addr= {sin6_family = 10, sin6_port = 0, sin6_flowinfo = 0, sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, sin6_scope_id = 0}
Do you happen to know the cause of this problem? Does it have something to do with connecting to the port on the other node? I have allowed port 12345 on both nodes.
Thank you very much for your help!
Thanks for the debugging effort! I suspect this is likely a firewall issue.
To test connectivity, you can try running the RPC application in lib/rdma/tests/ and see if it also produces an error. You can use the following commands: ./rpc_client <ip> <port> <iters> and ./rpc_server <port>. I've added additional checks to libfs/lib/rdma/agent.c to avoid segfaults; the error codes might help indicate the issue.
Hi Waleed,
Thank you very much for the checks in libfs/lib/rdma/agent.c! After running the new version, I received error code 19; it looks like Assise is unable to find the device.
Here are some of my debugging efforts:
- I traced again using GDB and found that rdma_event_channel ec is NULL when rdma_create_id() is called, which I suspect is why rdma_create_id() fails. After that call, the returned result is -1, the error code is 19, and rdma_cm_id = NULL.
- In libfs/lib/rdma-core/librdmacm/cma.c's rdma_create_event_channel() function, I tried changing /dev/infiniband/rdma_cm to /dev/dax0.0 (the name of the DAX device on my machine). I suspected this library was not cleaned by make clean and hence not rebuilt during cd deps; ./install_deps.sh; cd .. . However, I did check libfs/lib/rdma-core/build and it is properly rebuilt, so I'm not too sure why my newest change to the code is not taking effect.
- I am also unsure about the LD_PRELOAD variable. Is it supposed to be LD_PRELOAD=../../libfs/lib/jemalloc-4.5.0/lib/libjemalloc.so.2 or LD_PRELOAD="../../libfs/lib/jemalloc-4.5.0/lib/libjemalloc.so.2 ../../libfs/build/libmlfs.so"?
- I also thought of another point of failure: sockaddr_in6 addr, which is an IPv6 socket address, while the IP that I provide in rpc_interface.h is IPv4. However, I think this is not what causes rdma_create_id() to fail, because that function does not use the addr variable.
Regarding your previous suggestion on the firewall: it was a great suggestion, thank you! I realised the firewall was enabled on a different network interface. I've now allowed incoming and outgoing traffic on port 12345 for both nodes on the network interface used by RDMA (enp65s0f0 in my case), but I still receive the above error.
I am also using NVM emulation instead of actual NVM.
Do you have any idea regarding the above error? Thank you very much for your help!
I assume you weren't able to run the RPC test. If so, then the error is not Assise-related. The LD_PRELOAD or use of emulated NVM shouldn't be a factor here.
I haven't encountered this particular error myself but, if I had to guess, it could simply be a driver issue. It might make sense to first check whether the MLNX_OFED drivers are properly installed and that the required modules are loaded in your kernel (e.g. libmlx5, libmlx4). That could be the culprit. If that doesn't help, you can try posting this on the Mellanox community forums.
Hi Waleed,
Thank you very much, that was indeed the error: I did not have RDMA set up yet, and I was not aware of it during setup. Do you mind if I add a sentence or two mentioning that a properly configured RDMA device and interfaces are a prerequisite?
Thanks for confirming.
- Do you mind if I add a sentence or two mentioning that a properly configured RDMA device and interfaces are a prerequisite?
Absolutely! The README can definitely benefit from this. Feel free to do a pull request and I'll merge.
Thank you Waleed for that!
Do you mind if I clarify some things with regard to Assise, to help me write proper additional setup instructions?
- I assume that the KernFS in this repository is equivalent to the SharedFS in the original paper. Is this correct?
- I am a little bit confused why there isn't a cluster manager in this GitHub setup. Is it because this prototype only supports hot replicas, and every node defined in rpc_interface.h's hot_replicas[] is a hot replica, hence there is no need to set up a separate cluster manager?
- Are all nodes part of all the other nodes' replication chains in the general workload setup? Or is this supposed to be determined by the cluster manager's policy? If my assumption in question 2 is correct, are all nodes in hot_replicas[] part of all the other nodes' replication chains, since there is no cluster manager?
Thank you very much Waleed for your kind help in clarifying this!
Sorry for the delayed reply! Last few weeks were hectic.
- I assume that the KernFS in this repository is equivalent to the SharedFS in the original paper. Is this correct?
Yes, that's correct.
- I am a little bit confused why there isn't a cluster manager in this GitHub setup. Is it because this prototype only supports hot replicas, and every node defined in rpc_interface.h's hot_replicas[] is a hot replica, hence there is no need to set up a separate cluster manager?
Our prototype currently doesn't come with an interface to the cluster manager (zookeeper). Only hot replicas, as you noted, are supported as of now.
- Are all nodes part of all the other nodes' replication chains in the general workload setup? Or is this supposed to be determined by the cluster manager's policy? If my assumption in question 2 is correct, are all nodes in hot_replicas[] part of all the other nodes' replication chains, since there is no cluster manager?
Thank you very much Waleed for your kind help in clarifying this!
Correct, all nodes defined in hot_replicas are part of the same replica group.
Thanks a lot Waleed for the clarification!
@agnesnatasya Hi, I met the same problem of a segmentation fault, and I found that it seems to be caused by rdma_cm_id = NULL. Could you please share more details about your solution for setting up RDMA? Thanks a lot~
Hi @caposerenity! Sure! For me, I have a lab cluster with Mellanox adapters and the InfiniBand drivers installed, and I use that to establish the RDMA connection between the nodes. If you have machines with a Mellanox adapter but without the drivers, you can try installing the driver by following an online guide for your device's version; one of the documents is here: https://network.nvidia.com/related-docs/prod_software/Mellanox_IB_OFED_Driver_for_VMware_vSphere_User_Manual_Rev_1_8_1.pdf, but you can also find more casual tutorials online. If you do not have machines with a Mellanox adapter, I am not sure if there is a workaround. You can definitely run single-node Assise, which is similar to Strata (a local filesystem).
Hi,
Setup
I am trying to set up a simple cluster with 2 nodes. These are the network interfaces of each node:
In each of these nodes, I set g_n_hot_rep to 2 and configured the RPC interface. I run KernFS starting from the node that has 10.10.1.3 as its interface.
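For concreteness, the configuration described here might look roughly like the following in rpc_interface.h. This is a hypothetical sketch based only on the names used in this thread (g_n_hot_rep, hot_replicas[], rdma_intf); the actual header's types and layout may differ:

```c
/* Hypothetical sketch of a two-node hot-replica setup in rpc_interface.h;
 * the real header's declarations may differ. */
#define g_n_hot_rep 2

static char rdma_intf[] = "enp65s0f0";      /* RDMA NIC name on both nodes */

static char *hot_replicas[g_n_hot_rep] = {
    "10.10.1.2:12345",                      /* node 0 */
    "10.10.1.3:12345",                      /* node 1 */
};
```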
Result
I received a segmentation fault
Debugging
After debugging, it looks like the segmentation fault comes from libfs/lib/rdma/agent.c, lines 96 and 130: the rdma_cm_id struct after rdma_create_id is NULL. I also ran the filesystem as a local file system, where g_n_hot_rep = 1 and the RPC interface is set to localhost, and it works.
Do you mind helping me with this problem? Thank you very much!