Open glock42 opened 3 years ago
Looks like you might be out of memory (the failing mr #0 is about 4 GB in your output). A 4 KB log is too small, though: you need at least 16 MB for good performance.
Waleed, can you confirm?
On Sun, Oct 17, 2021 at 12:24 AM Jian @.***> wrote:
I'm using two nodes to run Assise.
I got the error ibv_reg_mr failed [error code: 12]. ibv_reg_mr can't work with large size.
initialize file system
dev-dax engine is initialized: dev_path /dev/dax0.0 size 49152 MB
Reading root inode with inum: 1
fetching node's IP address..
Process pid is 34400
ip address on interface 'ib0' is 10.10.1.3
cluster settings:
--- node 0 - ip:10.10.1.3
--- node 1 - ip:10.10.1.2
Connecting to KernFS instance 1 [ip: 10.10.1.2]
[RDMA-Client] Creating connection (pid:0, app_type:0, status:pending) to 10.10.1.2:12345 on sockfd 0
[RDMA-Client] Creating connection (pid:0, app_type:1, status:pending) to 10.10.1.2:12345 on sockfd 1
[RDMA-Client] Creating connection (pid:0, app_type:2, status:pending) to 10.10.1.2:12345 on sockfd 2
[RDMA-Server] Listening on port 12345 for connections. interrupt (^C) to exit.
creating background thread to poll completions (blocking)
test register memory
registering mr #0 with addr:140431182528512 and size:4299161600
registeration failed with errno: Cannot allocate memory
ibv_reg_mr failed [error code: 12]
I keep reducing the g_log_size to 4096, then it works.
Sorry for jumping in a bit late.
As Simon mentioned, you will likely need to allocate at least 16 MB of log space to get decent performance.
The error you're encountering is raised by the RDMA device driver (so it's not Assise-related). It seems that it's unable to pin larger memory regions. This limit might be imposed by your OS's max locked memory parameter. I'd double-check first that its value is large enough (by running ulimit -l), and increase it as needed.
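For anyone who wants to check this from a script rather than the shell, here is a minimal sketch in Python that reads the same limit (RLIMIT_MEMLOCK) and compares it against the size of the failing region from the log. The helper names memlock_allows and describe are made up for illustration and are not part of Assise. Note that getrlimit reports bytes, while ulimit -l in bash reports KB. This only compares a single region against the soft limit, on the assumption that the locked-memory limit is what the driver is hitting; privileged processes may bypass the limit, and other pinned regions count toward it too.

```python
import resource


def memlock_allows(size_bytes, soft_limit=None):
    """Return True if the soft RLIMIT_MEMLOCK permits locking size_bytes.

    soft_limit is in bytes; when omitted, the current process limit is read.
    """
    if soft_limit is None:
        soft_limit, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    # RLIM_INFINITY must be checked first: it is not a usable byte count.
    return soft_limit == resource.RLIM_INFINITY or size_bytes <= soft_limit


def describe(limit):
    """Format a limit in bytes, or 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else f"{limit} bytes"


if __name__ == "__main__":
    mr_size = 4299161600  # size of mr #0 taken from the log in this thread
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    print(f"soft limit: {describe(soft)}, hard limit: {describe(hard)}")
    if memlock_allows(mr_size, soft):
        print("soft limit is large enough for one region of this size")
    else:
        print("soft limit is too small; raise it before starting Assise")
```

If the soft limit turns out to be too small, raising it in the shell that launches the process (for example ulimit -l unlimited, if the hard limit permits) or in /etc/security/limits.conf is the usual route.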
I'm using two nodes to run Assise and use 60G DRAM to emulate NVM.
I got the error ibv_reg_mr failed [error code: 12]. This is because ibv_reg_mr can't work with a large size. I kept reducing g_log_size down to 4096, and then it works. How can I use a larger log size?
Thanks