ut-osa / assise

GNU General Public License v2.0

registeration memory failed with errno: Cannot allocate memory #12

Open glock42 opened 3 years ago

glock42 commented 3 years ago

I'm running Assise on two nodes, using 60 GB of DRAM to emulate NVM.

I get the error ibv_reg_mr failed [error code: 12]. It looks like ibv_reg_mr cannot register a memory region this large.

initialize file system
dev-dax engine is initialized: dev_path /dev/dax0.0 size 49152 MB
Reading root inode with inum: 1fetching node's IP address..
Process pid is 34400
ip address on interface 'ib0' is 10.10.1.3
cluster settings:
--- node 0 - ip:10.10.1.3
--- node 1 - ip:10.10.1.2
Connecting to KernFS instance 1 [ip: 10.10.1.2]
[RDMA-Client] Creating connection (pid:0, app_type:0, status:pending) to 10.10.1.2:12345 on sockfd 0
[RDMA-Client] Creating connection (pid:0, app_type:1, status:pending) to 10.10.1.2:12345 on sockfd 1
[RDMA-Client] Creating connection (pid:0, app_type:2, status:pending) to 10.10.1.2:12345 on sockfd 2
[RDMA-Server] Listening on port 12345 for connections. interrupt (^C) to exit.
creating background thread to poll completions (blocking) test
register memory
registering mr #0 with addr:140431182528512 and size:4299161600
registeration failed with errno: Cannot allocate memory
ibv_reg_mr failed [error code: 12]

I kept reducing g_log_size until I reached 4096, at which point registration succeeds. How can I use a larger log size?

Thanks

simpeter commented 3 years ago

Looks like you might be out of memory (the failing mr #0 is about 4GB in your output). 4KB is too small for good performance; you need at least 16MB.
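As a quick check of the size quoted above, the byte count that the log reports for mr #0 can be converted to binary units. This is only a small arithmetic sketch; the helper name to_gib is made up for illustration and is not part of Assise.

```python
# Size of the failing memory region, copied from the log line
# "registering mr #0 with addr:140431182528512 and size:4299161600".
MR0_SIZE_BYTES = 4299161600

GIB = 1024 ** 3
MIB = 1024 ** 2


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB (hypothetical helper, for illustration)."""
    return num_bytes / GIB


if __name__ == "__main__":
    print(f"mr #0: {to_gib(MR0_SIZE_BYTES):.3f} GiB")
    # How far the region exceeds exactly 4 GiB, expressed in MiB.
    excess_mib = (MR0_SIZE_BYTES - 4 * GIB) / MIB
    print(f"excess over 4 GiB: {excess_mib:.2f} MiB")
```

The region works out to 4 GiB plus 4 MiB, so the registration asks the driver to pin a little over 4 GiB in a single call.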

Waleed, can you confirm?

On Sun, Oct 17, 2021 at 12:24 AM Jian @.***> wrote:


wreda commented 3 years ago

Sorry for jumping in a bit late.

As Simon mentioned, you will likely need to allocate at least 16 MB of log space to get decent performance.

The error you're encountering is raised by the RDMA device driver (so it's not Assise-related). It seems that it's unable to pin larger memory regions. This limit might be imposed by your OS's max locked memory parameter. I'd first double-check that its value is large enough (by running ulimit -l), and increase it as needed.
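The check suggested above can also be done programmatically. The sketch below reads the locked-memory limit of the current process with Python's standard resource module and compares it against the size of the failing region from the log. It assumes a Linux host; the helper names fits_under_limit and describe are made up for illustration. Note that ulimit -l reports KiB while getrlimit returns bytes, and that the limit applies to all locked memory of the process, so passing this check is necessary but not sufficient.

```python
import resource

# Size of the failing region, copied from the log in this thread.
MR0_SIZE_BYTES = 4299161600


def fits_under_limit(region_bytes: int, limit_bytes: int) -> bool:
    """Return True if a region of region_bytes fits under limit_bytes.

    resource.RLIM_INFINITY is treated as "no limit".
    """
    if limit_bytes == resource.RLIM_INFINITY:
        return True
    return region_bytes <= limit_bytes


def describe(limit_bytes: int) -> str:
    """Format a limit the way `ulimit -l` does: KiB, or "unlimited"."""
    if limit_bytes == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{limit_bytes // 1024} KiB"


if __name__ == "__main__":
    # Soft and hard limits on locked (pinned) memory, in bytes.
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    print(f"soft memlock limit: {describe(soft)}")
    print(f"hard memlock limit: {describe(hard)}")
    if fits_under_limit(MR0_SIZE_BYTES, soft):
        print("the soft limit alone does not rule out a region this size")
    else:
        print("the soft limit is smaller than mr #0; raise it before retrying")
```

If the soft limit turns out to be too small, running ulimit -l unlimited in the shell that launches the process is one way to raise it, provided the hard limit allows it. A persistent change typically goes through the memlock entries in /etc/security/limits.conf, though the exact mechanism depends on the distribution.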