intel / openlldp

Other
54 stars 42 forks source link

Help with receive buffer overflow? #67

Closed gonzoleeman closed 3 years ago

gonzoleeman commented 3 years ago

I have a user that is having issues with receive buffer overflow. They have a lot of sockets, and when they get a flood of activity on them, they get messages like:

Apr 21 06:00:00 ncn-w001 lldpad[2489]: recvfrom(Event interface): No buffer space available
Apr 21 06:06:10 ncn-w001 lldpad[2489]: recvfrom(Event interface): No buffer space available
Apr 21 06:19:26 ncn-w001 lldpad[2489]: recvfrom(Event interface): No buffer space available
Apr 21 06:19:36 ncn-w001 lldpad[2489]: recvfrom(Event interface): No buffer space available
Apr 21 06:19:53 ncn-w001 lldpad[2489]: recvfrom(Event interface): No buffer space available
Apr 21 06:56:03 ncn-w001 lldpad[2489]: recvfrom(Event interface): No buffer space available

Turns out the internal buffer is 4k.

Testing has shown that setting the receive buffer size to 16k seems to fix this issue, i.e.:

sh# echo 16384 > /proc/sys/net/core/rmem_max

Which is strange, because the default amount seems to actually be larger, i.e.:

sh# cat /proc/sys/net/core/rmem_max    
212992

I'm at a loss to understand why a smaller receive buffer size helps this. Perhaps it keeps the system from flooding the daemon? Also, we did not modify "rmem_default".

Is this a common problem? In other words, is it something that the daemon can run into, in general, with a large number of sockets, or is it something unique (or misconfigured) in this person's setup?

I can submit a pull request that adjusts this limit in lldpad:main(), but not sure that's the right answer.

Thanks.

apconole commented 3 years ago

See issue #45 - the customer I was working with became unresponsive and seems wasn't willing to test my proposed config patch.

If you think it makes sense, maybe I can officially propose it?

gonzoleeman commented 3 years ago

Okay, I will see if my customer can try another fix. That may be asking a lot, since the problem is "fixed" in their eyes.

BTW, looking at your code, is "nlsock_bufsz" a new tuning parameter you would be adding for openlldp?

apconole commented 3 years ago

Yes, defaulted to '0'. The initial value comes from hostapd, which is what openlldp used as a starting point for code. It makes a lot of sense on a SoC or low memory environment, to keep pressure on the kernel buffer small. In larger enterprise deployments, it doesn't make as much sense. I left it as a knob, but I'm not even convinced we should have it. After all, it can be adjusted system-wide (which is where I would expect such adjustments to take place anyway). I can be convinced that it should just be dropped completely, and we should nuke the bufsz assignments altogether.

gonzoleeman commented 3 years ago

I think for now I'll ask our end user to try your patch/commit (referenced above). Perhaps just removing the knob would be a good future change?

apconole commented 3 years ago

Any updates?

gonzoleeman commented 3 years ago

After discussing this with Hannes (who has worked on openlldp extensively), I believe there is no reason to modify the receive buffer size, unless it is too small. That seems unlikely possible.

What about the following commit, from my cloned repo, branch fix-rcvbuf-issue? Please comment ...

From 9846543830aafd1a966e26f16b73a5f0e2126419 Mon Sep 17 00:00:00 2001
From: Lee Duncan <lduncan@suse.com>
Date: Tue, 8 Dec 2020 15:54:25 -0800
Subject: [PATCH] event_iface: only set rcv buf size if too small

Instead of always setting the receive buffer size
to a small 8K, which causes problems when a flood of
netlink messages are received, set it to a
"minimal" value only if it is currently less
than that value.

The value used is 32 x the MAX_PAYLOAD size
of 4k, i.e. 128k.

A config value to modify this can be added if needed,
but shouldn't be necessary.
---
 event_iface.c       | 16 ++++++++++++++--
 include/qbg_vdpnl.h |  1 +
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/event_iface.c b/event_iface.c
index 1be29637bb68..b9e6002be061 100644
--- a/event_iface.c
+++ b/event_iface.c
@@ -418,7 +418,8 @@ event_iface_receive(int sock, UNUSED void *eloop_ctx, UNUSED void *sock_ctx)
 int event_iface_init()
 {
        int fd;
-       int rcv_size = MAX_PAYLOAD;
+       int rcv_size = 0;
+       socklen_t rcv_len = sizeof(int);
        struct sockaddr_nl snl;

        fd = socket(PF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
@@ -426,10 +427,21 @@ int event_iface_init()
        if (fd < 0)
                return fd;

-       if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcv_size, sizeof(int)) < 0) {
+       /* is receive buffer size too small? */
+       if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcv_size, &rcv_len) < 0) {
                close(fd);
+               LLDPAD_DBG("%s(%i): getsockopt=%d.\n", __func__, __LINE__, errno);
                return -EIO;
        }
+       if (rcv_size < MIN_RCVBUF_SIZE) {
+               rcv_size = MIN_RCVBUF_SIZE >> 1;        /* we get back 2x what we set */
+               if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcv_size, rcv_len) < 0) {
+                       LLDPAD_DBG("%s(%i): setsockopt=%d.\n", __func__, __LINE__, errno);
+                       close(fd);
+                       return -EIO;
+               }
+               LLDPAD_DBG("%s(%i): set rcvbuf=%d.\n", __func__, __LINE__, rcv_size);
+       }

        memset((void *)&snl, 0, sizeof(struct sockaddr_nl));
        snl.nl_family = AF_NETLINK;
diff --git a/include/qbg_vdpnl.h b/include/qbg_vdpnl.h
index cb7efcaa06c5..a9369d9f04a4 100644
--- a/include/qbg_vdpnl.h
+++ b/include/qbg_vdpnl.h
@@ -33,6 +33,7 @@
 #include <linux/if_ether.h>

 #define        MAX_PAYLOAD     4096    /* Maximum Payload Size */
+#define        MIN_RCVBUF_SIZE (MAX_PAYLOAD << 5)      /* SO_RCVBUF min */

 enum {
        vdpnl_nlf1 = 1,         /* Netlink message format 1 (draft 0.2) */
-- 
2.29.2
gonzoleeman commented 3 years ago

NOTE that the system max and default receive buffer size seems to be 22k on my systems, and I suspect it is similarly large elsewhere. One question: should it just be a warning if the setsockopt() fails?

gonzoleeman commented 3 years ago

One other note: this has been compile-tested only.

apconole commented 3 years ago

I don't think we need to worry about logging a warning (or even debug message) here. If it fails, we abort with an error message in main(). Maybe we can update the error message or details that get returned to lldpad main() function which calls the register (which would be a good change to help debug when things fail), but that should be a separate commit. Go ahead and propose the (minimally tested) patch in a pull request. Thanks, Lee!

gonzoleeman commented 3 years ago

On Dec 9, 2020, at 9:18 AM, Aaron Conole notifications@github.com wrote:

I don't think we need to worry about logging a warning (or even debug message) here. If it fails, we abort with an error message in main(). Maybe we can update the error message or details that get returned to lldpad main() function which calls the register (which would be a good change to help debug when things fail), but that should be a separate commit. Go ahead and propose the (minimally tested) patch in a pull request. Thanks, Lee!

I will resubmit without the debug/log messages. Can I assume the rest is ok?

— Lee

apconole commented 3 years ago

Yes. Please submit.

gonzoleeman commented 3 years ago

See pull request #70

apconole commented 3 years ago

Pull request is merged to branch-1.1 and master. Thanks!