versatica / mediasoup

Cutting Edge WebRTC Video Conferencing
https://mediasoup.org
ISC License
6.18k stars 1.12k forks source link

liburing zero copy #1273

Closed jmillan closed 9 months ago

jmillan commented 9 months ago

By pre-registering the send buffers into io_uring we can make use of zero copy in order to avoid the kernel memcpy-ing the buffers.

Send buffers are allocated by us and will be set as available for another use once liburing notifies us about it (once the buffers are in the network cards queue).

In my testing environment this provides a ~=2% CPU improvement (1 Router, 1 A/V Producer, 40 A/V Consumers).

NOTE: This is NOT applied to TCP on purpose as the sending buffer release in TCP depends on the reception of ACK from the client, and we don't want our buffer usage to depend on client's delay.

NOTE: Discussion in liburing about the CI issue.

ibc commented 9 months ago

This is crashing in CI: https://github.com/versatica/mediasoup/actions/runs/7264289536/job/19791498040?pr=1273

jmillan commented 9 months ago

NOTE for me: Current CI errors on ubuntu 22.04:

failure exit: io_uring_register_buffers() failed: Cannot allocate memory

jmillan commented 9 months ago

failure exit: io_uring_register_buffers() failed: Cannot allocate memory

This is related to the memory available for the process. From io_uring_register_buffers() man:

The buffers associated with the iovecs will be locked in memory and charged against the user's RLIMIT_MEMLOCK resource limit

We are allocating (4 * 1024) buffers of 1500B each, meaning ~= 5MB of memory for sending buffers. That is fairly low for a server.

This error is happening in ubuntu-22.04 only, and not in previous versions so I wonder if they've changed the RLIMIT_MEMLOCK values for the image :-/

jmillan commented 9 months ago

I don't foresee a solution for this issue.., any ideas @ibc, @nazar-pc ?

EDIT: I've edited the PR description with a link to liburing discussion about this.

nazar-pc commented 9 months ago

Are you sure this is about the image? You should be able to change those limits if you need to. But either way zero copy is not magic, memory for in-flight messages still has to be allocated somewhere. Whether 2% difference is worth the complexity and trickiness I'm not sure.

jmillan commented 9 months ago

Are you sure this is about the image? You should be able to change those limits if you need to. But either way zero copy is not magic, memory for in-flight messages still has to be allocated somewhere. Whether 2% difference is worth the complexity and trickiness I'm not sure.

We are already allocating such memory for liburing usage as it's all about sending multiple packets in batches. This PR is just telling liburing to keep the buffers untill put in the network card queue instead of coying them, nothing else.

We can just disable ZC if the iovecs cannot be registered, and use the feature otherwise.

ibc commented 9 months ago

This feature should be ready for those errors and react of them. Then we can document that rlimits must be managed and so on for proper usage. Can we try/carch the "no memory" specific error and, if so, skip io-uring usage and fallback to default behavior?

jmillan commented 9 months ago

This feature should be ready for those errors and react of them. Then we can document that rlimits must be managed and so on for proper usage. Can we try/carch the "no memory" specific error and, if so, skip io-uring usage and fallback to default behavior?

Yes, we can disable zero copy if this error happens upon buffer registering. The rest of io-uring does not depend on rlmits.

ibc commented 9 months ago

NOTE for me: Current CI errors on ubuntu 22.04:

failure exit: io_uring_register_buffers() failed: Cannot allocate memory

So this is wrapped in DepLibUring::LibUring::LibUring() constructor:

    err = io_uring_register_buffers(std::addressof(this->ring), this->iovecs, DepLibUring::QueueDepth);

    if (err < 0)
    {
        MS_THROW_ERROR("io_uring_register_buffers() failed: %s", std::strerror(-err));
    }

This means that this is throwing and the caller side doesn't silence the error, which is good. So we should add proper try/catch (I mean check ret error value) exactly in the code above when calling to io_uring_register_buffers() and, only in case the error is about "Cannot allocate memory", then we should disable the corresponding feature and do NOT throw, right?

So we should have some flag and public getter telling whether zero copy is enabled or not and check that flag in the related code, right?

ibc commented 9 months ago

BTW the "Cannot allocate memory" is not a liburing specific error, It's just a standard C++ error. So we should check ret returned by io_uring_register_buffers() and hopefully there will be a specific ret for the case in which the underlying error is "Cannot allocate memory" and, ONLY if that's the error, disable the zero copy feature and NOT throw. Unfortunately it's not easy since AFAIS io_uring_register_buffers() doesn't return error codes specific to C++ errors:

int io_uring_register_buffers(struct io_uring *ring, const struct iovec *iovecs,
                  unsigned nr_iovecs)
{
    return do_register(ring, IORING_REGISTER_BUFFERS, iovecs, nr_iovecs);
}
static inline int do_register(struct io_uring *ring, unsigned int opcode,
                  const void *arg, unsigned int nr_args)
{
    int fd;

    if (ring->int_flags & INT_FLAG_REG_REG_RING) {
        opcode |= IORING_REGISTER_USE_REGISTERED_RING;
        fd = ring->enter_ring_fd;
    } else {
        fd = ring->ring_fd;
    }

    return __sys_io_uring_register(fd, opcode, arg, nr_args);
}
static inline int __sys_io_uring_register(unsigned int fd, unsigned int opcode,
                      const void *arg, unsigned int nr_args)
{
    int ret;
    ret = syscall(__NR_io_uring_register, fd, opcode, arg, nr_args);
    return (ret < 0) ? -errno : ret;
}

Let's see what the hell is ret = syscall(__NR_io_uring_register, fd, opcode, arg, nr_args);:

#ifndef __NR_io_uring_register
#define __NR_io_uring_register      537
#endif
#elif defined __mips__
#ifndef __NR_io_uring_setup
#define __NR_io_uring_setup     (__NR_Linux + 425)
#endif
#ifndef __NR_io_uring_enter
#define __NR_io_uring_enter     (__NR_Linux + 426)
#endif
#ifndef __NR_io_uring_register
#define __NR_io_uring_register      (__NR_Linux + 427)
#endif
#else /* !__alpha__ and !__mips__ */
#ifndef __NR_io_uring_setup
#define __NR_io_uring_setup     425
#endif
#ifndef __NR_io_uring_enter
#define __NR_io_uring_enter     426
#endif
#ifndef __NR_io_uring_register
#define __NR_io_uring_register      427
#endif

Hehe. So our only chance is that ret returned by that ret = syscall(__NR_io_uring_register, fd, opcode, arg, nr_args); is a specific value meaning "cannot allocate memory". Is it? I hope:

https://en.cppreference.com/w/cpp/header/cerrno

BTW cool mini library here that prints macro name of a given errno: https://github.com/mentalisttraceur/errnoname

ibc commented 9 months ago

Or should we just log an error if io_uring_register_buffers() fails (for whatever reason) and disable the zero copy feature if so and keep working without it?

jmillan commented 9 months ago

So we should have some flag and public getter telling whether zero copy is enabled or not and check that flag in the related code, right?

Correct.

Or should we just log an error if io_uring_register_buffers() fails (for whatever reason) and disable the zero copy feature if so and keep working without it?

Just in case the error is code is ENOMEM. NOTE: I need to check the error is exactly this.

ibc commented 9 months ago

Just in case the error is code is ENOMEM. NOTE: I need to check the error is exactly this.

Probably. Just wondering if some other legit errors could happen in some legit scenarios (VMs, AWS, Docker, etc), no idea but may be EACCES as well.

https://en.cppreference.com/w/cpp/header/cerrno

jmillan commented 9 months ago

Let's see the CI output...

ibc commented 9 months ago

Tests are now passing so your ENOMEM check is working fine. I've updated CHANGELOG and made a cosmetic change.

ibc commented 9 months ago

I've merged v3 in this branch with updates ot NPM deps. So please run npm ci locally.

ibc commented 9 months ago

Wow:

  ../../../../../../worker/src/DepLibUring.cpp: In constructor ‘DepLibUring::LibUring::LibUring()’:
  ../../../../../../worker/src/DepLibUring.cpp:316:30: error: invalid pure specifier (only ‘= 0’ is allowed) before ‘err’
    316 |                 int errno = -err;
        |                              ^~~
  In file included from ../../../../../../worker/subprojects/libuv-v1.47.0/include/uv/errno.h:25,
                   from ../../../../../../worker/subprojects/libuv-v1.47.0/include/uv.h:56,
                   from ../../../../../../worker/include/DepLibUV.hpp:5,
                   from ../../../../../../worker/include/DepLibUring.hpp:4,
                   from ../../../../../../worker/src/DepLibUring.cpp:4:
  ../../../../../../worker/src/DepLibUring.cpp:316:21: error: function ‘int* __errno_location()’ is initialized like a variable
    316 |                 int errno = -err;
        |                     ^~~~~
  In file included from ../../../../../../worker/src/DepLibUring.cpp:5:
  ../../../../../../worker/include/Logger.hpp:216:101: warning: too many arguments for format [-Wformat-extra-args]
    216 |                         const int loggerWritten = std::snprintf(Logger::buffer, Logger::bufferSize, "W" _MS_LOG_STR_DESC desc, _MS_LOG_ARG, ##__VA_ARGS__); \
  ../../../../../../worker/src/DepLibUring.cpp:322:25: note: in expansion of macro ‘MS_WARN_TAG’
    322 |                         MS_WARN_TAG(
        |                         ^~~~~~~~~~~

Also wiht this change:

./../../src/DepLibUring.cpp:316:30: error: invalid pure specifier (only ‘= 0’ is allowed) before numeric constant
  316 |                 int errno = -1 * err;
      |                              ^
In file included from ../../../subprojects/libuv-v1.47.0/include/uv/errno.h:25,
                 from ../../../subprojects/libuv-v1.47.0/include/uv.h:56,
                 from ../../../include/DepLibUV.hpp:5,
                 from ../../../include/DepLibUring.hpp:4,
                 from ../../../src/DepLibUring.cpp:4:
../../../src/DepLibUring.cpp:316:21: error: function ‘int* __errno_location()’ is initialized like a variable
  316 |                 int errno = -1 * err;
      |                     ^~~~~
ninja: build stopped: subcommand failed.