bytecodealliance / wasm-micro-runtime

WebAssembly Micro Runtime (WAMR)
Apache License 2.0
4.84k stars 618 forks source link

Improve mmap retry logic. #3714

Closed sjamesr closed 1 month ago

sjamesr commented 1 month ago
wenyongh commented 1 month ago
  • Only retry on EAGAIN or EINTR.
  • On EINTR, don't count it against the retry budget. EINTR can happen in bursts.
  • Log the errno on failure, and don't conditionalize that logging on BH_ENABLE_TRACE_MMAP. In other parts of the code, error logging is not conditional on that define, while turning on that tracing define makes things overly verbose.

@sjamesr Sorry that I didn't read the code carefully at the first time, now I read it again and am a little confused: (1) will the code if (errno == EINTR) continue cause function os_mmap run into dead loop? For example, the system keeps in bursts for a long time? (2) now it only retries on EAGAIN or EINTR, will it lose some opportunities to make mmap success? For example, the first time mmap returns error like EINVAL/ENOMEM, but the second time mmap returns success?

yamt commented 1 month ago

Only retry on EAGAIN or EINTR.

On EINTR, don't count it against the retry budget. EINTR can happen in bursts.

can this mmap fail with EAGAIN or EINTR? on which platform?

are you trying to fix a real problem? or something theoretical?

yamt commented 1 month ago

honestly speaking, i don't understand why we bother to retry on mmap failure here at all. @wenyongh do you remember the reason?

wenyongh commented 1 month ago

honestly speaking, i don't understand why we bother to retry on mmap failure here at all. @wenyongh do you remember the reason?

In my memory, we retry 5 times here just to make mmap have more opportunity to succeed. But it may be slow if we retry unconditionally or without checking the errno, so this PR tries to improve it, it mmaps again only when the errno is EINTR or EAGAIN.

yamt commented 1 month ago

honestly speaking, i don't understand why we bother to retry on mmap failure here at all. @wenyongh do you remember the reason?

In my memory, we retry 5 times here just to make mmap have more opportunity to succeed.

did it actually improve the chance of success? how?

But it may be slow if we retry unconditionally or without checking the errno,

when mmap fails, i expect it fails immediately. i don't see any needs to "improve" the retry logic.

so this PR tries to improve it, it mmaps again only when the errno is EINTR or EAGAIN.

does mmap return EINTR/EAGAIN? on which platforms?

wenyongh commented 1 month ago

honestly speaking, i don't understand why we bother to retry on mmap failure here at all. @wenyongh do you remember the reason?

In my memory, we retry 5 times here just to make mmap have more opportunity to succeed.

did it actually improve the chance of success? how?

yes, the memory may be occupied by other processes or threads temporarily and freed when retrying, e.g., when MAP_FAILED is returned and the errno is:

EAGAIN The file has been locked, or too much memory has been locked (see setrlimit(2)).
ENOMEM No memory is available.
ENOMEM The process's maximum number of mappings would have been exceeded.

@sjamesr also pointed that the errno can be EINTR which happens in bursts.

But it may be slow if we retry unconditionally or without checking the errno,

when mmap fails, i expect it fails immediately. i don't see any needs to "improve" the retry logic.

Here the PR tries to reduce the retry count so as to return failure more quickly.

so this PR tries to improve it, it mmaps again only when the errno is EINTR or EAGAIN.

does mmap return EINTR/EAGAIN? on which platforms?

I can find errno may be EAGAIN in the linux mmap manual: https://man7.org/linux/man-pages/man2/mmap.2.html EINTR is mentioned by @sjamesr. @sjamesr do you know which platform may set errno to EINTR?

yamt commented 1 month ago

honestly speaking, i don't understand why we bother to retry on mmap failure here at all. @wenyongh do you remember the reason?

In my memory, we retry 5 times here just to make mmap have more opportunity to succeed.

did it actually improve the chance of success? how?

yes, the memory may be occupied by other processes or threads temporarily and freed when retrying, e.g., when MAP_FAILED is returned and the errno is:

EAGAIN The file has been locked, or too much memory has been locked (see setrlimit(2)).
ENOMEM No memory is available.
ENOMEM The process's maximum number of mappings would have been exceeded.

well, other threads free the resource while we are making several system calls? it's certainly possible, but quite unlikely i guess.

also, with that logic, we should retry many of other operations, including open(), malloc(), etc, shouldn't we? why mmap is special?

also, if we want to give other threads to free resources, i guess it makes more sense to sleep a bit before making a retry.

also, with that logic, isn't this PR a regression because we won't retry on ENOMEM anymore?

But it may be slow if we retry unconditionally or without checking the errno,

when mmap fails, i expect it fails immediately. i don't see any needs to "improve" the retry logic.

Here the PR tries to reduce the retry count so as to return failure more quickly.

is it worth to save a few system calls here?

so this PR tries to improve it, it mmaps again only when the errno is EINTR or EAGAIN.

does mmap return EINTR/EAGAIN? on which platforms?

I can find errno may be EAGAIN in the linux mmap manual: https://man7.org/linux/man-pages/man2/mmap.2.html EINTR is mentioned by @sjamesr. @sjamesr do you know which platform may set errno to EINTR?

EAGAIN is about locked memory, right? i can't imagine situations where wamr is used with mlockall. is there such a use case?

wenyongh commented 1 month ago

honestly speaking, i don't understand why we bother to retry on mmap failure here at all. @wenyongh do you remember the reason?

In my memory, we retry 5 times here just to make mmap have more opportunity to succeed.

did it actually improve the chance of success? how?

yes, the memory may be occupied by other processes or threads temporarily and freed when retrying, e.g., when MAP_FAILED is returned and the errno is:

EAGAIN The file has been locked, or too much memory has been locked (see setrlimit(2)).
ENOMEM No memory is available.
ENOMEM The process's maximum number of mappings would have been exceeded.

well, other threads free the resource while we are making several system calls? it's certainly possible, but quite unlikely i guess.

also, with that logic, we should retry many of other operations, including open(), malloc(), etc, shouldn't we? why mmap is special?

As you know, there may be one or some extra requirements/flags for mmap, e.g., MAP_32BIT, PROT_EXEC, MADV_HUGEPAGE and mmap with 8GB size for linear memory when hw bound check is enabled, per my understanding, it is easier to return failure than other operations, so we let it retry several times. BTW, for memory64, the request size to mmap may be much more larger.

also, if we want to give other threads to free resources, i guess it makes more sense to sleep a bit before making a retry.

Yes, sleeping a bit before retry seems better.

also, with that logic, isn't this PR a regression because we won't retry on ENOMEM anymore?

Yes, I also asked the issue https://github.com/bytecodealliance/wasm-micro-runtime/pull/3714#issuecomment-2292597062

when mmap fails, i expect it fails immediately. i don't see any needs to "improve" the retry logic.

Here the PR tries to reduce the retry count so as to return failure more quickly.

is it worth to save a few system calls here?

I am not sure, I guess it can save some time.

I can find errno may be EAGAIN in the linux mmap manual: https://man7.org/linux/man-pages/man2/mmap.2.html EINTR is mentioned by @sjamesr. @sjamesr do you know which platform may set errno to EINTR?

EAGAIN is about locked memory, right? i can't imagine situations where wamr is used with mlockall. is there such a use case?

I am not familiar with the mlock, I guess wamr's mmap isn't used in that situation, but not sure whether there are other processes using it. Here is the answer of "when will mmap return EAGAIN" getting from ChatGPT:

The mmap system call might return the EAGAIN error code under certain conditions, typically indicating that the operation could not be completed immediately and should be retried later. Here are some scenarios where mmap might return EAGAIN:

Resource Shortage: If the system doesn't have enough memory or address space available at the moment to fulfill the mmap request, it may return EAGAIN. This can occur in systems with high memory usage or when the address space is highly fragmented.

File Locking: If you try to map a file that is locked by another process in a way that conflicts with the mapping operation (e.g., the file is locked exclusively), mmap might return EAGAIN.

I/O Operations: On some systems, if mmap requires an I/O operation that cannot be completed immediately and would block the process, it might return EAGAIN. This is more common when the underlying device or file system is under heavy load.

Process or System Limits: If the process or the system as a whole has reached certain limits, such as the maximum number of memory mappings or open file descriptors, mmap might fail with EAGAIN.

In general, if you encounter EAGAIN, you may need to retry the mmap operation after a short delay or under different conditions, depending on the exact cause of the error.
yamt commented 1 month ago

As you know, there may be one or some extra requirements/flags for mmap, e.g., MAP_32BIT, PROT_EXEC, MADV_HUGEPAGE and mmap with 8GB size for linear memory when hw bound check is enabled, per my understanding, it is easier to return failure than other operations, so we let it retry several times. BTW, for memory64, the request size to mmap may be much more larger.

yes, mmap can fail. but my impression is that it's almost meaningless to repeat the same mmap request again and again.

MADV_HUGEPAGE might have a system-level limit. i dunno if it ends up with EAGAIN or some other errors. anyway, if mmap with MADV_HUGEPAGE fails, it makes more sense to fall back to mmap w/o MADV_HUGEPAGE i guess.

I am not familiar with the mlock, I guess wamr's mmap isn't used in that situation, but not sure whether there are other processes using it.

typically there are system-level limit and process-level limit for locked memory. i don't think they matter as far as wamr's process doesn't use mlockall though.

Here is the answer of "when will mmap return EAGAIN" getting from ChatGPT:

sorry, i don't trust chatgpt in general.

my points are:

wenyongh commented 1 month ago

Yes, thanks for the feedback, I am not so sure whether the retrying will help a lot but I think we had better keep it, since it doesn't hurt much but it would be good if it helps, and as you see, this PR even keeps retrying when the errno is EINTR. And maybe we can apply some enhancements for it.

sjamesr commented 1 month ago

@yamt The motivation for this change was to work around a very rare test failure we (Google) were observing with wamr on Google's production Linux. I'm not the original author of the change, and I've been unable to reproduce the test failure without this patch. The original author said EINTR can "occur in bursts", I'm still trying to track down in our version of Linux whether/how mmap can return EINTR.

yamt commented 1 month ago

@yamt The motivation for this change was to work around a very rare test failure we (Google) were observing with wamr on Google's production Linux. I'm not the original author of the change, and I've been unable to reproduce the test failure without this patch. The original author said EINTR can "occur in bursts", I'm still trying to track down in our version of Linux whether/how mmap can return EINTR.

does it involve oom killing? anyway, my impression is that mmap returning EINTR (up to the userspace) is an os bug.