ivmai / bdwgc

The Boehm-Demers-Weiser conservative C/C++ Garbage Collector (bdwgc, also known as bdw-gc, boehm-gc, libgc)
https://www.hboehm.info/gc/

Out of memory error when running on in-ram file system / GCF #225

Closed sam0x17 closed 6 years ago

sam0x17 commented 6 years ago

I'm running into the fabled error:

GC Warning: Out of memory - trying to allocate less
Insufficient memory for GC_all_nils

when running Crystal binaries from within a Google Cloud Function (GCF), even when there is plenty of free RAM (generally about 1.7 GB). GCF uses a 2 GB in-RAM file system, meaning that if I create a 20 MB file, I am really using up 20 MB of RAM, and so on. Running crystal --version should not require 2 GB of RAM, so something fishy is going on. I've checked the output of free -h right before running crystal --version, and there is indeed about 1.8 GB of free RAM.

For some reason no matter what I do, boehm-gc is throwing this error via crystal. I am wondering if maybe this is a bug? If you give me a portable linux binary I can run it in GCF to try to debug this more.

If you are unfamiliar with crystal, they bind with boehm-gc here: https://github.com/crystal-lang/crystal/blob/master/src/gc/boehm.cr

Here is the same issue posted on the crystal repo: https://github.com/crystal-lang/crystal/issues/6188

ivmai commented 6 years ago

This means the GC cannot obtain memory from the OS (depending on the configuration, the GC uses mmap or sbrk to get more memory).

sam0x17 commented 6 years ago

@ivmai the folks over in the crystal lang issue thought you guys might be able to make something out from this strace of the crash: https://pastebin.com/hqyuZgS8

ivmai commented 6 years ago

Thanks for reporting. Based on strace, I see 2 things that together lead to the error.

brk(NULL)                               = 0x55afd8456000
open("/dev/zero", O_RDONLY)             = 3

This means brk has failed to move the program data boundary for some reason. That might be OK (I don't know who calls mremap; there is probably some relation between the mremap EFAULT and the brk failure). To handle brk failures, BDWGC switches to using mmap after the first failure.

mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0) = -1 ENODEV (No such device)

Here 3 is the file descriptor of /dev/zero. I think this means GCF does not allow mapping /dev/zero. The anonymous way of getting memory works:

mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2a9963c9a000
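
To check this on a given system without involving the GC, the two mmap calls from the strace can be reproduced in isolation. The following is only a minimal sketch: probe_dev_zero and probe_anonymous are hypothetical helper names (not part of BDWGC), and each returns 0 on success or the errno value of the failing call.

```c
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map len bytes backed by /dev/zero, like the failing
   call in the strace. Returns 0 on success, otherwise the errno value. */
static int probe_dev_zero(size_t len)
{
    int fd, err = 0;
    void *p;

    assert(len > 0);
    fd = open("/dev/zero", O_RDONLY);
    if (fd == -1)
        return errno;
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED)
        err = errno;            /* ENODEV was reported on GCF */
    else
        munmap(p, len);
    close(fd);
    return err;
}

/* Hypothetical helper: map len bytes anonymously, like the call that
   succeeds in the strace. Returns 0 on success, otherwise the errno value. */
static int probe_anonymous(size_t len)
{
    void *p;

    assert(len > 0);
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return errno;
    munmap(p, len);
    return 0;
}
```

Calling both helpers with a length of 65536 from a small main and printing strerror of each result would show whether only the /dev/zero variant is rejected in a given environment.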

So, to work around this issue on GCF, the BDWGC library should be compiled with "-D USE_MMAP_ANON" passed in CFLAGS. Could you please try this? (I know that you don't need it anymore, but it would be nice to check at least.)

Alternatively, I'm thinking about a run-time workaround: switch to anonymous mmap if the first mmap(/dev/zero) fails with ENODEV.
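
A rough illustration of that run-time idea, not the code that ended up in BDWGC: sketch_get_memory and its two static variables are hypothetical names, and treating a failed open of /dev/zero the same way as ENODEV is an extra assumption of this sketch.

```c
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

static int sketch_zero_fd = -1;  /* cached descriptor of /dev/zero */
static int sketch_use_anon = 0;  /* set once /dev/zero proved unusable */

/* Hypothetical allocator: returns len bytes of zeroed memory or NULL.
   Tries /dev/zero first and switches permanently to anonymous mmap
   when that mapping fails with ENODEV. */
static void *sketch_get_memory(size_t len)
{
    void *p;

    assert(len > 0);
    if (!sketch_use_anon) {
        if (sketch_zero_fd == -1)
            sketch_zero_fd = open("/dev/zero", O_RDONLY);
        if (sketch_zero_fd != -1) {
            p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE,
                     sketch_zero_fd, 0);
            if (p != MAP_FAILED)
                return p;
            if (errno != ENODEV)
                return NULL;     /* a genuine failure, e.g. ENOMEM */
        }
        /* Assumption: an unopenable /dev/zero is handled like ENODEV. */
        sketch_use_anon = 1;
    }
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

The sticky flag means the failing /dev/zero path is attempted at most once, so later allocations go straight to the anonymous mapping.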

ivmai commented 6 years ago

> Alternatively I'm thinking on the run-time workaround - to switch to anonymous mmap if the first mmap(/dev/zero) failed with ENODEV.

Run-time detection is not really needed; it is OK to default to MAP_ANONYMOUS/MAP_ANON at compile time when it is available.
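
As a sketch of what such a compile-time default could look like (this is not the actual BDWGC source; SKETCH_MAP_ANON_FLAG and sketch_map_pages are hypothetical names): the anonymous flag is used whenever the system headers define it, and the /dev/zero path remains only as the fallback for platforms that lack it.

```c
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Pick whichever spelling of the anonymous-mapping flag the headers offer. */
#if defined(MAP_ANONYMOUS)
# define SKETCH_MAP_ANON_FLAG MAP_ANONYMOUS
#elif defined(MAP_ANON)
# define SKETCH_MAP_ANON_FLAG MAP_ANON
#endif

/* Hypothetical allocator: returns len bytes of zeroed memory or NULL. */
static void *sketch_map_pages(size_t len)
{
    void *p;

    assert(len > 0);
#ifdef SKETCH_MAP_ANON_FLAG
    /* Preferred path: no dependency on /dev/zero being mappable. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | SKETCH_MAP_ANON_FLAG, -1, 0);
#else
    {
        /* Fallback for platforms without an anonymous-mapping flag. */
        int fd = open("/dev/zero", O_RDONLY);
        if (fd == -1)
            return NULL;
        p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
        close(fd);               /* the mapping stays valid after close */
    }
#endif
    return p == MAP_FAILED ? NULL : p;
}
```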

> to workaround this issue on GCF the BDWGC library should be compiled with "-D USE_MMAP_ANON" passed to CFLAGS

Hmm. On Linux, USE_MMAP_ANON is defined by default (if mmap is used). Which BDWGC version and binary do you use? It's compiled for Linux/x64, right?

sam0x17 commented 6 years ago

I really don't know much about how crystal integrates with this project, but if you ask in the crystal issue some people familiar with this will be able to chime in (also there is a lot more information and an strace in there):

https://github.com/crystal-lang/crystal/issues/6188

ivmai commented 6 years ago

Note: the above commit (32704bf) is applicable only to libgc-8.0.0 or later.

ivmai commented 6 years ago

Commit eed796d should solve the issue in gc-7.6.8

bcardiff commented 6 years ago

@ivmai That commit will be included in the next 7.6 release, right? That would be 7.6.9, since 7.6.8 was released on August 12th.

ivmai commented 6 years ago

This patch would be included in libgc-7.6.10 (planned for November).

sam0x17 commented 6 years ago

excellent news thanks guys!!