dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.62k stars 4.56k forks source link

VirtualReserve is allocating 256GB on illumos rather than "reserving" #104211

Open am11 opened 2 weeks ago

am11 commented 2 weeks ago

Split from https://github.com/dotnet/runtime/issues/34944#issuecomment-2198431627

We should improve this implementation to support illumos properly, so it doesn't require setting the upper limit for heap memory alloc.

With linux procfs we get more accurate or up-to-date stats than sysconf. That's why it's only used in fallback. Since on illumos procfs is binary based, it bails out early from ReadMemAvailable() function.

If there are similar concerns with sysconf() on illumos, kstat might be more accurate; something like this:

#elif defined(__sun)
    // Use kstat to get available memory on Illumos
    kstat_ctl_t *kc;
    kstat_t *ksp;
    kstat_named_t *knp;
    uint64_t free_memory = 0;

    if ((kc = kstat_open()) != NULL) {
        if ((ksp = kstat_lookup(kc, "unix", 0, "system_pages")) != NULL) {
            if (kstat_read(kc, ksp, NULL) != -1) {
                if ((knp = (kstat_named_t *)kstat_data_lookup(ksp, "pagesfree")) != NULL) {
                    free_memory = (uint64_t)knp->value.ui64 * sysconf(_SC_PAGESIZE);
                }
            }
        }
        kstat_close(kc);
    }

    if (free_memory == 0)  available = sysconf(SYSCONF_PAGES) * sysconf(_SC_PAGE_SIZE);

    available = free_memory;
gwr commented 2 weeks ago

Actually sysconf(_SC_AVPHYS_PAGES) gets us exactly the value we want here, and is a much simpler interface (and thread-safe etc).

gwr commented 2 weeks ago

It looks like we might be better off using sysconf on Linux as well. Here's a handy little test program:

#include <stdio.h>
#include <unistd.h> // sysconf

int
main(int argc, char **argv)
{
    long mem, pgs;

    pgs = sysconf(_SC_AVPHYS_PAGES);
    mem = pgs * (getpagesize() / 1024);
    printf("Available pages: %lu\n", pgs);
    printf("Available KB: %lu\n", mem);
    return (0);
}

And comparing results:

gwr@ubuntu18:/g/ws/dotnet$ head -3 /proc/meminfo
MemTotal:        8116820 kB
MemFree:         2923380 kB
MemAvailable:    7320288 kB
gwr@ubuntu18:/g/ws/dotnet$ ./getavail.linux 
Available pages: 730782
Available KB: 2923128

so it appears that on Linux, that sysconf gets you MemFree. I think that's generally what we want here anyway.

I ran this on a VM with 8BG physical ram. That 7+ GB of RAM reported by MemAvailable is questionable, but maybe that's the right thing to use on Linux. I'm not sure.

am11 commented 2 weeks ago

Not sure I understood why it is requiring us to set DOTNET_GCHeapHardLimit=<something smaller than 256 GB> on illumos if sysconf is returning the correct value. Isn't that exactly what it ends up using on illumos change (procfs path returns false on illumos). Of course "tidying up the implementation" vs. "fixing the heap mem issue for good" are two different aspects of it, latter of which is more interesting for the moment. ;)

gwr commented 2 weeks ago

Here is similar data for an 8GB VM running illumos:

gwr@oi-work:/g/ws/dotnet$ kstat -m unix -n system_pages
module: unix                            instance: 0     
name:   system_pages                    class:    pages
        availrmem                       1114195
        crtime                          0
        desfree                         16366
        desscan                         25
        econtig                         4225150976
        fastscan                        1047486
        freemem                         1178999
        kernelbase                      0
        lotsfree                        32733
        low_mem_scan                    0
        minfree                         12274
        n_throttle                      0
        nalloc                          33109279
        nalloc_calls                    17586
        nfree                           32778862
        nfree_calls                     9355
        nscan                           0
        pagesfree                       1178999
        pageslocked                     980778
        pagestotal                      2094973
        physmem                         2094974
        pp_kernel                       775618
        slowscan                        100
        snaptime                        49307.761938326

And here's what that sysconf says:

gwr@oi-work:/g/ws/dotnet$ ./getavail.illumos 
Available pages: 1179147
Available KB: 4716588

So a little over 4GB free memory.

gwr commented 2 weeks ago

OK, when illumos actually uses sysconf(_SC_AVPHYS_PAGES) in GetAvailablePhysicalMemory() then the heap size stuff works correctly.

Have a look at the latest in this comparison. This is for debugging. We could probably clean this up and integrate something like it. https://github.com/dotnet/runtime/compare/main...gwr:dotnet-runtime:illumos2

Rather than bother with config variables for HAVE__SC_AVPHYS_PAGES and HAVE__SC_PHYS_PAGES we could use defined(...) as shown here.

I'm not sure what went wrong with the config variables, but apparently something did. The cross build stuff makes it hard to know which config header is used where.

am11 commented 2 weeks ago

OK, when illumos actually uses sysconf(_SC_AVPHYS_PAGES) in GetAvailablePhysicalMemory() then the heap size stuff works correctly.

Is it not using sysconf without the change (i.e. ReadMemAvailable() returns false on illumos and we hit the fallback)?

IOW, is this change saving us from setting DOTNET_GCHeapHardLimit or is the change only saving one call to ReadMemAvailable() that always returns false on the platform?

gwr commented 2 weeks ago

Looks like my changes were unnecessary. I must have been confused somehow. Maybe running a different build than I thought? Anyway the GetAvailablePhysicalMemory code seems to be OK as it is. Here's what happens with the current code:

gdb dotnet
GNU gdb (GDB) 14.2
[...]
Reading symbols from dotnet...
(gdb) break GetAvailablePhysicalMemory 
Function "GetAvailablePhysicalMemory" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (GetAvailablePhysicalMemory) pending.
(gdb) run helloworld/bin/Debug/net9.0/helloworld.dll
Starting program: /tank/ws/dnt/dotnet helloworld/bin/Debug/net9.0/helloworld.dll

Thread 2 hit Breakpoint 1, GetAvailablePhysicalMemory ()
    at /g/ws/dotnet/runtime/src/coreclr/gc/unix/gcenv.unix.cpp:1236
1236    {
(gdb) next
1237        uint64_t available = 0;
(gdb) next
1265        if (tryReadMemInfo)
(gdb) next
1269            tryReadMemInfo = ReadMemAvailable(&available);
(gdb) next
1272        if (!tryReadMemInfo)
(gdb) next
1276            available = sysconf(SYSCONF_PAGES) * sysconf(_SC_PAGE_SIZE);
(gdb) next
1280        return available;
(gdb) print available / 1024
$1 = 4298396
(gdb) cont
Continuing.
[New LWP    6        ]
[New LWP    7        ]
Hello, World!
[Inferior 1 (process 2368    ) exited normally]

That was running with this code: This is with my illumos1 branch: https://github.com/dotnet/runtime/compare/main...gwr:dotnet-runtime:illumos1 That has no changes in coreclr/gc/unix/gcenv.unix.cpp (just other chanes I needed to unblock my work)

I guess we can close this issue. Thanks.

am11 commented 2 weeks ago

Thank you for confirming. Glad to know that we don't need DOTNET_GCHeapHardLimit=1C00000 or via runtimeconfig.json file anymore. :) (I'm currently on arm64 machine, and emulating an x64 VM is quite slow but I'll try once I'm on amd64 machine)

This is with my illumos1 branch: https://github.com/dotnet/runtime/compare/main...gwr:dotnet-runtime:illumos1

It would be nice if we could call mlock under vfork() as you suggested (https://github.com/am11/runtime/commit/f1320973588a601c65b1686b420f90874f273b41?w=1) or other types of fork/clone to see if it emulates IPI and saves us from requiring mlock privileges. That would fix couple of issues with linux and freebsd jail as well (e.g. https://github.com/dotnet/runtime/issues/44417). Dealing with mlock() has been bit of a challenge across the board so it will be a general goodness IMHO.

gwr commented 2 weeks ago

It turns out GetAvailablePhysicalMemory() was fine, but the mmap call needed MAP_NORESERVE. Should we reopen this? Get a new issue?

am11 commented 2 weeks ago

Could you please open a pull request with that change?

gwr commented 2 weeks ago

The PR is up. 104275 As I mentioned there, I might suggest changing the issue title. (I don't appear to be able to do that.) I'd prefer: Need MAP_NORESERVE in VirtualReserve for illumos or something like that. Or maybe: VirtualReserve of 256GB fails on illumos