varnishcache / varnish-cache

Varnish Cache source code repository
https://www.varnish-cache.org

docfix: >4x malloc overhead: jemalloc + THP issue? #3511

Closed idl0r closed 3 years ago

idl0r commented 3 years ago

Hi,

we're running 6.4.0-1~buster (Debian 10) from https://packagecloud.io/varnishcache. The system has 128G RAM.

varnishd -f /etc/varnish/example1.vcl -S /etc/varnish/secret -i example1 -n /var/lib/varnish/example1/ -p timeout_idle=300 -a 127.0.0.1:6151 -a 127.0.0.1:6152,PROXY -T 127.0.0.1:5151 -s malloc,32G -p default_ttl=1200 -p default_grace=86400

varnishd, or more precisely the cache worker threads running as the vcache user, are very memory-hungry and were using ~92% of the system memory at the time of writing this issue.

atop screenshot

It also looks like Varnish kills or restarts the threads at some point: https://ibb.co/nCygwX0

Right now, due to a limited-time event, we're caching quite a lot of .mp4 files, some of which are around or above 1G.
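To separate what Varnish itself accounts for from allocator overhead, the malloc stevedore's counters can be compared with the cache child's RSS. A minimal sketch, assuming the default unnamed `-s malloc` storage (counter `SMA.s0`) and the Debian package's `vcache` worker user:

```shell
# Storage bytes Varnish has accounted for in the malloc stevedore
# (SMA.s0 is the default name of the first -s malloc storage).
varnishstat -1 -n /var/lib/varnish/example1/ -f 'SMA.s0.g_bytes' \
    || echo "varnishstat not available here"

# Resident set size of the cache child, for comparison.
pid=$(pgrep -u vcache -o varnishd 2>/dev/null)
[ -n "$pid" ] && grep VmRSS "/proc/$pid/status" || echo "varnishd not running"
```

If `g_bytes` sits near the configured 32G while VmRSS is several times that, the gap is allocator-side overhead rather than cached objects.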

vcl 4.0;

import std; # VMOD std

sub vcl_backend_error {
    set beresp.http.Content-Type = "text/html; charset=utf-8";
    synthetic(regsub(std.fileread("/etc/varnish/error_example1.html"), "bereq\.xid", server.hostname + "-" + bereq.xid));
    return(deliver);
}

acl acl_local {
    "localhost";
    "127.0.0.0/8";
    "192.168.92.0/24";
    "192.168.95.0/24";
}

acl acl_purge {
    "1.2.3.4";
    "11.22.33.44";
    "2.3.4.5";
    "192.168.100.156";
    "192.168.100.157";
    "localhost";
    "127.0.0.1";
}

acl acl_debug {
    "1.2.3.4";
    "11.22.33.44";
    "2.3.4.5";
    "192.168.100.156";
    "192.168.100.157";
    "localhost";
    "127.0.0.1";
}

backend backend_example1_www {
    .host = "127.0.0.1";

    .port = "50040";

    .probe = {
        .request =
            "HEAD /status.html HTTP/1.1"
            "Host: www.example1.com"
            "User-Agent: varnish"
            "Connection: close";
        .interval = 5s;
        .timeout = 3s; # The timeout for the probe
        .window = 8; # default value; How many of the latest polls we examine to determine backend health
        .threshold = 3; # default value; How many of the polls in .window must have succeeded for us to consider the backend healthy
        .expected_response = 200;
    }
}

backend backend_example1_staging_www {
        .host = "127.0.0.1";

        .port = "50240";

        .probe = {
                .request =
                        "HEAD /status.html HTTP/1.1"
                        "Host: stage.example1.com"
                        "User-Agent: varnish"
                        "Connection: close";
                .interval = 5s;
                .timeout = 3s; # The timeout for the probe
                .window = 8; # default value; How many of the latest polls we examine to determine backend health
                .threshold = 3; # default value; How many of the polls in .window must have succeeded for us to consider the backend healthy
                .expected_response = 200;
        }
}

sub vcl_recv {
    std.log("XID: " + server.hostname + "-" + req.xid);

    unset req.http.proxy;

    if (client.ip ~ acl_local && req.http.X-Forwarded-For) {
        set req.http.X-Forwarded-For = regsub(req.http.X-Forwarded-For, ",.*$", "");
        std.log("DEBUG: X-Forwarded-For: internal, using provided X-Forwarded-For=" + req.http.X-Forwarded-For);
    }
    else {
        std.log("DEBUG: X-Forwarded-For: outside or not set: X-Forwarded-For=" + client.ip);
        set req.http.X-Forwarded-For = client.ip;
    }

    if(req.http.X-Cacheable) {
        unset req.http.X-Cacheable;
    }

    if (req.method == "PURGE") {
        if(client.ip ~ acl_purge || (client.ip ~ acl_local && req.http.X-Forwarded-For && std.ip(req.http.X-Forwarded-For, "0.0.0.0") ~ acl_purge) ) {
            return (purge);
        }
        else {
            return (synth(405, "Not allowed"));
        }
    }

    if ( req.http.host ~ "(?i)^www\.example1\.com$" ||
        req.http.host ~ "(?i)^example1\.com$" ||
        req.http.host ~ "(?i)^www\.ex1\.com$" ||
        req.http.host ~ "(?i)^ex1\.com$" ||
        req.http.host ~ "(?i)^exa2\.com$" ||
        req.http.host ~ "(?i)^www\.exa2\.com$" ||
        req.http.host ~ "(?i)^www\.exa3\.com$" ||
        req.http.host ~ "(?i)^foobar\.example1\.com\.somecorp\.com$" ||
        req.http.host ~ "(?i)^foobar2\.somecorp\.com"
    ) {
        set req.backend_hint = backend_example1_www;
    } elseif ( req.http.host ~ "(?i)^stage\.example1\.com$" ||
               req.http.host ~ "(?i)^stage\.exa2\.example1\.com$" ||
               req.http.host ~ "(?i)^foobar\.stage\.example1\.com\.somecorp\.com$"
    ) {

        set req.backend_hint = backend_example1_staging_www;
    } else {
        set req.http.X-Cacheable = "No: Wrong host=" + req.http.host;
        return(synth(503, "Wrong host"));
    }

    if (std.healthy(req.backend_hint)) {
        std.log("DEBUG: vcl_recv(): Backend " + req.backend_hint + " is healthy");
    }
    else {
        std.log("DEBUG: vcl_recv(): Backend " + req.backend_hint + " is *not* healthy");
    }

    if (req.method == "PRI") {
        /* We do not support SPDY or HTTP/2.0 */
        return (synth(405));
    }
    if (req.method != "GET" &&
        req.method != "HEAD" &&
        req.method != "PUT" &&
        req.method != "POST" &&
        req.method != "TRACE" &&
        req.method != "OPTIONS" &&
        req.method != "DELETE") {

        /* Non-RFC2616 or CONNECT which is weird. */
        return (pipe);
    }

    /* We don't support chunked uploads, except when piping. */
    if ((req.method == "POST" || req.method == "PUT") &&
        req.http.transfer-encoding ~ "chunked") {
        return(pipe);
    }

    if (req.method != "GET" && req.method != "HEAD") {
        /* We only deal with GET and HEAD by default */
        return (pass);
    }

    if(req.http.Range ~ "(?i)bytes=[^,]+,\s*.+$") {
        unset req.http.Range;
    }

    if (client.ip ~ acl_debug && req.http.Cookie ~ "X_BYPASS_CACHE=true" ) {
        set req.http.X-Cacheable = "No: Excluded because of debug-cookie";
        return (pass);
    }

    return (hash);
}

sub vcl_hash {

    hash_data(req.url);

    if (req.http.host) {
        hash_data(req.http.host);
    }
    else {
        hash_data(server.ip); # The IP address of the socket on which the client connection was received
    }

    return (lookup);
}

sub vcl_hit {
    std.log("DEBUG: vcl_hit(): obj.ttl=" + obj.ttl + ", obj.grace = " + obj.grace);

    if (obj.ttl + obj.grace > 0s) {
        if (std.healthy(req.backend_hint)) {
            set req.http.X-Grace = "normal(limited)";
        }
        else {
            set req.http.X-Grace = "full";
        }
    }
    else {
        set req.http.X-Grace = "none";
    }

    return (deliver);
}

sub vcl_deliver {
    if(req.http.X-Grace) {
        set resp.http.X-Grace = req.http.X-Grace; # Debug grace
    }

    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    }
    else {
        set resp.http.X-Cache = "MISS";
    }

    std.log("DEBUG: vcl_deliver(): Hits on " + req.url + ": " + obj.hits);

    if(resp.http.Set-Cookie && resp.http.X-Cacheable ~ "Yes") {
        std.log("DEBUG: vcl_deliver(): WARNING: Object is X-Cacheable='" + resp.http.X-Cacheable + "' but Set-Cookie='" + resp.http.Set-Cookie + "' will be delivered anyway!");
    }

    if(client.ip !~ acl_debug && std.ip(req.http.X-Forwarded-For, "0.0.0.0") !~ acl_debug) {
        unset resp.http.X-Cache;
        unset resp.http.X-Cacheable;
        unset resp.http.X-Grace;
    }

    if(resp.http.X-Cacheable !~ "Yes" && !resp.http.Cache-Control) {
        set resp.http.Cache-Control = "private";
    }

    set resp.http.Via = regsub(resp.proto, "^[^/]+/", "") + " Varnish";

    return (deliver);
}

/*
 * We can come here "invisibly" with the following errors: 413, 417 & 503
 */
sub vcl_synth {
    set resp.http.Content-Type = "text/html; charset=utf-8";
    set resp.http.Retry-After = "5";
    synthetic({"<!DOCTYPE html><html><head><title>"} + resp.status + " " + resp.reason + {"</title></head><body><h1>Error "} + resp.status + " " + resp.reason + {"</h1><p>"} + resp.reason + {"</p><h3>Guru Meditation:</h3><p>XID: "} + server.hostname + "-" + req.xid + {"</p><hr><p>Varnish cache server</p></body></html>"});
    return (deliver);
}

sub vcl_backend_fetch {
    return (fetch);
}

sub vcl_backend_response {

    if (beresp.status >= 500 && beresp.status < 600) {
        if (bereq.is_bgfetch) {
            std.log("DEBUG: vcl_backend_response(): Got response code " + beresp.status + " during background fetch, returning stale object");
            return (abandon);
        }

        set beresp.uncacheable = true;
    }

    if(bereq.http.X-Cacheable) {
        std.log("DEBUG: vcl_backend_response(): Got X-Cacheable from vcl_recv(): " + bereq.http.X-Cacheable);
        set beresp.http.X-Cacheable = bereq.http.X-Cacheable;
    }

    if(beresp.http.X-Cacheable ~ "^No:") {
        set beresp.ttl = 120s;
        std.log("DEBUG: vcl_backend_response(): beresp.ttl = " + beresp.ttl);
        set beresp.uncacheable = true;
        unset beresp.http.Cache-Control;
        return(deliver);
    }

    if (bereq.url ~ "^/admin") {
        set beresp.http.Cache-Control = "private, no-cache";
        set beresp.ttl = 0s;
        set beresp.uncacheable = true;
        set beresp.http.X-Cacheable = "No: Excluded";
        return (pass);
    }

    if (bereq.url ~ "^/admin/edit" ||
        bereq.url ~ "^/engines" ||
        bereq.url ~ "^/start" ||
        bereq.url ~ "^/admin/render/default" ||
        bereq.url ~ "^/admin/cmsImg/default" ||
        bereq.url ~ "^/tools" ||
        bereq.url ~ "^/modules/tools" ||
        bereq.url ~ "^/modules/api" ||
        bereq.url ~ "^/modules/api/bundles" ||
        bereq.url ~ "^/admin/dashboard" ||
        bereq.url ~ "^/admin/contribute" ||
        bereq.url ~ "^/foobar" ||
        bereq.url ~ "^/admin/login" ||
        bereq.url ~ "^/admin/logout" ||
        bereq.url ~ "^/admin/find"
    ) {
        set beresp.http.Cache-Control = "private, no-cache";
        set beresp.ttl = 0s;
        set beresp.uncacheable = true;
        set beresp.http.X-Cacheable = "No: Excluded";
        return (pass);
    }

    if (bereq.url ~ "^/videos.html" ||
        bereq.url ~ "^/player.html" ||
        bereq.url ~ "^/guestbook.html" ||
        bereq.url ~ "^/meeting.html" ||
        bereq.url ~ "^/Petition"
    ) {
        set beresp.http.X-Cacheable = "No: Excluded";
        return (pass);
    }

    if (bereq.url ~ "^/home/petition/online-petition.html" ||
        bereq.url ~ "^/home/guest/petitions.html"
    ) {
        set beresp.http.X-Cacheable = "No: Excluded";
        return (pass);
    }

    if (bereq.url ~ "^/home/press.html" ||
    bereq.url ~ "^/home/news.html"
    ) {
        set beresp.http.X-Cacheable = "No: Excluded";
        return (pass);
    }

    if (bereq.url ~ "\.do$") {
        set beresp.http.X-Cacheable = "No: Excluded";
        return (pass);
    }

    if (bereq.url ~ "\.mp4") {
        set beresp.ttl = 2h;
        set beresp.http.X-Cacheable = "Yes: Forced";
        set beresp.uncacheable = false;
        return (deliver);
    }

    if (bereq.url ~ "^/home/special.html") {
        set beresp.ttl = 1m;
        set beresp.http.X-Cacheable = "Yes: Forced";
        set beresp.uncacheable = false;
        return (deliver);
    }

    if (bereq.url ~ "^/files/") {
        set beresp.ttl = 5m;
        set beresp.http.X-Cacheable = "Yes: Forced";
        set beresp.uncacheable = false;
        return (deliver);
    }

    if (bereq.url ~ "^/home/documents/docs.html" ||
        bereq.url ~ "^/summary.html"
    ) {
        set beresp.ttl = 1m;
        set beresp.http.X-Cacheable = "Yes: Forced";
        set beresp.uncacheable = false;
        return (deliver);
    }

    if (bereq.url ~ "^/home.html" ||
        bereq.url ~ "^/home/index.html" ||
        bereq.url ~ "^/home/docs/" ||
        bereq.url ~ "^/home/users/profiles/" ||
        bereq.url ~ "^/home/users/groups/" ||
        bereq.url ~ "^/home/videos/videolist.html" ||
        bereq.url ~ "^/home/guests/summary.html"
    ) {
        set beresp.ttl = 1m;
        set beresp.http.X-Cacheable = "Yes: Forced";
        set beresp.uncacheable = false;
        return (deliver);
    }

    if (bereq.url ~ "^/home/dedi/grem/") {
        set beresp.ttl = 1m;
        set beresp.http.X-Cacheable = "Yes: Forced";
        set beresp.uncacheable = false;
        return (deliver);
    }

    if (bereq.url ~ "^/home/example1/disks/") {
        set beresp.ttl = 60m;
        set beresp.http.X-Cacheable = "Yes: Forced";
        set beresp.uncacheable = false;
        return (deliver);
    }

    if(beresp.http.Set-Cookie) {
        std.log("DEBUG: vcl_backend_response(): Got Set-Cookie '" + beresp.http.Set-Cookie + "' from " + bereq.backend + ", removing it");
        unset beresp.http.Set-Cookie;
    }

    if (
      beresp.ttl <= 0s ||
      beresp.http.Surrogate-control ~ "(?i)no-store" ||
      (!beresp.http.Surrogate-Control &&
        beresp.http.Cache-Control ~ "(?i:no-cache|no-store|private)")
    ) {
        std.log("DEBUG: vcl_backend_response(): '" + bereq.url + "' Not Cacheable");
        set beresp.http.X-Cacheable = "No: Not Cacheable";
        set beresp.ttl = 120s;
        std.log("DEBUG: vcl_backend_response(): beresp.ttl = " + beresp.ttl);
        set beresp.uncacheable = true;
        return(deliver);
    }

    if (!beresp.http.X-Cacheable) {
        set beresp.http.X-Cacheable = "Yes";
        set beresp.uncacheable = false;
    }

    std.log("DEBUG: vcl_backend_response(): beresp.ttl = " + beresp.ttl);
    return (deliver);
}
bsdphk commented 3 years ago

This sounds pretty much like one would expect Varnish to work in such circumstances.

The entire point of Varnish is to use the available resources to deliver the content efficiently, and as such Varnish sees all RAM and CPU as fair game.

idl0r commented 3 years ago

uhm, I'm not sure if I get your point.

> This sounds pretty much like one would expect Varnish to work in such circumstances.

What circumstances? That I try to use (and limit) RAM instead of file-based storage, because it's fast and plentiful?

https://varnish-cache.org/docs/trunk/users-guide/storage-backends.html#malloc

> Be aware that the size limitation only limits the actual storage and that the approximately 1k of memory per object, used for various internal structures, is included in the actual storage as well.
>
> The size parameter specifies the maximum amount of memory varnishd will allocate. The size is assumed to be in bytes, unless followed by one of the following suffixes:

So with 128G available and 32G configured, even with overhead, shouldn't it stay well below 128G? What is the malloc size limit for, then?
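As a back-of-the-envelope check (plain arithmetic, using the ~4x factor from the issue title, not anything from Varnish internals): if the allocator really multiplies the configured storage by about four, a 32G limit lands exactly at this machine's 128G, which would match the atop observations above:

```shell
# Rough worst-case memory use under the ~4x overhead reported here.
storage_gb=32
factor=4        # approximate jemalloc+THP overhead factor from the issue title
echo "$((storage_gb * factor))G worst case on a 128G box"
```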

nigoroll commented 3 years ago

From support cases with our clients, I very much suspect you are hitting an issue related to both jemalloc and Transparent Huge Pages. This problem typically results in 4x overhead.

That said, this is still a support question and you are advised to use our support channels. The issue template clearly points this out, but you have not even used it.
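To see whether THP is in play on a given box, the kernel's sysfs knob and the cache child's huge-page usage can be inspected. A sketch under some assumptions: standard Linux sysfs/procfs paths, `smaps_rollup` available (kernel >= 4.14), and the child running as the Debian package's `vcache` user:

```shell
# Active THP policy is the bracketed word, e.g. "always [madvise] never".
thp=/sys/kernel/mm/transparent_hugepage/enabled
[ -r "$thp" ] && cat "$thp" || echo "no THP interface on this system"

# THP-backed anonymous memory of the varnishd cache child, if running.
pid=$(pgrep -u vcache -o varnishd 2>/dev/null)
[ -n "$pid" ] && grep AnonHugePages "/proc/$pid/smaps_rollup" \
    || echo "varnishd child not found"
```

A large AnonHugePages figure relative to the configured storage size would support the jemalloc + THP theory.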

idl0r commented 3 years ago

@nigoroll to be honest, I didn't think this was a support request, and even now I don't get it. Either I completely misunderstood -s malloc, or at first glance this looks like a bug or a leak?

<!--- Things to check before you report a bug

    - Is the issue you are seeing clearly an issue in varnish core or
      could it be a support question? We do not use github issues for
      support questions, please refer to
      http://varnish-cache.org/support/index.html when in doubt.

I think this is a varnish core issue

    - For panics (varnish crashes), bug reports are most useful if

      - you are running recent code
    - ideally master
    - but at least the latest release of a supported version

      - you got
    - debuginfo packages installed when running binaries from
      packages if these are available from the package source
      you are using

    - or have compiled with debug information whenever possible
      (configure --enable-debugging-symbols)

There is no panic or crash from what I can see

    If you have considered these recommendations, please go ahead and
    follow this template
-->
<!--- Provide a general summary of the issue in the Title above -->

I did

Expected Behavior

-s malloc,32G shouldn't reach 128G, even with overhead, even if we'd say malloc *3

<!--- Did you check that there are no similar bug reports or pull requests? -->
<!---
    If your panic happens in the child_sigsegv_handler function, look at the
    backtrace to determine whether it is similar to another issue. When in
    doubt, open a new one and it will be closed as a duplicate if needed.
-->
<!--- If you're describing a bug, tell us what should happen -->
<!--- If you're suggesting a change/improvement, tell us how it should work -->
<!---
    If it's a packaging bug (including sysv or systemd services bugs) please
    open an issue on varnishcache/pkg-varnish-cache instead.
-->
<!---
    If it's a feature request, please start a thread on the misc list instead.
    https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
-->

It's not a feature request. I did search the issues.

## Current Behavior
<!--- If describing a bug, tell us what happens instead of the expected behavior -->
<!--- If suggesting a change/improvement, explain the difference from current behavior -->

I basically did?

## Possible Solution
<!--- Not obligatory, but suggest a fix/reason for the bug, -->
<!--- or ideas how to implement the addition or change -->

I have no idea

## Steps to Reproduce (for bugs)
<!--- Provide a link to a live example, or an unambiguous set of steps to -->
<!--- reproduce this bug. Include code to reproduce, if relevant -->
1.
2.
3.
4.

I provided the VCL and what I think might be causing it. The requests against the mp4 files. There is no test case to reproduce it (yet).

## Context
<!--- How has this issue affected you? What are you trying to accomplish? -->
<!--- Providing context helps us come up with a solution that is most useful in the real world -->

I did

## Your Environment
<!--- Include as many relevant details about the environment you experienced the bug in -->
* Version used:
* Operating System and version:
* Source of binary packages used (if any)

I also did

nigoroll commented 3 years ago

I understand your point. And I tell you that

> -s malloc,32G shouldn't reach 128G, even with overhead

is not necessarily true.

I will happily look at your report if you can reproduce it with THP disabled.
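Two ways to take THP out of the equation for such a test, both hedged: the sysfs writes need root and revert at reboot, and the `MALLOC_CONF` option is a jemalloc >= 5 detail that only applies if the varnishd build actually links that jemalloc version:

```shell
# 1) Disable THP system-wide at runtime (as root); this only affects new
#    mappings, so restart varnishd afterwards. To make it persistent, also
#    set transparent_hugepage=never on the kernel command line.
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag

# 2) Or leave THP on system-wide but tell jemalloc (>= 5.0) not to use it
#    for the varnishd process only:
MALLOC_CONF="thp:never" varnishd -s malloc,32G ...
```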

idl0r commented 3 years ago

You seem to be right that it takes ~4x the configured limit. We've reduced the limit to 24G and its final size is around 96G. Also a varnishstat -1:

MGT.uptime              246186         5.57 Management process uptime
MGT.child_start              4         0.00 Child process started
MGT.child_exit               0         0.00 Child process normal exit
MGT.child_stop               0         0.00 Child process unexpected exit
MGT.child_died               3         0.00 Child process died (signal)
MGT.child_dump               0         0.00 Child process core dumped
MGT.child_panic              0         0.00 Child process panic
MAIN.summs              314494         7.12 stat summ operations
MAIN.uptime              44181         1.00 Child process uptime
MAIN.sess_conn          110644         2.50 Sessions accepted
MAIN.sess_fail               0         0.00 Session accept failures
MAIN.sess_fail_econnaborted            0         0.00 Session accept failures: connection aborted
MAIN.sess_fail_eintr                   0         0.00 Session accept failures: interrupted system call
MAIN.sess_fail_emfile                  0         0.00 Session accept failures: too many open files
MAIN.sess_fail_ebadf                   0         0.00 Session accept failures: bad file descriptor
MAIN.sess_fail_enomem                  0         0.00 Session accept failures: not enough memory
MAIN.sess_fail_other                   0         0.00 Session accept failures: other
MAIN.client_req_400                    5         0.00 Client requests received, subject to 400 errors
MAIN.client_req_417                    0         0.00 Client requests received, subject to 417 errors
MAIN.client_req                   110649         2.50 Good client requests received
MAIN.cache_hit                     90221         2.04 Cache hits
MAIN.cache_hit_grace                7112         0.16 Cache grace hits
MAIN.cache_hitpass                     0         0.00 Cache hits for pass.
MAIN.cache_hitmiss                  9237         0.21 Cache hits for miss.
MAIN.cache_miss                    19777         0.45 Cache misses
MAIN.backend_conn                   1490         0.03 Backend conn. success
MAIN.backend_unhealthy                 0         0.00 Backend conn. not attempted
MAIN.backend_busy                      0         0.00 Backend conn. too many
MAIN.backend_fail                      0         0.00 Backend conn. failures
MAIN.backend_reuse                 25998         0.59 Backend conn. reuses
MAIN.backend_recycle               27452         0.62 Backend conn. recycles
MAIN.backend_retry                     1         0.00 Backend conn. retry
MAIN.fetch_head                        0         0.00 Fetch no body (HEAD)
MAIN.fetch_length                  21880         0.50 Fetch with Length
MAIN.fetch_chunked                   198         0.00 Fetch chunked
MAIN.fetch_eof                         0         0.00 Fetch EOF
MAIN.fetch_bad                         0         0.00 Fetch bad T-E
MAIN.fetch_none                      740         0.02 Fetch no body
MAIN.fetch_1xx                         0         0.00 Fetch no body (1xx)
MAIN.fetch_204                         0         0.00 Fetch no body (204)
MAIN.fetch_304                      4668         0.11 Fetch no body (304)
MAIN.fetch_failed                     31         0.00 Fetch failed (all causes)
MAIN.fetch_no_thread                   0         0.00 Fetch failed (no thread)
MAIN.pools                             2          .   Number of thread pools
MAIN.threads                         200          .   Total number of threads
MAIN.threads_limited                   0         0.00 Threads hit max
MAIN.threads_created                 200         0.00 Threads created
MAIN.threads_destroyed                 0         0.00 Threads destroyed
MAIN.threads_failed                    0         0.00 Thread creation failed
MAIN.thread_queue_len                  0          .   Length of session queue
MAIN.busy_sleep                       10         0.00 Number of requests sent to sleep on busy objhdr
MAIN.busy_wakeup                      10         0.00 Number of requests woken after sleep on busy objhdr
MAIN.busy_killed                       0         0.00 Number of requests killed after sleep on busy objhdr
MAIN.sess_queued                       0         0.00 Sessions queued for thread
MAIN.sess_dropped                      0         0.00 Sessions dropped for thread
MAIN.req_dropped                       0         0.00 Requests dropped
MAIN.n_object                      13327          .   object structs made
MAIN.n_vampireobject                   0          .   unresurrected objects
MAIN.n_objectcore                  13355          .   objectcore structs made
MAIN.n_objecthead                   7745          .   objecthead structs made
MAIN.n_backend                        16          .   Number of backends
MAIN.n_expired                         1         0.00 Number of expired objects
MAIN.n_lru_nuked                    1764         0.04 Number of LRU nuked objects
MAIN.n_lru_moved                   70352         1.59 Number of LRU moved objects
MAIN.n_lru_limited                    29         0.00 Reached nuke_limit
MAIN.losthdr                           0         0.00 HTTP header overflows
MAIN.s_sess                       110644         2.50 Total sessions seen
MAIN.n_pipe                            0          .   Number of ongoing pipe sessions
MAIN.pipe_limited                      0         0.00 Pipes hit pipe_sess_max
MAIN.s_pipe                            0         0.00 Total pipe sessions seen
MAIN.s_pass                          641         0.01 Total pass-ed requests seen
MAIN.s_fetch                       20418         0.46 Total backend fetches initiated
MAIN.s_synth                           0         0.00 Total synthetic responses made
MAIN.s_req_hdrbytes             73789945      1670.17 Request header bytes
MAIN.s_req_bodybytes              132278         2.99 Request body bytes
MAIN.s_resp_hdrbytes            44405642      1005.08 Response header bytes
MAIN.s_resp_bodybytes       283485397345   6416454.98 Response body bytes
MAIN.s_pipe_hdrbytes                   0         0.00 Pipe request header bytes
MAIN.s_pipe_in                         0         0.00 Piped bytes from client
MAIN.s_pipe_out                        0         0.00 Piped bytes to client
MAIN.sess_closed                  110644         2.50 Session Closed
MAIN.sess_closed_err               15630         0.35 Session Closed with error
MAIN.sess_readahead                    0         0.00 Session Read Ahead
MAIN.sess_herd                         0         0.00 Session herd
MAIN.sc_rem_close                   1710         0.04 Session OK  REM_CLOSE
MAIN.sc_req_close                  93134         2.11 Session OK  REQ_CLOSE
MAIN.sc_req_http10                  2342         0.05 Session Err REQ_HTTP10
MAIN.sc_rx_bad                         0         0.00 Session Err RX_BAD
MAIN.sc_rx_body                        0         0.00 Session Err RX_BODY
MAIN.sc_rx_junk                        5         0.00 Session Err RX_JUNK
MAIN.sc_rx_overflow                    0         0.00 Session Err RX_OVERFLOW
MAIN.sc_rx_timeout                     0         0.00 Session Err RX_TIMEOUT
MAIN.sc_rx_close_idle                  0         0.00 Session Err RX_CLOSE_IDLE
MAIN.sc_tx_pipe                        0         0.00 Session OK  TX_PIPE
MAIN.sc_tx_error                   13283         0.30 Session Err TX_ERROR
MAIN.sc_tx_eof                       158         0.00 Session OK  TX_EOF
MAIN.sc_resp_close                     0         0.00 Session OK  RESP_CLOSE
MAIN.sc_overload                       0         0.00 Session Err OVERLOAD
MAIN.sc_pipe_overflow                  0         0.00 Session Err PIPE_OVERFLOW
MAIN.sc_range_short                    0         0.00 Session Err RANGE_SHORT
MAIN.sc_req_http20                     0         0.00 Session Err REQ_HTTP20
MAIN.sc_vcl_failure                    0         0.00 Session Err VCL_FAILURE
MAIN.client_resp_500                   0         0.00 Delivery failed due to insufficient workspace.
MAIN.ws_backend_overflow               0         0.00 workspace_backend overflows
MAIN.ws_client_overflow                0         0.00 workspace_client overflows
MAIN.ws_thread_overflow                0         0.00 workspace_thread overflows
MAIN.ws_session_overflow               0         0.00 workspace_session overflows
MAIN.shm_records                10584463       239.57 SHM records
MAIN.shm_writes                   885079        20.03 SHM writes
MAIN.shm_flushes                      58         0.00 SHM flushes due to overflow
MAIN.shm_cont                       3611         0.08 SHM MTX contention
MAIN.shm_cycles                        5         0.00 SHM cycles through buffer
MAIN.backend_req                   27487         0.62 Backend requests made
MAIN.n_vcl                             8          .   Number of loaded VCLs in total
MAIN.n_vcl_avail                       8          .   Number of VCLs available
MAIN.n_vcl_discard                     0          .   Number of discarded VCLs
MAIN.vcl_fail                          0         0.00 VCL failures
MAIN.bans                              1          .   Count of bans
MAIN.bans_completed                    1          .   Number of bans marked 'completed'
MAIN.bans_obj                          0          .   Number of bans using obj.*
MAIN.bans_req                          0          .   Number of bans using req.*
MAIN.bans_added                        1         0.00 Bans added
MAIN.bans_deleted                      0         0.00 Bans deleted
MAIN.bans_tested                       0         0.00 Bans tested against objects (lookup)
MAIN.bans_obj_killed                   0         0.00 Objects killed by bans (lookup)
MAIN.bans_lurker_tested                0         0.00 Bans tested against objects (lurker)
MAIN.bans_tests_tested                 0         0.00 Ban tests tested against objects (lookup)
MAIN.bans_lurker_tests_tested            0         0.00 Ban tests tested against objects (lurker)
MAIN.bans_lurker_obj_killed              0         0.00 Objects killed by bans (lurker)
MAIN.bans_lurker_obj_killed_cutoff            0         0.00 Objects killed by bans for cutoff (lurker)
MAIN.bans_dups                                0         0.00 Bans superseded by other bans
MAIN.bans_lurker_contention                   0         0.00 Lurker gave way for lookup
MAIN.bans_persisted_bytes                    16          .   Bytes used by the persisted ban lists
MAIN.bans_persisted_fragmentation             0          .   Extra bytes in persisted ban lists due to fragmentation
MAIN.n_purges                                 0         0.00 Number of purge operations executed
MAIN.n_obj_purged                             0         0.00 Number of purged objects
MAIN.exp_mailed                           40364         0.91 Number of objects mailed to expiry thread
MAIN.exp_received                         40364         0.91 Number of objects received by expiry thread
MAIN.hcb_nolock                          109998         2.49 HCB Lookups without lock
MAIN.hcb_lock                              9481         0.21 HCB Lookups with lock
MAIN.hcb_insert                            9481         0.21 HCB Inserts
MAIN.esi_errors                               0         0.00 ESI parse errors (unlock)
MAIN.esi_warnings                             0         0.00 ESI parse warnings (unlock)
MAIN.vmods                                    1          .   Loaded VMODs
MAIN.n_gzip                                   0         0.00 Gzip operations
MAIN.n_gunzip                               708         0.02 Gunzip operations
MAIN.n_test_gunzip                        10403         0.24 Test gunzip operations
LCK.backend.creat                            16         0.00 Created locks
LCK.backend.destroy                           0         0.00 Destroyed locks
LCK.backend.locks                         54976         1.24 Lock Operations
LCK.backend.dbg_busy                          0         0.00 Contended lock operations
LCK.backend.dbg_try_fail                      0         0.00 Contended trylock operations
LCK.ban.creat                                 1         0.00 Created locks
LCK.ban.destroy                               0         0.00 Destroyed locks
LCK.ban.locks                             42153         0.95 Lock Operations
LCK.ban.dbg_busy                              0         0.00 Contended lock operations
LCK.ban.dbg_try_fail                          0         0.00 Contended trylock operations
LCK.busyobj.creat                         27743         0.63 Created locks
LCK.busyobj.destroy                       27706         0.63 Destroyed locks
LCK.busyobj.locks                       2428786        54.97 Lock Operations
LCK.busyobj.dbg_busy                          0         0.00 Contended lock operations
LCK.busyobj.dbg_try_fail                      0         0.00 Contended trylock operations
LCK.cli.creat                                 1         0.00 Created locks
LCK.cli.destroy                               0         0.00 Destroyed locks
LCK.cli.locks                             14765         0.33 Lock Operations
LCK.cli.dbg_busy                              0         0.00 Contended lock operations
LCK.cli.dbg_try_fail                          0         0.00 Contended trylock operations
LCK.exp.creat                                 1         0.00 Created locks
LCK.exp.destroy                               0         0.00 Destroyed locks
LCK.exp.locks                            149437         3.38 Lock Operations
LCK.exp.dbg_busy                              0         0.00 Contended lock operations
LCK.exp.dbg_try_fail                          0         0.00 Contended trylock operations
LCK.hcb.creat                                 1         0.00 Created locks
LCK.hcb.destroy                               0         0.00 Destroyed locks
LCK.hcb.locks                             11509         0.26 Lock Operations
LCK.hcb.dbg_busy                              0         0.00 Contended lock operations
LCK.hcb.dbg_try_fail                          0         0.00 Contended trylock operations
LCK.lru.creat                                 2         0.00 Created locks
LCK.lru.destroy                               0         0.00 Destroyed locks
LCK.lru.locks                            112481         2.55 Lock Operations
LCK.lru.dbg_busy                              0         0.00 Contended lock operations
LCK.lru.dbg_try_fail                          0         0.00 Contended trylock operations
LCK.mempool.creat                             5         0.00 Created locks
LCK.mempool.destroy                           0         0.00 Destroyed locks
LCK.mempool.locks                        717072        16.23 Lock Operations
LCK.mempool.dbg_busy                          0         0.00 Contended lock operations
LCK.mempool.dbg_try_fail                      0         0.00 Contended trylock operations
LCK.objhdr.creat                           9531         0.22 Created locks
LCK.objhdr.destroy                         1782         0.04 Destroyed locks
LCK.objhdr.locks                         520986        11.79 Lock Operations
LCK.objhdr.dbg_busy                           0         0.00 Contended lock operations
LCK.objhdr.dbg_try_fail                       0         0.00 Contended trylock operations
LCK.perpool.creat                             2         0.00 Created locks
LCK.perpool.destroy                           0         0.00 Destroyed locks
LCK.perpool.locks                        472264        10.69 Lock Operations
LCK.perpool.dbg_busy                          0         0.00 Contended lock operations
LCK.perpool.dbg_try_fail                      0         0.00 Contended trylock operations
LCK.pipestat.creat                            1         0.00 Created locks
LCK.pipestat.destroy                          0         0.00 Destroyed locks
LCK.pipestat.locks                            0         0.00 Lock Operations
LCK.pipestat.dbg_busy                         0         0.00 Contended lock operations
LCK.pipestat.dbg_try_fail                     0         0.00 Contended trylock operations
LCK.probe.creat                               1         0.00 Created locks
LCK.probe.destroy                             0         0.00 Destroyed locks
LCK.probe.locks                           76501         1.73 Lock Operations
LCK.probe.dbg_busy                            0         0.00 Contended lock operations
LCK.probe.dbg_try_fail                        0         0.00 Contended trylock operations
LCK.sess.creat                           110666         2.50 Created locks
LCK.sess.destroy                         110644         2.50 Destroyed locks
LCK.sess.locks                           165837         3.75 Lock Operations
LCK.sess.dbg_busy                             0         0.00 Contended lock operations
LCK.sess.dbg_try_fail                         0         0.00 Contended trylock operations
LCK.tcp_pool.creat                            3         0.00 Created locks
LCK.tcp_pool.destroy                          0         0.00 Destroyed locks
LCK.tcp_pool.locks                       108459         2.45 Lock Operations
LCK.tcp_pool.dbg_busy                         0         0.00 Contended lock operations
LCK.tcp_pool.dbg_try_fail                     0         0.00 Contended trylock operations
LCK.vbe.creat                                 1         0.00 Created locks
LCK.vbe.destroy                               0         0.00 Destroyed locks
LCK.vbe.locks                             33915         0.77 Lock Operations
LCK.vbe.dbg_busy                              0         0.00 Contended lock operations
LCK.vbe.dbg_try_fail                          0         0.00 Contended trylock operations
LCK.vcapace.creat                             1         0.00 Created locks
LCK.vcapace.destroy                           0         0.00 Destroyed locks
LCK.vcapace.locks                             0         0.00 Lock Operations
LCK.vcapace.dbg_busy                          0         0.00 Contended lock operations
LCK.vcapace.dbg_try_fail                      0         0.00 Contended trylock operations
LCK.vcl.creat                                 1         0.00 Created locks
LCK.vcl.destroy                               0         0.00 Destroyed locks
LCK.vcl.locks                             56893         1.29 Lock Operations
LCK.vcl.dbg_busy                              0         0.00 Contended lock operations
LCK.vcl.dbg_try_fail                          0         0.00 Contended trylock operations
LCK.vxid.creat                                1         0.00 Created locks
LCK.vxid.destroy                              0         0.00 Destroyed locks
LCK.vxid.locks                               49         0.00 Lock Operations
LCK.vxid.dbg_busy                             0         0.00 Contended lock operations
LCK.vxid.dbg_try_fail                         0         0.00 Contended trylock operations
LCK.waiter.creat                              2         0.00 Created locks
LCK.waiter.destroy                            0         0.00 Destroyed locks
LCK.waiter.locks                         100010         2.26 Lock Operations
LCK.waiter.dbg_busy                           0         0.00 Contended lock operations
LCK.waiter.dbg_try_fail                       0         0.00 Contended trylock operations
LCK.wq.creat                                  1         0.00 Created locks
LCK.wq.destroy                                0         0.00 Destroyed locks
LCK.wq.locks                              63485         1.44 Lock Operations
LCK.wq.dbg_busy                               0         0.00 Contended lock operations
LCK.wq.dbg_try_fail                           0         0.00 Contended trylock operations
LCK.wstat.creat                               1         0.00 Created locks
LCK.wstat.destroy                             0         0.00 Destroyed locks
LCK.wstat.locks                          192661         4.36 Lock Operations
LCK.wstat.dbg_busy                            0         0.00 Contended lock operations
LCK.wstat.dbg_try_fail                        0         0.00 Contended trylock operations
MEMPOOL.busyobj.live                          0          .   In use
MEMPOOL.busyobj.pool                         10          .   In Pool
MEMPOOL.busyobj.sz_wanted                 65536          .   Size requested
MEMPOOL.busyobj.sz_actual                 65504          .   Size allocated
MEMPOOL.busyobj.allocs                    27487         0.62 Allocations
MEMPOOL.busyobj.frees                     27487         0.62 Frees
MEMPOOL.busyobj.recycle                   27487         0.62 Recycled from pool
MEMPOOL.busyobj.timeout                    2917         0.07 Timed out from pool
MEMPOOL.busyobj.toosmall                      0         0.00 Too small to recycle
MEMPOOL.busyobj.surplus                       0         0.00 Too many for pool
MEMPOOL.busyobj.randry                        0         0.00 Pool ran dry
MEMPOOL.req0.live                            14          .   In use
MEMPOOL.req0.pool                            10          .   In Pool
MEMPOOL.req0.sz_wanted                    65536          .   Size requested
MEMPOOL.req0.sz_actual                    65504          .   Size allocated
MEMPOOL.req0.allocs                       55326         1.25 Allocations
MEMPOOL.req0.frees                        55312         1.25 Frees
MEMPOOL.req0.recycle                      55326         1.25 Recycled from pool
MEMPOOL.req0.timeout                       5710         0.13 Timed out from pool
MEMPOOL.req0.toosmall                         0         0.00 Too small to recycle
MEMPOOL.req0.surplus                          0         0.00 Too many for pool
MEMPOOL.req0.randry                           0         0.00 Pool ran dry
MEMPOOL.sess0.live                           14          .   In use
MEMPOOL.sess0.pool                           10          .   In Pool
MEMPOOL.sess0.sz_wanted                     768          .   Size requested
MEMPOOL.sess0.sz_actual                     736          .   Size allocated
MEMPOOL.sess0.allocs                      55326         1.25 Allocations
MEMPOOL.sess0.frees                       55312         1.25 Frees
MEMPOOL.sess0.recycle                     55326         1.25 Recycled from pool
MEMPOOL.sess0.timeout                      5824         0.13 Timed out from pool
MEMPOOL.sess0.toosmall                        0         0.00 Too small to recycle
MEMPOOL.sess0.surplus                         0         0.00 Too many for pool
MEMPOOL.sess0.randry                          0         0.00 Pool ran dry
LCK.sma.creat                                 2         0.00 Created locks
LCK.sma.destroy                               0         0.00 Destroyed locks
LCK.sma.locks                            175140         3.96 Lock Operations
LCK.sma.dbg_busy                              0         0.00 Contended lock operations
LCK.sma.dbg_try_fail                          0         0.00 Contended trylock operations
SMA.s0.c_req                              97594         2.21 Allocator requests
SMA.s0.c_fail                             69535         1.57 Allocator failures
SMA.s0.c_bytes                     110826315545   2508461.00 Bytes allocated
SMA.s0.c_freed                      88877383260   2011665.27 Bytes freed
SMA.s0.g_alloc                             5577          .   Allocations outstanding
SMA.s0.g_bytes                      21948932285          .   Bytes outstanding
SMA.s0.g_space                       3820871491          .   Bytes available
SMA.Transient.c_req                       33041         0.75 Allocator requests
SMA.Transient.c_fail                          0         0.00 Allocator failures
SMA.Transient.c_bytes                 600259959     13586.38 Bytes allocated
SMA.Transient.c_freed                 595088927     13469.34 Bytes freed
SMA.Transient.g_alloc                     11018          .   Allocations outstanding
SMA.Transient.g_bytes                   5171032          .   Bytes outstanding
SMA.Transient.g_space                         0          .   Bytes available
MEMPOOL.req1.live                            10          .   In use
MEMPOOL.req1.pool                            10          .   In Pool
MEMPOOL.req1.sz_wanted                    65536          .   Size requested
MEMPOOL.req1.sz_actual                    65504          .   Size allocated
MEMPOOL.req1.allocs                       55342         1.25 Allocations
MEMPOOL.req1.frees                        55332         1.25 Frees
MEMPOOL.req1.recycle                      55342         1.25 Recycled from pool
MEMPOOL.req1.timeout                       5733         0.13 Timed out from pool
MEMPOOL.req1.toosmall                         0         0.00 Too small to recycle
MEMPOOL.req1.surplus                          0         0.00 Too many for pool
MEMPOOL.req1.randry                           0         0.00 Pool ran dry
MEMPOOL.sess1.live                           10          .   In use
MEMPOOL.sess1.pool                           10          .   In Pool
MEMPOOL.sess1.sz_wanted                     768          .   Size requested
MEMPOOL.sess1.sz_actual                     736          .   Size allocated
MEMPOOL.sess1.allocs                      55342         1.25 Allocations
MEMPOOL.sess1.frees                       55332         1.25 Frees
MEMPOOL.sess1.recycle                     55342         1.25 Recycled from pool
MEMPOOL.sess1.timeout                      5882         0.13 Timed out from pool
MEMPOOL.sess1.toosmall                        0         0.00 Too small to recycle
MEMPOOL.sess1.surplus                         0         0.00 Too many for pool
MEMPOOL.sess1.randry                          0         0.00 Pool ran dry
VBE.boot.backend_example1_www.happy               3          .   Happy health probes
VBE.boot.backend_example1_www.bereq_hdrbytes            0         0.00 Request header bytes
VBE.boot.backend_example1_www.bereq_bodybytes            0         0.00 Request body bytes
VBE.boot.backend_example1_www.beresp_hdrbytes            0         0.00 Response header bytes
VBE.boot.backend_example1_www.beresp_bodybytes            0         0.00 Response body bytes
VBE.boot.backend_example1_www.pipe_hdrbytes               0         0.00 Pipe request header bytes
VBE.boot.backend_example1_www.pipe_out                    0         0.00 Piped bytes to backend
VBE.boot.backend_example1_www.pipe_in                     0         0.00 Piped bytes from backend
VBE.boot.backend_example1_www.conn                        0          .   Concurrent connections to backend
VBE.boot.backend_example1_www.req                         0         0.00 Backend requests sent
VBE.boot.backend_example1_www.unhealthy                   0         0.00 Fetches not attempted due to backend being unhealthy
VBE.boot.backend_example1_www.busy                        0         0.00 Fetches not attempted due to backend being busy
VBE.boot.backend_example1_www.fail                        0         0.00 Connections failed
VBE.boot.backend_example1_www.fail_eacces                 0         0.00 Connections failed with EACCES or EPERM
VBE.boot.backend_example1_www.fail_eaddrnotavail            0         0.00 Connections failed with EADDRNOTAVAIL
VBE.boot.backend_example1_www.fail_econnrefused             0         0.00 Connections failed with ECONNREFUSED
VBE.boot.backend_example1_www.fail_enetunreach              0         0.00 Connections failed with ENETUNREACH
VBE.boot.backend_example1_www.fail_etimedout                0         0.00 Connections failed ETIMEDOUT
VBE.boot.backend_example1_www.fail_other                    0         0.00 Connections failed for other reason
VBE.boot.backend_example1_www.helddown                      0         0.00 Connection opens not attempted
VBE.boot.backend_example1_staging_www.happy                 3          .   Happy health probes
VBE.boot.backend_example1_staging_www.bereq_hdrbytes            0         0.00 Request header bytes
VBE.boot.backend_example1_staging_www.bereq_bodybytes            0         0.00 Request body bytes
VBE.boot.backend_example1_staging_www.beresp_hdrbytes            0         0.00 Response header bytes
VBE.boot.backend_example1_staging_www.beresp_bodybytes            0         0.00 Response body bytes
VBE.boot.backend_example1_staging_www.pipe_hdrbytes               0         0.00 Pipe request header bytes
VBE.boot.backend_example1_staging_www.pipe_out                    0         0.00 Piped bytes to backend
VBE.boot.backend_example1_staging_www.pipe_in                     0         0.00 Piped bytes from backend
VBE.boot.backend_example1_staging_www.conn                        0          .   Concurrent connections to backend
VBE.boot.backend_example1_staging_www.req                         0         0.00 Backend requests sent
VBE.boot.backend_example1_staging_www.unhealthy                   0         0.00 Fetches not attempted due to backend being unhealthy
VBE.boot.backend_example1_staging_www.busy                        0         0.00 Fetches not attempted due to backend being busy
VBE.boot.backend_example1_staging_www.fail                        0         0.00 Connections failed
VBE.boot.backend_example1_staging_www.fail_eacces                 0         0.00 Connections failed with EACCES or EPERM
VBE.boot.backend_example1_staging_www.fail_eaddrnotavail            0         0.00 Connections failed with EADDRNOTAVAIL
VBE.boot.backend_example1_staging_www.fail_econnrefused             0         0.00 Connections failed with ECONNREFUSED
VBE.boot.backend_example1_staging_www.fail_enetunreach              0         0.00 Connections failed with ENETUNREACH
VBE.boot.backend_example1_staging_www.fail_etimedout                0         0.00 Connections failed ETIMEDOUT
VBE.boot.backend_example1_staging_www.fail_other                    0         0.00 Connections failed for other reason
VBE.boot.backend_example1_staging_www.helddown                      0         0.00 Connection opens not attempted
VBE.reload_20210129_145927_21294.backend_example1_www.happy            3          .   Happy health probes
VBE.reload_20210129_145927_21294.backend_example1_www.bereq_hdrbytes            0         0.00 Request header bytes
VBE.reload_20210129_145927_21294.backend_example1_www.bereq_bodybytes            0         0.00 Request body bytes
VBE.reload_20210129_145927_21294.backend_example1_www.beresp_hdrbytes            0         0.00 Response header bytes
VBE.reload_20210129_145927_21294.backend_example1_www.beresp_bodybytes            0         0.00 Response body bytes
VBE.reload_20210129_145927_21294.backend_example1_www.pipe_hdrbytes               0         0.00 Pipe request header bytes
VBE.reload_20210129_145927_21294.backend_example1_www.pipe_out                    0         0.00 Piped bytes to backend
VBE.reload_20210129_145927_21294.backend_example1_www.pipe_in                     0         0.00 Piped bytes from backend
VBE.reload_20210129_145927_21294.backend_example1_www.conn                        0          .   Concurrent connections to backend
VBE.reload_20210129_145927_21294.backend_example1_www.req                         0         0.00 Backend requests sent
VBE.reload_20210129_145927_21294.backend_example1_www.unhealthy                   0         0.00 Fetches not attempted due to backend being unhealthy
VBE.reload_20210129_145927_21294.backend_example1_www.busy                        0         0.00 Fetches not attempted due to backend being busy
VBE.reload_20210129_145927_21294.backend_example1_www.fail                        0         0.00 Connections failed
VBE.reload_20210129_145927_21294.backend_example1_www.fail_eacces                 0         0.00 Connections failed with EACCES or EPERM
VBE.reload_20210129_145927_21294.backend_example1_www.fail_eaddrnotavail            0         0.00 Connections failed with EADDRNOTAVAIL
VBE.reload_20210129_145927_21294.backend_example1_www.fail_econnrefused             0         0.00 Connections failed with ECONNREFUSED
VBE.reload_20210129_145927_21294.backend_example1_www.fail_enetunreach              0         0.00 Connections failed with ENETUNREACH
VBE.reload_20210129_145927_21294.backend_example1_www.fail_etimedout                0         0.00 Connections failed ETIMEDOUT
VBE.reload_20210129_145927_21294.backend_example1_www.fail_other                    0         0.00 Connections failed for other reason
VBE.reload_20210129_145927_21294.backend_example1_www.helddown                      0         0.00 Connection opens not attempted
VBE.reload_20210129_145927_21294.backend_example1_staging_www.happy                 3          .   Happy health probes
VBE.reload_20210129_145927_21294.backend_example1_staging_www.bereq_hdrbytes            0         0.00 Request header bytes
VBE.reload_20210129_145927_21294.backend_example1_staging_www.bereq_bodybytes            0         0.00 Request body bytes
VBE.reload_20210129_145927_21294.backend_example1_staging_www.beresp_hdrbytes            0         0.00 Response header bytes
VBE.reload_20210129_145927_21294.backend_example1_staging_www.beresp_bodybytes            0         0.00 Response body bytes
VBE.reload_20210129_145927_21294.backend_example1_staging_www.pipe_hdrbytes               0         0.00 Pipe request header bytes
VBE.reload_20210129_145927_21294.backend_example1_staging_www.pipe_out                    0         0.00 Piped bytes to backend
VBE.reload_20210129_145927_21294.backend_example1_staging_www.pipe_in                     0         0.00 Piped bytes from backend
VBE.reload_20210129_145927_21294.backend_example1_staging_www.conn                        0          .   Concurrent connections to backend
VBE.reload_20210129_145927_21294.backend_example1_staging_www.req                         0         0.00 Backend requests sent
VBE.reload_20210129_145927_21294.backend_example1_staging_www.unhealthy                   0         0.00 Fetches not attempted due to backend being unhealthy
VBE.reload_20210129_145927_21294.backend_example1_staging_www.busy                        0         0.00 Fetches not attempted due to backend being busy
VBE.reload_20210129_145927_21294.backend_example1_staging_www.fail                        0         0.00 Connections failed
VBE.reload_20210129_145927_21294.backend_example1_staging_www.fail_eacces                 0         0.00 Connections failed with EACCES or EPERM
VBE.reload_20210129_145927_21294.backend_example1_staging_www.fail_eaddrnotavail            0         0.00 Connections failed with EADDRNOTAVAIL
VBE.reload_20210129_145927_21294.backend_example1_staging_www.fail_econnrefused             0         0.00 Connections failed with ECONNREFUSED
VBE.reload_20210129_145927_21294.backend_example1_staging_www.fail_enetunreach              0         0.00 Connections failed with ENETUNREACH
VBE.reload_20210129_145927_21294.backend_example1_staging_www.fail_etimedout                0         0.00 Connections failed ETIMEDOUT
VBE.reload_20210129_145927_21294.backend_example1_staging_www.fail_other                    0         0.00 Connections failed for other reason
VBE.reload_20210129_145927_21294.backend_example1_staging_www.helddown                      0         0.00 Connection opens not attempted
VBE.reload_20210201_103812_901.backend_example1_www.happy                        18446744073709551615          .   Happy health probes
VBE.reload_20210201_103812_901.backend_example1_www.bereq_hdrbytes                    1009663        22.85 Request header bytes
VBE.reload_20210201_103812_901.backend_example1_www.bereq_bodybytes                       386         0.01 Request body bytes
VBE.reload_20210201_103812_901.backend_example1_www.beresp_hdrbytes                    402996         9.12 Response header bytes
VBE.reload_20210201_103812_901.backend_example1_www.beresp_bodybytes                 59710414      1351.50 Response body bytes
VBE.reload_20210201_103812_901.backend_example1_www.pipe_hdrbytes                           0         0.00 Pipe request header bytes
VBE.reload_20210201_103812_901.backend_example1_www.pipe_out                                0         0.00 Piped bytes to backend
VBE.reload_20210201_103812_901.backend_example1_www.pipe_in                                 0         0.00 Piped bytes from backend
VBE.reload_20210201_103812_901.backend_example1_www.conn                                    0          .   Concurrent connections to backend
VBE.reload_20210201_103812_901.backend_example1_www.req                                  1321         0.03 Backend requests sent
VBE.reload_20210201_103812_901.backend_example1_www.unhealthy                               0         0.00 Fetches not attempted due to backend being unhealthy
VBE.reload_20210201_103812_901.backend_example1_www.busy                                    0         0.00 Fetches not attempted due to backend being busy
VBE.reload_20210201_103812_901.backend_example1_www.fail                                    0         0.00 Connections failed
VBE.reload_20210201_103812_901.backend_example1_www.fail_eacces                             0         0.00 Connections failed with EACCES or EPERM
VBE.reload_20210201_103812_901.backend_example1_www.fail_eaddrnotavail                      0         0.00 Connections failed with EADDRNOTAVAIL
VBE.reload_20210201_103812_901.backend_example1_www.fail_econnrefused                       0         0.00 Connections failed with ECONNREFUSED
VBE.reload_20210201_103812_901.backend_example1_www.fail_enetunreach                        0         0.00 Connections failed with ENETUNREACH
VBE.reload_20210201_103812_901.backend_example1_www.fail_etimedout                          0         0.00 Connections failed ETIMEDOUT
VBE.reload_20210201_103812_901.backend_example1_www.fail_other                              0         0.00 Connections failed for other reason
VBE.reload_20210201_103812_901.backend_example1_www.helddown                                0         0.00 Connection opens not attempted
VBE.reload_20210201_103812_901.backend_example1_staging_www.happy                18446744073709551615          .   Happy health probes
VBE.reload_20210201_103812_901.backend_example1_staging_www.bereq_hdrbytes                  0         0.00 Request header bytes
VBE.reload_20210201_103812_901.backend_example1_staging_www.bereq_bodybytes                 0         0.00 Request body bytes
VBE.reload_20210201_103812_901.backend_example1_staging_www.beresp_hdrbytes                 0         0.00 Response header bytes
VBE.reload_20210201_103812_901.backend_example1_staging_www.beresp_bodybytes                0         0.00 Response body bytes
VBE.reload_20210201_103812_901.backend_example1_staging_www.pipe_hdrbytes                   0         0.00 Pipe request header bytes
VBE.reload_20210201_103812_901.backend_example1_staging_www.pipe_out                        0         0.00 Piped bytes to backend
VBE.reload_20210201_103812_901.backend_example1_staging_www.pipe_in                         0         0.00 Piped bytes from backend
VBE.reload_20210201_103812_901.backend_example1_staging_www.conn                            0          .   Concurrent connections to backend
VBE.reload_20210201_103812_901.backend_example1_staging_www.req                             0         0.00 Backend requests sent
VBE.reload_20210201_103812_901.backend_example1_staging_www.unhealthy                       0         0.00 Fetches not attempted due to backend being unhealthy
VBE.reload_20210201_103812_901.backend_example1_staging_www.busy                            0         0.00 Fetches not attempted due to backend being busy
VBE.reload_20210201_103812_901.backend_example1_staging_www.fail                            0         0.00 Connections failed
VBE.reload_20210201_103812_901.backend_example1_staging_www.fail_eacces                     0         0.00 Connections failed with EACCES or EPERM
VBE.reload_20210201_103812_901.backend_example1_staging_www.fail_eaddrnotavail              0         0.00 Connections failed with EADDRNOTAVAIL
VBE.reload_20210201_103812_901.backend_example1_staging_www.fail_econnrefused               0         0.00 Connections failed with ECONNREFUSED
VBE.reload_20210201_103812_901.backend_example1_staging_www.fail_enetunreach                0         0.00 Connections failed with ENETUNREACH
VBE.reload_20210201_103812_901.backend_example1_staging_www.fail_etimedout                  0         0.00 Connections failed ETIMEDOUT
VBE.reload_20210201_103812_901.backend_example1_staging_www.fail_other                      0         0.00 Connections failed for other reason
VBE.reload_20210201_103812_901.backend_example1_staging_www.helddown                        0         0.00 Connection opens not attempted

I will now try to disable THP and see how it behaves then. If it turns out to be the combination of THP with jemalloc (which appears to be the default allocator in the official packages), wouldn't it make sense to add a warning / hint to the https://varnish-cache.org/docs/trunk/users-guide/storage-backends.html#malloc documentation? Is this expected behavior, or a bug in jemalloc or THP? It would also be interesting to know whether it matters if it's an AMD or an Intel CPU/machine.
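As a side note, jemalloc (5.x and later) can also be told to avoid THP on its own, independent of the system-wide sysfs setting — a sketch, assuming the packaged varnishd is actually linked against jemalloc >= 5.0; the option names are from jemalloc's tuning documentation:

```shell
# Ask jemalloc itself not to use transparent huge pages
# (no effect if varnishd uses a different allocator):
export MALLOC_CONF="thp:never,metadata_thp:disabled"
# ...then start varnishd in this environment.
```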

nigoroll commented 3 years ago

Reopen: should we doc this?

idl0r commented 3 years ago

Update: I've disabled THP by using "transparent_hugepage=never" as a kernel boot parameter.

# cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]

So "never" is selected.
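For anyone checking this on a live box: the active mode is the bracketed token in that sysfs file, and THP can also be switched off at runtime without a reboot (new mappings only, so restart varnishd afterwards; assumes a Linux kernel with THP support and root access):

```shell
# Runtime alternative to the transparent_hugepage=never boot parameter
# (affects new mappings only; restart varnishd afterwards):
#   echo never | sudo tee /sys/kernel/mm/transparent_hugepage/enabled

# Helper that extracts the active (bracketed) mode from the sysfs file,
# e.g. "always madvise [never]" -> "never":
parse_thp_mode() {
    sed -e 's/.*\[\(.*\)\].*/\1/' <<< "$1"
}

parse_thp_mode "$(cat /sys/kernel/mm/transparent_hugepage/enabled 2>/dev/null || echo '[unknown]')"
```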

Varnish is configured with "-s malloc,24G", yet the varnishd process is currently using about 100G.
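That puts the overhead at roughly 4x, which is where the ">4x" in the title comes from. A quick way to compute the ratio from resident memory and the configured (or `SMA.s0.g_bytes`-accounted) cache size — a sketch, using the round numbers from this report, not exact measurements:

```shell
# Rough overhead ratio: resident set size / cache bytes, both in bytes.
overhead_ratio() {
    awk -v rss="$1" -v cache="$2" 'BEGIN { printf "%.1f\n", rss / cache }'
}

# ~100G resident vs. the 24G configured with -s malloc,24G:
overhead_ratio $((100 * 1024**3)) $((24 * 1024**3))   # prints: 4.2
```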

varnishstat:

MGT.uptime               76427         1.00 Management process uptime
MGT.child_start              1         0.00 Child process started
MGT.child_exit               0         0.00 Child process normal exit
MGT.child_stop               0         0.00 Child process unexpected exit
MGT.child_died               0         0.00 Child process died (signal)
MGT.child_dump               0         0.00 Child process core dumped
MGT.child_panic              0         0.00 Child process panic
MAIN.summs              742672         9.72 stat summ operations
MAIN.uptime              76428         1.00 Child process uptime
MAIN.sess_conn          279475         3.66 Sessions accepted
MAIN.sess_fail               0         0.00 Session accept failures
MAIN.sess_fail_econnaborted            0         0.00 Session accept failures: connection aborted
MAIN.sess_fail_eintr                   0         0.00 Session accept failures: interrupted system call
MAIN.sess_fail_emfile                  0         0.00 Session accept failures: too many open files
MAIN.sess_fail_ebadf                   0         0.00 Session accept failures: bad file descriptor
MAIN.sess_fail_enomem                  0         0.00 Session accept failures: not enough memory
MAIN.sess_fail_other                   0         0.00 Session accept failures: other
MAIN.client_req_400                    0         0.00 Client requests received, subject to 400 errors
MAIN.client_req_417                    0         0.00 Client requests received, subject to 417 errors
MAIN.client_req                   279521         3.66 Good client requests received
MAIN.cache_hit                    235759         3.08 Cache hits
MAIN.cache_hit_grace               14127         0.18 Cache grace hits
MAIN.cache_hitpass                     0         0.00 Cache hits for pass.
MAIN.cache_hitmiss                 22505         0.29 Cache hits for miss.
MAIN.cache_miss                    43365         0.57 Cache misses
MAIN.backend_conn                   2558         0.03 Backend conn. success
MAIN.backend_unhealthy                 0         0.00 Backend conn. not attempted
MAIN.backend_busy                      0         0.00 Backend conn. too many
MAIN.backend_fail                      0         0.00 Backend conn. failures
MAIN.backend_reuse                 55196         0.72 Backend conn. reuses
MAIN.backend_recycle               57684         0.75 Backend conn. recycles
MAIN.backend_retry                     0         0.00 Backend conn. retry
MAIN.fetch_head                        0         0.00 Fetch no body (HEAD)
MAIN.fetch_length                  46077         0.60 Fetch with Length
MAIN.fetch_chunked                   381         0.00 Fetch chunked
MAIN.fetch_eof                         0         0.00 Fetch EOF
MAIN.fetch_bad                         0         0.00 Fetch bad T-E
MAIN.fetch_none                     1749         0.02 Fetch no body
MAIN.fetch_1xx                         0         0.00 Fetch no body (1xx)
MAIN.fetch_204                         0         0.00 Fetch no body (204)
MAIN.fetch_304                      9547         0.12 Fetch no body (304)
MAIN.fetch_failed                     89         0.00 Fetch failed (all causes)
MAIN.fetch_no_thread                   0         0.00 Fetch failed (no thread)
MAIN.pools                             2          .   Number of thread pools
MAIN.threads                         200          .   Total number of threads
MAIN.threads_limited                   0         0.00 Threads hit max
MAIN.threads_created                 200         0.00 Threads created
MAIN.threads_destroyed                 0         0.00 Threads destroyed
MAIN.threads_failed                    0         0.00 Thread creation failed
MAIN.thread_queue_len                  0          .   Length of session queue
MAIN.busy_sleep                       46         0.00 Number of requests sent to sleep on busy objhdr
MAIN.busy_wakeup                      46         0.00 Number of requests woken after sleep on busy objhdr
MAIN.busy_killed                       0         0.00 Number of requests killed after sleep on busy objhdr
MAIN.sess_queued                       0         0.00 Sessions queued for thread
MAIN.sess_dropped                      0         0.00 Sessions dropped for thread
MAIN.req_dropped                       0         0.00 Requests dropped
MAIN.n_object                      26969          .   object structs made
MAIN.n_vampireobject                   0          .   unresurrected objects
MAIN.n_objectcore                  27002          .   objectcore structs made
MAIN.n_objecthead                  12858          .   objecthead structs made
MAIN.n_backend                        26          .   Number of backends
MAIN.n_expired                         0         0.00 Number of expired objects
MAIN.n_lru_nuked                    5437         0.07 Number of LRU nuked objects
MAIN.n_lru_moved                  165399         2.16 Number of LRU moved objects
MAIN.n_lru_limited                    89         0.00 Reached nuke_limit
MAIN.losthdr                           0         0.00 HTTP header overflows
MAIN.s_sess                       279475         3.66 Total sessions seen
MAIN.n_pipe                            0          .   Number of ongoing pipe sessions
MAIN.pipe_limited                      0         0.00 Pipes hit pipe_sess_max
MAIN.s_pipe                            0         0.00 Total pipe sessions seen
MAIN.s_pass                          351         0.00 Total pass-ed requests seen
MAIN.s_fetch                       43716         0.57 Total backend fetches initiated
MAIN.s_synth                           0         0.00 Total synthetic responses made
MAIN.s_req_hdrbytes            189075879      2473.91 Request header bytes
MAIN.s_req_bodybytes            36900569       482.81 Request body bytes
MAIN.s_resp_hdrbytes           112573679      1472.94 Response header bytes
MAIN.s_resp_bodybytes       998539651124  13065102.46 Response body bytes
MAIN.s_pipe_hdrbytes                   0         0.00 Pipe request header bytes
MAIN.s_pipe_in                         0         0.00 Piped bytes from client
MAIN.s_pipe_out                        0         0.00 Piped bytes to client
MAIN.sess_closed                  279475         3.66 Session Closed
MAIN.sess_closed_err               49037         0.64 Session Closed with error
MAIN.sess_readahead                    0         0.00 Session Read Ahead
MAIN.sess_herd                         0         0.00 Session herd
MAIN.sc_rem_close                   6173         0.08 Session OK  REM_CLOSE
MAIN.sc_req_close                 223989         2.93 Session OK  REQ_CLOSE
MAIN.sc_req_http10                  4072         0.05 Session Err REQ_HTTP10
MAIN.sc_rx_bad                         0         0.00 Session Err RX_BAD
MAIN.sc_rx_body                        0         0.00 Session Err RX_BODY
MAIN.sc_rx_junk                        0         0.00 Session Err RX_JUNK
MAIN.sc_rx_overflow                    0         0.00 Session Err RX_OVERFLOW
MAIN.sc_rx_timeout                     0         0.00 Session Err RX_TIMEOUT
MAIN.sc_rx_close_idle                  0         0.00 Session Err RX_CLOSE_IDLE
MAIN.sc_tx_pipe                        0         0.00 Session OK  TX_PIPE
MAIN.sc_tx_error                   44963         0.59 Session Err TX_ERROR
MAIN.sc_tx_eof                       253         0.00 Session OK  TX_EOF
MAIN.sc_resp_close                     0         0.00 Session OK  RESP_CLOSE
MAIN.sc_overload                       0         0.00 Session Err OVERLOAD
MAIN.sc_pipe_overflow                  0         0.00 Session Err PIPE_OVERFLOW
MAIN.sc_range_short                    2         0.00 Session Err RANGE_SHORT
MAIN.sc_req_http20                     0         0.00 Session Err REQ_HTTP20
MAIN.sc_vcl_failure                    0         0.00 Session Err VCL_FAILURE
MAIN.client_resp_500                   0         0.00 Delivery failed due to insufficient workspace.
MAIN.ws_backend_overflow               0         0.00 workspace_backend overflows
MAIN.ws_client_overflow                0         0.00 workspace_client overflows
MAIN.ws_thread_overflow                0         0.00 workspace_thread overflows
MAIN.ws_session_overflow               0         0.00 workspace_session overflows
MAIN.shm_records                26237719       343.30 SHM records
MAIN.shm_writes                  2173392        28.44 SHM writes
MAIN.shm_flushes                     155         0.00 SHM flushes due to overflow
MAIN.shm_cont                       6709         0.09 SHM MTX contention
MAIN.shm_cycles                       12         0.00 SHM cycles through buffer
MAIN.backend_req                   57754         0.76 Backend requests made
MAIN.n_vcl                            13          .   Number of loaded VCLs in total
MAIN.n_vcl_avail                      13          .   Number of VCLs available
MAIN.n_vcl_discard                     0          .   Number of discarded VCLs
MAIN.vcl_fail                          0         0.00 VCL failures
MAIN.bans                              1          .   Count of bans
MAIN.bans_completed                    1          .   Number of bans marked 'completed'
MAIN.bans_obj                          0          .   Number of bans using obj.*
MAIN.bans_req                          0          .   Number of bans using req.*
MAIN.bans_added                        1         0.00 Bans added
MAIN.bans_deleted                      0         0.00 Bans deleted
MAIN.bans_tested                       0         0.00 Bans tested against objects (lookup)
MAIN.bans_obj_killed                   0         0.00 Objects killed by bans (lookup)
MAIN.bans_lurker_tested                0         0.00 Bans tested against objects (lurker)
MAIN.bans_tests_tested                 0         0.00 Ban tests tested against objects (lookup)
MAIN.bans_lurker_tests_tested            0         0.00 Ban tests tested against objects (lurker)
MAIN.bans_lurker_obj_killed              0         0.00 Objects killed by bans (lurker)
MAIN.bans_lurker_obj_killed_cutoff            0         0.00 Objects killed by bans for cutoff (lurker)
MAIN.bans_dups                                0         0.00 Bans superseded by other bans
MAIN.bans_lurker_contention                   0         0.00 Lurker gave way for lookup
MAIN.bans_persisted_bytes                    16          .   Bytes used by the persisted ban lists
MAIN.bans_persisted_fragmentation             0          .   Extra bytes in persisted ban lists due to fragmentation
MAIN.n_purges                                 0         0.00 Number of purge operations executed
MAIN.n_obj_purged                             0         0.00 Number of purged objects
MAIN.exp_mailed                           87837         1.15 Number of objects mailed to expiry thread
MAIN.exp_received                         87837         1.15 Number of objects received by expiry thread
MAIN.hcb_nolock                          279124         3.65 HCB Lookups without lock
MAIN.hcb_lock                             18157         0.24 HCB Lookups with lock
MAIN.hcb_insert                           18157         0.24 HCB Inserts
MAIN.esi_errors                               0         0.00 ESI parse errors (unlock)
MAIN.esi_warnings                             0         0.00 ESI parse warnings (unlock)
MAIN.vmods                                    1          .   Loaded VMODs
MAIN.n_gzip                                   0         0.00 Gzip operations
MAIN.n_gunzip                              1280         0.02 Gunzip operations
MAIN.n_test_gunzip                        21825         0.29 Test gunzip operations
LCK.backend.creat                            26         0.00 Created locks
LCK.backend.destroy                           0         0.00 Destroyed locks
LCK.backend.locks                        115508         1.51 Lock Operations
LCK.backend.dbg_busy                          0         0.00 Contended lock operations
LCK.backend.dbg_try_fail                      0         0.00 Contended trylock operations
LCK.ban.creat                                 1         0.00 Created locks
LCK.ban.destroy                               0         0.00 Destroyed locks
LCK.ban.locks                             90925         1.19 Lock Operations
LCK.ban.dbg_busy                              0         0.00 Contended lock operations
LCK.ban.dbg_try_fail                          0         0.00 Contended trylock operations
LCK.busyobj.creat                         58099         0.76 Created locks
LCK.busyobj.destroy                       58067         0.76 Destroyed locks
LCK.busyobj.locks                       6509092        85.17 Lock Operations
LCK.busyobj.dbg_busy                          0         0.00 Contended lock operations
LCK.busyobj.dbg_try_fail                      0         0.00 Contended trylock operations
LCK.cli.creat                                 1         0.00 Created locks
LCK.cli.destroy                               0         0.00 Destroyed locks
LCK.cli.locks                             25546         0.33 Lock Operations
LCK.cli.dbg_busy                              0         0.00 Contended lock operations
LCK.cli.dbg_try_fail                          0         0.00 Contended trylock operations
LCK.exp.creat                                 1         0.00 Created locks
LCK.exp.destroy                               0         0.00 Destroyed locks
LCK.exp.locks                            323795         4.24 Lock Operations
LCK.exp.dbg_busy                              0         0.00 Contended lock operations
LCK.exp.dbg_try_fail                          0         0.00 Contended trylock operations
LCK.hcb.creat                                 1         0.00 Created locks
LCK.hcb.destroy                               0         0.00 Destroyed locks
LCK.hcb.locks                             24067         0.31 Lock Operations
LCK.hcb.dbg_busy                              0         0.00 Contended lock operations
LCK.hcb.dbg_try_fail                          0         0.00 Contended trylock operations
LCK.lru.creat                                 2         0.00 Created locks
LCK.lru.destroy                               0         0.00 Destroyed locks
LCK.lru.locks                            258673         3.38 Lock Operations
LCK.lru.dbg_busy                              0         0.00 Contended lock operations
LCK.lru.dbg_try_fail                          0         0.00 Contended trylock operations
LCK.mempool.creat                             5         0.00 Created locks
LCK.mempool.destroy                           0         0.00 Destroyed locks
LCK.mempool.locks                       1636269        21.41 Lock Operations
LCK.mempool.dbg_busy                          0         0.00 Contended lock operations
LCK.mempool.dbg_try_fail                      0         0.00 Contended trylock operations
LCK.objhdr.creat                          18213         0.24 Created locks
LCK.objhdr.destroy                         5355         0.07 Destroyed locks
LCK.objhdr.locks                        1198300        15.68 Lock Operations
LCK.objhdr.dbg_busy                           0         0.00 Contended lock operations
LCK.objhdr.dbg_try_fail                       0         0.00 Contended trylock operations
LCK.perpool.creat                             2         0.00 Created locks
LCK.perpool.destroy                           0         0.00 Destroyed locks
LCK.perpool.locks                       1114732        14.59 Lock Operations
LCK.perpool.dbg_busy                          0         0.00 Contended lock operations
LCK.perpool.dbg_try_fail                      0         0.00 Contended trylock operations
LCK.pipestat.creat                            1         0.00 Created locks
LCK.pipestat.destroy                          0         0.00 Destroyed locks
LCK.pipestat.locks                            0         0.00 Lock Operations
LCK.pipestat.dbg_busy                         0         0.00 Contended lock operations
LCK.pipestat.dbg_try_fail                     0         0.00 Contended trylock operations
LCK.probe.creat                               1         0.00 Created locks
LCK.probe.destroy                             0         0.00 Destroyed locks
LCK.probe.locks                          136397         1.78 Lock Operations
LCK.probe.dbg_busy                            0         0.00 Contended lock operations
LCK.probe.dbg_try_fail                        0         0.00 Contended trylock operations
LCK.sess.creat                           279475         3.66 Created locks
LCK.sess.destroy                         279472         3.66 Destroyed locks
LCK.sess.locks                           395296         5.17 Lock Operations
LCK.sess.dbg_busy                             0         0.00 Contended lock operations
LCK.sess.dbg_try_fail                         0         0.00 Contended trylock operations
LCK.tcp_pool.creat                            3         0.00 Created locks
LCK.tcp_pool.destroy                          0         0.00 Destroyed locks
LCK.tcp_pool.locks                       228444         2.99 Lock Operations
LCK.tcp_pool.dbg_busy                         0         0.00 Contended lock operations
LCK.tcp_pool.dbg_try_fail                     0         0.00 Contended trylock operations
LCK.vbe.creat                                 1         0.00 Created locks
LCK.vbe.destroy                               0         0.00 Destroyed locks
LCK.vbe.locks                             59699         0.78 Lock Operations
LCK.vbe.dbg_busy                              0         0.00 Contended lock operations
LCK.vbe.dbg_try_fail                          0         0.00 Contended trylock operations
LCK.vcapace.creat                             1         0.00 Created locks
LCK.vcapace.destroy                           0         0.00 Destroyed locks
LCK.vcapace.locks                             0         0.00 Lock Operations
LCK.vcapace.dbg_busy                          0         0.00 Contended lock operations
LCK.vcapace.dbg_try_fail                      0         0.00 Contended trylock operations
LCK.vcl.creat                                 1         0.00 Created locks
LCK.vcl.destroy                               0         0.00 Destroyed locks
LCK.vcl.locks                            120257         1.57 Lock Operations
LCK.vcl.dbg_busy                              0         0.00 Contended lock operations
LCK.vcl.dbg_try_fail                          0         0.00 Contended trylock operations
LCK.vxid.creat                                1         0.00 Created locks
LCK.vxid.destroy                              0         0.00 Destroyed locks
LCK.vxid.locks                               59         0.00 Lock Operations
LCK.vxid.dbg_busy                             0         0.00 Contended lock operations
LCK.vxid.dbg_try_fail                         0         0.00 Contended trylock operations
LCK.waiter.creat                              2         0.00 Created locks
LCK.waiter.destroy                            0         0.00 Destroyed locks
LCK.waiter.locks                         209639         2.74 Lock Operations
LCK.waiter.dbg_busy                           0         0.00 Contended lock operations
LCK.waiter.dbg_try_fail                       0         0.00 Contended trylock operations
LCK.wq.creat                                  1         0.00 Created locks
LCK.wq.destroy                                0         0.00 Destroyed locks
LCK.wq.locks                             110685         1.45 Lock Operations
LCK.wq.dbg_busy                               0         0.00 Contended lock operations
LCK.wq.dbg_try_fail                           0         0.00 Contended trylock operations
LCK.wstat.creat                               1         0.00 Created locks
LCK.wstat.destroy                             0         0.00 Destroyed locks
LCK.wstat.locks                          447231         5.85 Lock Operations
LCK.wstat.dbg_busy                            0         0.00 Contended lock operations
LCK.wstat.dbg_try_fail                        0         0.00 Contended trylock operations
MEMPOOL.busyobj.live                          0          .   In use
MEMPOOL.busyobj.pool                         10          .   In Pool
MEMPOOL.busyobj.sz_wanted                 65536          .   Size requested
MEMPOOL.busyobj.sz_actual                 65504          .   Size allocated
MEMPOOL.busyobj.allocs                    57754         0.76 Allocations
MEMPOOL.busyobj.frees                     57754         0.76 Frees
MEMPOOL.busyobj.recycle                   57754         0.76 Recycled from pool
MEMPOOL.busyobj.timeout                    5429         0.07 Timed out from pool
MEMPOOL.busyobj.toosmall                      0         0.00 Too small to recycle
MEMPOOL.busyobj.surplus                       0         0.00 Too many for pool
MEMPOOL.busyobj.randry                        0         0.00 Pool ran dry
MEMPOOL.req0.live                             7          .   In use
MEMPOOL.req0.pool                            10          .   In Pool
MEMPOOL.req0.sz_wanted                    65536          .   Size requested
MEMPOOL.req0.sz_actual                    65504          .   Size allocated
MEMPOOL.req0.allocs                      139735         1.83 Allocations
MEMPOOL.req0.frees                       139728         1.83 Frees
MEMPOOL.req0.recycle                     139735         1.83 Recycled from pool
MEMPOOL.req0.timeout                      15351         0.20 Timed out from pool
MEMPOOL.req0.toosmall                         0         0.00 Too small to recycle
MEMPOOL.req0.surplus                          0         0.00 Too many for pool
MEMPOOL.req0.randry                           0         0.00 Pool ran dry
MEMPOOL.sess0.live                            7          .   In use
MEMPOOL.sess0.pool                           10          .   In Pool
MEMPOOL.sess0.sz_wanted                     768          .   Size requested
MEMPOOL.sess0.sz_actual                     736          .   Size allocated
MEMPOOL.sess0.allocs                     139735         1.83 Allocations
MEMPOOL.sess0.frees                      139728         1.83 Frees
MEMPOOL.sess0.recycle                    139735         1.83 Recycled from pool
MEMPOOL.sess0.timeout                     15545         0.20 Timed out from pool
MEMPOOL.sess0.toosmall                        0         0.00 Too small to recycle
MEMPOOL.sess0.surplus                         0         0.00 Too many for pool
MEMPOOL.sess0.randry                          0         0.00 Pool ran dry
LCK.sma.creat                                 2         0.00 Created locks
LCK.sma.destroy                               0         0.00 Destroyed locks
LCK.sma.locks                            419436         5.49 Lock Operations
LCK.sma.dbg_busy                              0         0.00 Contended lock operations
LCK.sma.dbg_try_fail                          0         0.00 Contended trylock operations
SMA.s0.c_req                             247646         3.24 Allocator requests
SMA.s0.c_fail                            184958         2.42 Allocator failures
SMA.s0.c_bytes                     277654852719   3632894.39 Bytes allocated
SMA.s0.c_freed                     252830291671   3308084.62 Bytes freed
SMA.s0.g_alloc                             7705          .   Allocations outstanding
SMA.s0.g_bytes                      24824561048          .   Bytes outstanding
SMA.s0.g_space                        945242728          .   Bytes available
MEMPOOL.req1.live                             3          .   In use
MEMPOOL.req1.pool                            10          .   In Pool
MEMPOOL.req1.sz_wanted                    65536          .   Size requested
MEMPOOL.req1.sz_actual                    65504          .   Size allocated
MEMPOOL.req1.allocs                      139750         1.83 Allocations
MEMPOOL.req1.frees                       139747         1.83 Frees
MEMPOOL.req1.recycle                     139750         1.83 Recycled from pool
MEMPOOL.req1.timeout                      15237         0.20 Timed out from pool
MEMPOOL.req1.toosmall                         0         0.00 Too small to recycle
MEMPOOL.req1.surplus                          0         0.00 Too many for pool
MEMPOOL.req1.randry                           0         0.00 Pool ran dry
SMA.Transient.c_req                       70289         0.92 Allocator requests
SMA.Transient.c_fail                          0         0.00 Allocator failures
SMA.Transient.c_bytes                 495325314      6480.94 Bytes allocated
SMA.Transient.c_freed                 484306754      6336.77 Bytes freed
SMA.Transient.g_alloc                     23771          .   Allocations outstanding
SMA.Transient.g_bytes                  11018560          .   Bytes outstanding
SMA.Transient.g_space                         0          .   Bytes available
MEMPOOL.sess1.live                            3          .   In use
MEMPOOL.sess1.pool                           10          .   In Pool
MEMPOOL.sess1.sz_wanted                     768          .   Size requested
MEMPOOL.sess1.sz_actual                     736          .   Size allocated
MEMPOOL.sess1.allocs                     139750         1.83 Allocations
MEMPOOL.sess1.frees                      139747         1.83 Frees
MEMPOOL.sess1.recycle                    139750         1.83 Recycled from pool
MEMPOOL.sess1.timeout                     15520         0.20 Timed out from pool
MEMPOOL.sess1.toosmall                        0         0.00 Too small to recycle
MEMPOOL.sess1.surplus                         0         0.00 Too many for pool
MEMPOOL.sess1.randry                          0         0.00 Pool ran dry
VBE.reload_20210201_164825_46758.backend_example1_www.happy 18446744073709551615          .   Happy health probes
VBE.reload_20210201_164825_46758.backend_example1_www.bereq_hdrbytes     22529513       294.78 Request header bytes
VBE.reload_20210201_164825_46758.backend_example1_www.bereq_bodybytes        15432         0.20 Request body bytes
VBE.reload_20210201_164825_46758.backend_example1_www.beresp_hdrbytes     12154726       159.03 Response header bytes
VBE.reload_20210201_164825_46758.backend_example1_www.beresp_bodybytes 102136624051   1336377.04 Response body bytes
VBE.reload_20210201_164825_46758.backend_example1_www.pipe_hdrbytes               0         0.00 Pipe request header bytes
VBE.reload_20210201_164825_46758.backend_example1_www.pipe_out                    0         0.00 Piped bytes to backend
VBE.reload_20210201_164825_46758.backend_example1_www.pipe_in                     0         0.00 Piped bytes from backend
VBE.reload_20210201_164825_46758.backend_example1_www.conn                        0          .   Concurrent connections to backend
VBE.reload_20210201_164825_46758.backend_example1_www.req                     35534         0.46 Backend requests sent
VBE.reload_20210201_164825_46758.backend_example1_www.unhealthy                   0         0.00 Fetches not attempted due to backend being unhealthy
VBE.reload_20210201_164825_46758.backend_example1_www.busy                        0         0.00 Fetches not attempted due to backend being busy
VBE.reload_20210201_164825_46758.backend_example1_www.fail                        0         0.00 Connections failed
VBE.reload_20210201_164825_46758.backend_example1_www.fail_eacces                 0         0.00 Connections failed with EACCES or EPERM
VBE.reload_20210201_164825_46758.backend_example1_www.fail_eaddrnotavail            0         0.00 Connections failed with EADDRNOTAVAIL
VBE.reload_20210201_164825_46758.backend_example1_www.fail_econnrefused             0         0.00 Connections failed with ECONNREFUSED
VBE.reload_20210201_164825_46758.backend_example1_www.fail_enetunreach              0         0.00 Connections failed with ENETUNREACH
VBE.reload_20210201_164825_46758.backend_example1_www.fail_etimedout                0         0.00 Connections failed ETIMEDOUT
VBE.reload_20210201_164825_46758.backend_example1_www.fail_other                    0         0.00 Connections failed for other reason
VBE.reload_20210201_164825_46758.backend_example1_www.helddown                      0         0.00 Connection opens not attempted
VBE.reload_20210201_164825_46758.backend_example1_staging_www.happy      18446744073709551615          .   Happy health probes
VBE.reload_20210201_164825_46758.backend_example1_staging_www.bereq_hdrbytes            0         0.00 Request header bytes
VBE.reload_20210201_164825_46758.backend_example1_staging_www.bereq_bodybytes            0         0.00 Request body bytes
VBE.reload_20210201_164825_46758.backend_example1_staging_www.beresp_hdrbytes            0         0.00 Response header bytes
VBE.reload_20210201_164825_46758.backend_example1_staging_www.beresp_bodybytes            0         0.00 Response body bytes
VBE.reload_20210201_164825_46758.backend_example1_staging_www.pipe_hdrbytes               0         0.00 Pipe request header bytes
VBE.reload_20210201_164825_46758.backend_example1_staging_www.pipe_out                    0         0.00 Piped bytes to backend
VBE.reload_20210201_164825_46758.backend_example1_staging_www.pipe_in                     0         0.00 Piped bytes from backend
VBE.reload_20210201_164825_46758.backend_example1_staging_www.conn                        0          .   Concurrent connections to backend
VBE.reload_20210201_164825_46758.backend_example1_staging_www.req                         0         0.00 Backend requests sent
VBE.reload_20210201_164825_46758.backend_example1_staging_www.unhealthy                   0         0.00 Fetches not attempted due to backend being unhealthy
VBE.reload_20210201_164825_46758.backend_example1_staging_www.busy                        0         0.00 Fetches not attempted due to backend being busy
VBE.reload_20210201_164825_46758.backend_example1_staging_www.fail                        0         0.00 Connections failed
VBE.reload_20210201_164825_46758.backend_example1_staging_www.fail_eacces                 0         0.00 Connections failed with EACCES or EPERM
VBE.reload_20210201_164825_46758.backend_example1_staging_www.fail_eaddrnotavail            0         0.00 Connections failed with EADDRNOTAVAIL
VBE.reload_20210201_164825_46758.backend_example1_staging_www.fail_econnrefused             0         0.00 Connections failed with ECONNREFUSED
VBE.reload_20210201_164825_46758.backend_example1_staging_www.fail_enetunreach              0         0.00 Connections failed with ENETUNREACH
VBE.reload_20210201_164825_46758.backend_example1_staging_www.fail_etimedout                0         0.00 Connections failed ETIMEDOUT
VBE.reload_20210201_164825_46758.backend_example1_staging_www.fail_other                    0         0.00 Connections failed for other reason
VBE.reload_20210201_164825_46758.backend_example1_staging_www.helddown                      0         0.00 Connection opens not attempted
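A quick sanity check on the SMA.s0 gauges above: bytes outstanding plus bytes available sum exactly to 24 GiB, which matches the -s malloc,24G configuration mentioned later in the thread (not the 32G from the original report). That means varnish's own storage accounting is within its limit, and the excess RSS lives outside it. A minimal check, using the figures copied from the dump:

```python
# Figures copied verbatim from the varnishstat dump above.
g_bytes = 24_824_561_048  # SMA.s0.g_bytes: bytes outstanding
g_space = 945_242_728     # SMA.s0.g_space: bytes available
total = g_bytes + g_space
print(total, total == 24 * 2**30)  # → 25769803776 True
```

So the cache itself is fully accounted for; the memory growth has to come from allocator overhead or fragmentation on top of it.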


Could you confirm that I properly disabled THP and that this huge memory consumption does not seem related to THP? Is this unexpected, a bug? Do you think it could matter whether it's an AMD vs. an Intel host?

nigoroll commented 3 years ago

@idl0r This is at least very interesting and, based on what I said before, obviously a scenario I have not seen yet. I have to say that for our clients, globally disabling THP had not been an option, so the cases I have seen so far were solved by setting MALLOC_CONF=thp:never (as an environment variable when starting varnish, or exported in a shell script containing the varnish command line). So maybe I was wrong in assuming that globally disabling THP would have the same effect. That said

nigoroll commented 3 years ago

Oh, and one other aspect: which object sizes do you serve? Do they vary a lot?

idl0r commented 3 years ago

We're running the stock Debian libjemalloc2 5.1.0-3. I'll try to figure out whether that setting already works with it.

pgrep -f example1.vcl
2334
3355

grep -i ppid /proc/3355/status
PPid:   2334

pmap -XX 3355 &> pmap.txt

pmap.txt

I've also added an Intel machine, to see how it behaves there, just to rule that out.

The object sizes: I parsed yesterday's log and picked the top 50 sizes (rounded up). They are mostly <5MB, with some >100MB ones; the largest of the top 50 is ~326MB. Beyond the top 50 requests, the mean sizes grow. I can provide more details if needed.

nigoroll commented 3 years ago

Thank you for the details, @idl0r. Right now I need to prioritize other work, but I will look at this later. The object sizes you report do vary quite a lot, so we could be seeing "first order" fragmentation (while the THP/jemalloc suspicion would be "second order", outside varnish). To check whether that is the case, you could try setting fetch_maxchunksize considerably smaller than the default, for example to 8MB (to accommodate the "mostly <5MB" objects without additional impact).

rezan commented 3 years ago

I believe you need to downgrade to jemalloc 3.6.0.

You can find the sources here: https://github.com/jemalloc/jemalloc/tags

After you compile, you can preload it in like so:

export LD_PRELOAD=/home/user/jemalloc-3.6.0/lib/libjemalloc.so

You can use this vmod to confirm you are using 3.6.0: https://github.com/rezan/libvmod-jemalloc

Also, if you use this vmod with 5.1, it will provide some metrics which should point to the part of jemalloc that is wasting so much memory. In the past, it's been the thread caches.

idl0r commented 3 years ago

FYI: Intel vs. AMD doesn't matter; both are affected. Currently I'm trying -p fetch_maxchunksize=8M with THP enabled (the default) and no special jemalloc option. I also noticed that MALLOC_CONF=thp:never doesn't throw a warning or error, while changing it to something invalid does, so it might already work with 5.1.0.

idl0r commented 3 years ago

Quick update: -p fetch_maxchunksize=8M alone doesn't seem to help. It's still growing, currently at ~60G (-s malloc,24G). I'll try MALLOC_CONF=thp:never tomorrow morning.

idl0r commented 3 years ago
# grep -aPo 'MALLOC_CONF[^\0]+' /proc/685/environ 
MALLOC_CONF=thp:never

varnish is still running with -s malloc,24G, without the maxchunksize parameter. It's currently at about 70G and seems to grow further. I am not yet sure, however, whether jemalloc really does what we expect. I verified the option like this:

# MALLOC_CONF=thp:never LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 /bin/echo test
test

# MALLOC_CONF=thp:wrong LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 /bin/echo test
<jemalloc>: Invalid conf value: thp:wrong
test

According to https://github.com/jemalloc/jemalloc/commit/e4f090e8df5adf180662c5eeac2af214f9594de4, the option is present in 5.1.0 as well. I also looked into the tarball and it is indeed there, so thp:never should work.
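One way to cross-check whether THP actually backs the cache process's memory is the kernel's per-process accounting. A sketch, using the shell's own PID (`$$`) only to make the command runnable; substitute the varnishd child PID (e.g. the 3355 from the pgrep output above):

```shell
# AnonHugePages shows how much of the process's memory is backed by
# transparent huge pages; with thp:never effective, jemalloc's arenas
# should contribute little or nothing here.
grep AnonHugePages /proc/$$/smaps_rollup
```

This needs Linux 4.14+ for smaps_rollup; on older kernels, summing the AnonHugePages lines of /proc/PID/smaps gives the same answer.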

varnishstat.txt pmap.txt

I prepared one host to use jemalloc 3.6.0, as @rezan proposed, to see how that behaves. Data should be available tomorrow morning (CET).

idl0r commented 3 years ago

Ok, jemalloc 3.6.0 works perfectly fine, as you would expect. It was slightly above the 24G limit at peak, which is probably the overhead, but totally fine; currently it's below the limit at ~22G. The other host is again at ~86G at the moment.

So there seems to be something seriously wrong with jemalloc, I'd say, or with how Varnish uses it. I dunno. What do you think?

nigoroll commented 3 years ago

This partially confirms my expectation; the news is that globally disabling THP did not work for you. Regarding MALLOC_CONF=thp:never, please note that I previously wrote:

IIUC this only works for libjemalloc from 5.2.x

So if it did not work for you with 5.1.x, I'd take that as confirmation.

idl0r commented 3 years ago

At least it was implemented in 5.1.0 already; see the commit I linked. I'll build 5.2.x and try it with and without the MALLOC_CONF.

rezan commented 3 years ago

@idl0r If you can install the jemalloc vmod and capture the stats when you are at ~90GB of resident memory, that would be a very helpful next step.

idl0r commented 3 years ago

Thanks guys, I think the update alone solves the problem. jemalloc 3.6.0 was almost perfect, as there wasn't much overhead; 5.2.1 also solves the problem, though it isn't perfect. Both hosts I prepared are around 30G, with and without MALLOC_CONF=thp:never.

Would it make sense to add a warning or note somewhere about jemalloc >3.6.0 and <5.2.1? And since Varnish has official repos/packages, would it make sense to provide an up-to-date jemalloc package as well?

rezan commented 3 years ago

What was your final result with 5.2.1? How much did you configure in Varnish and how much was actually allocated?

idl0r commented 3 years ago

-s malloc,24G. It ended up somewhere between 30 and 32GB.

rezan commented 3 years ago

Ok, that's about 30% waste: not great, but better than 300%.
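The arithmetic behind that figure, using the numbers reported in the thread (24G configured, 30–32GB resident):

```python
configured = 24          # -s malloc,24G
observed = (30, 32)      # reported resident size range, GB
for rss in observed:
    print(f"{rss} GB resident: {(rss - configured) / configured:.0%} overhead")
# → 30 GB resident: 25% overhead
# → 32 GB resident: 33% overhead
```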

idl0r commented 3 years ago

With 3.6.0 it was even slightly below the 24G limit.

rezan commented 3 years ago

There might be some tunables in 5.2 to get that number lower; you would have to start with that stats dump. The internals are completely different from 3.6.0's, so this is uncharted territory for me.

bsdphk commented 3 years ago

Timing this out.