This sounds pretty much like one would expect Varnish to work in such circumstances.
The entire point of Varnish is to use the available resources to efficiently deliver the content, and as such Varnish sees all RAM and CPU as fair game.
uhm, I'm not sure if I get your point.
This sounds pretty much like one would expect Varnish to work in such circumstances.
What circumstances? That I try to use (and limit) RAM instead of file based storage, because it's fast and plenty available?
https://varnish-cache.org/docs/trunk/users-guide/storage-backends.html#malloc
Be aware that the size limitation only limits the actual storage and that the approximately 1k of memory per object, used for various internal structures, is included in the actual storage as well.
The size parameter specifies the maximum amount of memory varnishd will allocate. The size is assumed to be in bytes, unless followed by one of the following suffixes:
So with 128G available and 32G being set, even with overhead, shouldn't it stay below 128G? What is malloc then used for?
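For context, the start line under discussion amounts to roughly the following sketch (the listen address is a placeholder; the VCL name appears later in this thread):
varnishd -a :6081 -f /etc/varnish/example1.vcl -s malloc,32G
The expectation being questioned here is that the malloc storage, plus per-object overhead and the unbounded Transient storage, still stays far below the 128G of physical RAM.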
From support cases with our clients, I very much suspect you are hitting an issue related to both jemalloc and Transparent Huge Pages. This problem typically results in 4x overhead.
That said, this is still a support question and you are advised to resort to our support channels. The issue template clearly points this out, but you have not even used it.
@nigoroll to be honest, I didn't think that this was a support request, and even now I don't get it. Either I completely misunderstood -s malloc, or this looks like a bug / leak at first glance?
<!--- Things to check before you report a bug
- Is the issue you are seeing clearly an issue in varnish core or
could it be a support question? We do not use github issues for
support questions, please refer to
http://varnish-cache.org/support/index.html when in doubt.
I think this is a varnish core issue
- For panics (varnish crashes), bug reports are most useful if
- you are running recent code
- ideally master
- but at least the latest release of a supported version
- you got
- debuginfo packages installed when running binaries from
packages if these are available from the package source
you are using
- or have compiled with debug information whenever possible
(configure --enable-debugging-symbols)
There is no panic or crash from what I can see
If you have considered these recommendations, please go ahead and
follow this template
-->
<!--- Provide a general summary of the issue in the Title above -->
I did
-s malloc,32G shouldn't reach 128G, even with overhead, even if we'd say malloc *3
<!--- Did you check that there are no similar bug reports or pull requests? -->
<!---
If your panic happens in the child_sigsegv_handler function, look at the
backtrace to determine whether it is similar to another issue. When in
doubt, open a new one and it will be closed as a duplicate if needed.
-->
<!--- If you're describing a bug, tell us what should happen -->
<!--- If you're suggesting a change/improvement, tell us how it should work -->
<!---
If it's a packaging bug (including sysv or systemd services bugs) please
open an issue on varnishcache/pkg-varnish-cache instead.
-->
<!---
If it's a feature request, please start a thread on the misc list instead.
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
-->
It's not a feature request. I did search the issues.
## Current Behavior
<!--- If describing a bug, tell us what happens instead of the expected behavior -->
<!--- If suggesting a change/improvement, explain the difference from current behavior -->
I basically did?
## Possible Solution
<!--- Not obligatory, but suggest a fix/reason for the bug, -->
<!--- or ideas how to implement the addition or change -->
I have no idea
## Steps to Reproduce (for bugs)
<!--- Provide a link to a live example, or an unambiguous set of steps to -->
<!--- reproduce this bug. Include code to reproduce, if relevant -->
1.
2.
3.
4.
I provided the VCL and what I think might be causing it: the requests against the mp4 files. There is no test case to reproduce it (yet).
## Context
<!--- How has this issue affected you? What are you trying to accomplish? -->
<!--- Providing context helps us come up with a solution that is most useful in the real world -->
I did
## Your Environment
<!--- Include as many relevant details about the environment you experienced the bug in -->
* Version used:
* Operating System and version:
* Source of binary packages used (if any)
I also did
I understand your point. And I tell you that
-s malloc,32G shouldn't reach 128G, even with overhead
is not necessarily true.
I will happily look at your report if you can reproduce it with THP disabled.
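For anyone reproducing this: THP can be disabled at runtime without a reboot; a minimal sketch:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
cat /sys/kernel/mm/transparent_hugepage/enabled
The bracketed value in the output is the active one. Restarting varnishd afterwards gives a clean baseline, since already-mapped huge pages are not affected retroactively.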
You seem to be right that it takes 4x the configured limit. We've reduced the limit to 24G and its final size is around 96G. Also, a varnishstat -1:
MGT.uptime 246186 5.57 Management process uptime
MGT.child_start 4 0.00 Child process started
MGT.child_exit 0 0.00 Child process normal exit
MGT.child_stop 0 0.00 Child process unexpected exit
MGT.child_died 3 0.00 Child process died (signal)
MGT.child_dump 0 0.00 Child process core dumped
MGT.child_panic 0 0.00 Child process panic
MAIN.summs 314494 7.12 stat summ operations
MAIN.uptime 44181 1.00 Child process uptime
MAIN.sess_conn 110644 2.50 Sessions accepted
MAIN.sess_fail 0 0.00 Session accept failures
MAIN.sess_fail_econnaborted 0 0.00 Session accept failures: connection aborted
MAIN.sess_fail_eintr 0 0.00 Session accept failures: interrupted system call
MAIN.sess_fail_emfile 0 0.00 Session accept failures: too many open files
MAIN.sess_fail_ebadf 0 0.00 Session accept failures: bad file descriptor
MAIN.sess_fail_enomem 0 0.00 Session accept failures: not enough memory
MAIN.sess_fail_other 0 0.00 Session accept failures: other
MAIN.client_req_400 5 0.00 Client requests received, subject to 400 errors
MAIN.client_req_417 0 0.00 Client requests received, subject to 417 errors
MAIN.client_req 110649 2.50 Good client requests received
MAIN.cache_hit 90221 2.04 Cache hits
MAIN.cache_hit_grace 7112 0.16 Cache grace hits
MAIN.cache_hitpass 0 0.00 Cache hits for pass.
MAIN.cache_hitmiss 9237 0.21 Cache hits for miss.
MAIN.cache_miss 19777 0.45 Cache misses
MAIN.backend_conn 1490 0.03 Backend conn. success
MAIN.backend_unhealthy 0 0.00 Backend conn. not attempted
MAIN.backend_busy 0 0.00 Backend conn. too many
MAIN.backend_fail 0 0.00 Backend conn. failures
MAIN.backend_reuse 25998 0.59 Backend conn. reuses
MAIN.backend_recycle 27452 0.62 Backend conn. recycles
MAIN.backend_retry 1 0.00 Backend conn. retry
MAIN.fetch_head 0 0.00 Fetch no body (HEAD)
MAIN.fetch_length 21880 0.50 Fetch with Length
MAIN.fetch_chunked 198 0.00 Fetch chunked
MAIN.fetch_eof 0 0.00 Fetch EOF
MAIN.fetch_bad 0 0.00 Fetch bad T-E
MAIN.fetch_none 740 0.02 Fetch no body
MAIN.fetch_1xx 0 0.00 Fetch no body (1xx)
MAIN.fetch_204 0 0.00 Fetch no body (204)
MAIN.fetch_304 4668 0.11 Fetch no body (304)
MAIN.fetch_failed 31 0.00 Fetch failed (all causes)
MAIN.fetch_no_thread 0 0.00 Fetch failed (no thread)
MAIN.pools 2 . Number of thread pools
MAIN.threads 200 . Total number of threads
MAIN.threads_limited 0 0.00 Threads hit max
MAIN.threads_created 200 0.00 Threads created
MAIN.threads_destroyed 0 0.00 Threads destroyed
MAIN.threads_failed 0 0.00 Thread creation failed
MAIN.thread_queue_len 0 . Length of session queue
MAIN.busy_sleep 10 0.00 Number of requests sent to sleep on busy objhdr
MAIN.busy_wakeup 10 0.00 Number of requests woken after sleep on busy objhdr
MAIN.busy_killed 0 0.00 Number of requests killed after sleep on busy objhdr
MAIN.sess_queued 0 0.00 Sessions queued for thread
MAIN.sess_dropped 0 0.00 Sessions dropped for thread
MAIN.req_dropped 0 0.00 Requests dropped
MAIN.n_object 13327 . object structs made
MAIN.n_vampireobject 0 . unresurrected objects
MAIN.n_objectcore 13355 . objectcore structs made
MAIN.n_objecthead 7745 . objecthead structs made
MAIN.n_backend 16 . Number of backends
MAIN.n_expired 1 0.00 Number of expired objects
MAIN.n_lru_nuked 1764 0.04 Number of LRU nuked objects
MAIN.n_lru_moved 70352 1.59 Number of LRU moved objects
MAIN.n_lru_limited 29 0.00 Reached nuke_limit
MAIN.losthdr 0 0.00 HTTP header overflows
MAIN.s_sess 110644 2.50 Total sessions seen
MAIN.n_pipe 0 . Number of ongoing pipe sessions
MAIN.pipe_limited 0 0.00 Pipes hit pipe_sess_max
MAIN.s_pipe 0 0.00 Total pipe sessions seen
MAIN.s_pass 641 0.01 Total pass-ed requests seen
MAIN.s_fetch 20418 0.46 Total backend fetches initiated
MAIN.s_synth 0 0.00 Total synthetic responses made
MAIN.s_req_hdrbytes 73789945 1670.17 Request header bytes
MAIN.s_req_bodybytes 132278 2.99 Request body bytes
MAIN.s_resp_hdrbytes 44405642 1005.08 Response header bytes
MAIN.s_resp_bodybytes 283485397345 6416454.98 Response body bytes
MAIN.s_pipe_hdrbytes 0 0.00 Pipe request header bytes
MAIN.s_pipe_in 0 0.00 Piped bytes from client
MAIN.s_pipe_out 0 0.00 Piped bytes to client
MAIN.sess_closed 110644 2.50 Session Closed
MAIN.sess_closed_err 15630 0.35 Session Closed with error
MAIN.sess_readahead 0 0.00 Session Read Ahead
MAIN.sess_herd 0 0.00 Session herd
MAIN.sc_rem_close 1710 0.04 Session OK REM_CLOSE
MAIN.sc_req_close 93134 2.11 Session OK REQ_CLOSE
MAIN.sc_req_http10 2342 0.05 Session Err REQ_HTTP10
MAIN.sc_rx_bad 0 0.00 Session Err RX_BAD
MAIN.sc_rx_body 0 0.00 Session Err RX_BODY
MAIN.sc_rx_junk 5 0.00 Session Err RX_JUNK
MAIN.sc_rx_overflow 0 0.00 Session Err RX_OVERFLOW
MAIN.sc_rx_timeout 0 0.00 Session Err RX_TIMEOUT
MAIN.sc_rx_close_idle 0 0.00 Session Err RX_CLOSE_IDLE
MAIN.sc_tx_pipe 0 0.00 Session OK TX_PIPE
MAIN.sc_tx_error 13283 0.30 Session Err TX_ERROR
MAIN.sc_tx_eof 158 0.00 Session OK TX_EOF
MAIN.sc_resp_close 0 0.00 Session OK RESP_CLOSE
MAIN.sc_overload 0 0.00 Session Err OVERLOAD
MAIN.sc_pipe_overflow 0 0.00 Session Err PIPE_OVERFLOW
MAIN.sc_range_short 0 0.00 Session Err RANGE_SHORT
MAIN.sc_req_http20 0 0.00 Session Err REQ_HTTP20
MAIN.sc_vcl_failure 0 0.00 Session Err VCL_FAILURE
MAIN.client_resp_500 0 0.00 Delivery failed due to insufficient workspace.
MAIN.ws_backend_overflow 0 0.00 workspace_backend overflows
MAIN.ws_client_overflow 0 0.00 workspace_client overflows
MAIN.ws_thread_overflow 0 0.00 workspace_thread overflows
MAIN.ws_session_overflow 0 0.00 workspace_session overflows
MAIN.shm_records 10584463 239.57 SHM records
MAIN.shm_writes 885079 20.03 SHM writes
MAIN.shm_flushes 58 0.00 SHM flushes due to overflow
MAIN.shm_cont 3611 0.08 SHM MTX contention
MAIN.shm_cycles 5 0.00 SHM cycles through buffer
MAIN.backend_req 27487 0.62 Backend requests made
MAIN.n_vcl 8 . Number of loaded VCLs in total
MAIN.n_vcl_avail 8 . Number of VCLs available
MAIN.n_vcl_discard 0 . Number of discarded VCLs
MAIN.vcl_fail 0 0.00 VCL failures
MAIN.bans 1 . Count of bans
MAIN.bans_completed 1 . Number of bans marked 'completed'
MAIN.bans_obj 0 . Number of bans using obj.*
MAIN.bans_req 0 . Number of bans using req.*
MAIN.bans_added 1 0.00 Bans added
MAIN.bans_deleted 0 0.00 Bans deleted
MAIN.bans_tested 0 0.00 Bans tested against objects (lookup)
MAIN.bans_obj_killed 0 0.00 Objects killed by bans (lookup)
MAIN.bans_lurker_tested 0 0.00 Bans tested against objects (lurker)
MAIN.bans_tests_tested 0 0.00 Ban tests tested against objects (lookup)
MAIN.bans_lurker_tests_tested 0 0.00 Ban tests tested against objects (lurker)
MAIN.bans_lurker_obj_killed 0 0.00 Objects killed by bans (lurker)
MAIN.bans_lurker_obj_killed_cutoff 0 0.00 Objects killed by bans for cutoff (lurker)
MAIN.bans_dups 0 0.00 Bans superseded by other bans
MAIN.bans_lurker_contention 0 0.00 Lurker gave way for lookup
MAIN.bans_persisted_bytes 16 . Bytes used by the persisted ban lists
MAIN.bans_persisted_fragmentation 0 . Extra bytes in persisted ban lists due to fragmentation
MAIN.n_purges 0 0.00 Number of purge operations executed
MAIN.n_obj_purged 0 0.00 Number of purged objects
MAIN.exp_mailed 40364 0.91 Number of objects mailed to expiry thread
MAIN.exp_received 40364 0.91 Number of objects received by expiry thread
MAIN.hcb_nolock 109998 2.49 HCB Lookups without lock
MAIN.hcb_lock 9481 0.21 HCB Lookups with lock
MAIN.hcb_insert 9481 0.21 HCB Inserts
MAIN.esi_errors 0 0.00 ESI parse errors (unlock)
MAIN.esi_warnings 0 0.00 ESI parse warnings (unlock)
MAIN.vmods 1 . Loaded VMODs
MAIN.n_gzip 0 0.00 Gzip operations
MAIN.n_gunzip 708 0.02 Gunzip operations
MAIN.n_test_gunzip 10403 0.24 Test gunzip operations
LCK.backend.creat 16 0.00 Created locks
LCK.backend.destroy 0 0.00 Destroyed locks
LCK.backend.locks 54976 1.24 Lock Operations
LCK.backend.dbg_busy 0 0.00 Contended lock operations
LCK.backend.dbg_try_fail 0 0.00 Contended trylock operations
LCK.ban.creat 1 0.00 Created locks
LCK.ban.destroy 0 0.00 Destroyed locks
LCK.ban.locks 42153 0.95 Lock Operations
LCK.ban.dbg_busy 0 0.00 Contended lock operations
LCK.ban.dbg_try_fail 0 0.00 Contended trylock operations
LCK.busyobj.creat 27743 0.63 Created locks
LCK.busyobj.destroy 27706 0.63 Destroyed locks
LCK.busyobj.locks 2428786 54.97 Lock Operations
LCK.busyobj.dbg_busy 0 0.00 Contended lock operations
LCK.busyobj.dbg_try_fail 0 0.00 Contended trylock operations
LCK.cli.creat 1 0.00 Created locks
LCK.cli.destroy 0 0.00 Destroyed locks
LCK.cli.locks 14765 0.33 Lock Operations
LCK.cli.dbg_busy 0 0.00 Contended lock operations
LCK.cli.dbg_try_fail 0 0.00 Contended trylock operations
LCK.exp.creat 1 0.00 Created locks
LCK.exp.destroy 0 0.00 Destroyed locks
LCK.exp.locks 149437 3.38 Lock Operations
LCK.exp.dbg_busy 0 0.00 Contended lock operations
LCK.exp.dbg_try_fail 0 0.00 Contended trylock operations
LCK.hcb.creat 1 0.00 Created locks
LCK.hcb.destroy 0 0.00 Destroyed locks
LCK.hcb.locks 11509 0.26 Lock Operations
LCK.hcb.dbg_busy 0 0.00 Contended lock operations
LCK.hcb.dbg_try_fail 0 0.00 Contended trylock operations
LCK.lru.creat 2 0.00 Created locks
LCK.lru.destroy 0 0.00 Destroyed locks
LCK.lru.locks 112481 2.55 Lock Operations
LCK.lru.dbg_busy 0 0.00 Contended lock operations
LCK.lru.dbg_try_fail 0 0.00 Contended trylock operations
LCK.mempool.creat 5 0.00 Created locks
LCK.mempool.destroy 0 0.00 Destroyed locks
LCK.mempool.locks 717072 16.23 Lock Operations
LCK.mempool.dbg_busy 0 0.00 Contended lock operations
LCK.mempool.dbg_try_fail 0 0.00 Contended trylock operations
LCK.objhdr.creat 9531 0.22 Created locks
LCK.objhdr.destroy 1782 0.04 Destroyed locks
LCK.objhdr.locks 520986 11.79 Lock Operations
LCK.objhdr.dbg_busy 0 0.00 Contended lock operations
LCK.objhdr.dbg_try_fail 0 0.00 Contended trylock operations
LCK.perpool.creat 2 0.00 Created locks
LCK.perpool.destroy 0 0.00 Destroyed locks
LCK.perpool.locks 472264 10.69 Lock Operations
LCK.perpool.dbg_busy 0 0.00 Contended lock operations
LCK.perpool.dbg_try_fail 0 0.00 Contended trylock operations
LCK.pipestat.creat 1 0.00 Created locks
LCK.pipestat.destroy 0 0.00 Destroyed locks
LCK.pipestat.locks 0 0.00 Lock Operations
LCK.pipestat.dbg_busy 0 0.00 Contended lock operations
LCK.pipestat.dbg_try_fail 0 0.00 Contended trylock operations
LCK.probe.creat 1 0.00 Created locks
LCK.probe.destroy 0 0.00 Destroyed locks
LCK.probe.locks 76501 1.73 Lock Operations
LCK.probe.dbg_busy 0 0.00 Contended lock operations
LCK.probe.dbg_try_fail 0 0.00 Contended trylock operations
LCK.sess.creat 110666 2.50 Created locks
LCK.sess.destroy 110644 2.50 Destroyed locks
LCK.sess.locks 165837 3.75 Lock Operations
LCK.sess.dbg_busy 0 0.00 Contended lock operations
LCK.sess.dbg_try_fail 0 0.00 Contended trylock operations
LCK.tcp_pool.creat 3 0.00 Created locks
LCK.tcp_pool.destroy 0 0.00 Destroyed locks
LCK.tcp_pool.locks 108459 2.45 Lock Operations
LCK.tcp_pool.dbg_busy 0 0.00 Contended lock operations
LCK.tcp_pool.dbg_try_fail 0 0.00 Contended trylock operations
LCK.vbe.creat 1 0.00 Created locks
LCK.vbe.destroy 0 0.00 Destroyed locks
LCK.vbe.locks 33915 0.77 Lock Operations
LCK.vbe.dbg_busy 0 0.00 Contended lock operations
LCK.vbe.dbg_try_fail 0 0.00 Contended trylock operations
LCK.vcapace.creat 1 0.00 Created locks
LCK.vcapace.destroy 0 0.00 Destroyed locks
LCK.vcapace.locks 0 0.00 Lock Operations
LCK.vcapace.dbg_busy 0 0.00 Contended lock operations
LCK.vcapace.dbg_try_fail 0 0.00 Contended trylock operations
LCK.vcl.creat 1 0.00 Created locks
LCK.vcl.destroy 0 0.00 Destroyed locks
LCK.vcl.locks 56893 1.29 Lock Operations
LCK.vcl.dbg_busy 0 0.00 Contended lock operations
LCK.vcl.dbg_try_fail 0 0.00 Contended trylock operations
LCK.vxid.creat 1 0.00 Created locks
LCK.vxid.destroy 0 0.00 Destroyed locks
LCK.vxid.locks 49 0.00 Lock Operations
LCK.vxid.dbg_busy 0 0.00 Contended lock operations
LCK.vxid.dbg_try_fail 0 0.00 Contended trylock operations
LCK.waiter.creat 2 0.00 Created locks
LCK.waiter.destroy 0 0.00 Destroyed locks
LCK.waiter.locks 100010 2.26 Lock Operations
LCK.waiter.dbg_busy 0 0.00 Contended lock operations
LCK.waiter.dbg_try_fail 0 0.00 Contended trylock operations
LCK.wq.creat 1 0.00 Created locks
LCK.wq.destroy 0 0.00 Destroyed locks
LCK.wq.locks 63485 1.44 Lock Operations
LCK.wq.dbg_busy 0 0.00 Contended lock operations
LCK.wq.dbg_try_fail 0 0.00 Contended trylock operations
LCK.wstat.creat 1 0.00 Created locks
LCK.wstat.destroy 0 0.00 Destroyed locks
LCK.wstat.locks 192661 4.36 Lock Operations
LCK.wstat.dbg_busy 0 0.00 Contended lock operations
LCK.wstat.dbg_try_fail 0 0.00 Contended trylock operations
MEMPOOL.busyobj.live 0 . In use
MEMPOOL.busyobj.pool 10 . In Pool
MEMPOOL.busyobj.sz_wanted 65536 . Size requested
MEMPOOL.busyobj.sz_actual 65504 . Size allocated
MEMPOOL.busyobj.allocs 27487 0.62 Allocations
MEMPOOL.busyobj.frees 27487 0.62 Frees
MEMPOOL.busyobj.recycle 27487 0.62 Recycled from pool
MEMPOOL.busyobj.timeout 2917 0.07 Timed out from pool
MEMPOOL.busyobj.toosmall 0 0.00 Too small to recycle
MEMPOOL.busyobj.surplus 0 0.00 Too many for pool
MEMPOOL.busyobj.randry 0 0.00 Pool ran dry
MEMPOOL.req0.live 14 . In use
MEMPOOL.req0.pool 10 . In Pool
MEMPOOL.req0.sz_wanted 65536 . Size requested
MEMPOOL.req0.sz_actual 65504 . Size allocated
MEMPOOL.req0.allocs 55326 1.25 Allocations
MEMPOOL.req0.frees 55312 1.25 Frees
MEMPOOL.req0.recycle 55326 1.25 Recycled from pool
MEMPOOL.req0.timeout 5710 0.13 Timed out from pool
MEMPOOL.req0.toosmall 0 0.00 Too small to recycle
MEMPOOL.req0.surplus 0 0.00 Too many for pool
MEMPOOL.req0.randry 0 0.00 Pool ran dry
MEMPOOL.sess0.live 14 . In use
MEMPOOL.sess0.pool 10 . In Pool
MEMPOOL.sess0.sz_wanted 768 . Size requested
MEMPOOL.sess0.sz_actual 736 . Size allocated
MEMPOOL.sess0.allocs 55326 1.25 Allocations
MEMPOOL.sess0.frees 55312 1.25 Frees
MEMPOOL.sess0.recycle 55326 1.25 Recycled from pool
MEMPOOL.sess0.timeout 5824 0.13 Timed out from pool
MEMPOOL.sess0.toosmall 0 0.00 Too small to recycle
MEMPOOL.sess0.surplus 0 0.00 Too many for pool
MEMPOOL.sess0.randry 0 0.00 Pool ran dry
LCK.sma.creat 2 0.00 Created locks
LCK.sma.destroy 0 0.00 Destroyed locks
LCK.sma.locks 175140 3.96 Lock Operations
LCK.sma.dbg_busy 0 0.00 Contended lock operations
LCK.sma.dbg_try_fail 0 0.00 Contended trylock operations
SMA.s0.c_req 97594 2.21 Allocator requests
SMA.s0.c_fail 69535 1.57 Allocator failures
SMA.s0.c_bytes 110826315545 2508461.00 Bytes allocated
SMA.s0.c_freed 88877383260 2011665.27 Bytes freed
SMA.s0.g_alloc 5577 . Allocations outstanding
SMA.s0.g_bytes 21948932285 . Bytes outstanding
SMA.s0.g_space 3820871491 . Bytes available
SMA.Transient.c_req 33041 0.75 Allocator requests
SMA.Transient.c_fail 0 0.00 Allocator failures
SMA.Transient.c_bytes 600259959 13586.38 Bytes allocated
SMA.Transient.c_freed 595088927 13469.34 Bytes freed
SMA.Transient.g_alloc 11018 . Allocations outstanding
SMA.Transient.g_bytes 5171032 . Bytes outstanding
SMA.Transient.g_space 0 . Bytes available
MEMPOOL.req1.live 10 . In use
MEMPOOL.req1.pool 10 . In Pool
MEMPOOL.req1.sz_wanted 65536 . Size requested
MEMPOOL.req1.sz_actual 65504 . Size allocated
MEMPOOL.req1.allocs 55342 1.25 Allocations
MEMPOOL.req1.frees 55332 1.25 Frees
MEMPOOL.req1.recycle 55342 1.25 Recycled from pool
MEMPOOL.req1.timeout 5733 0.13 Timed out from pool
MEMPOOL.req1.toosmall 0 0.00 Too small to recycle
MEMPOOL.req1.surplus 0 0.00 Too many for pool
MEMPOOL.req1.randry 0 0.00 Pool ran dry
MEMPOOL.sess1.live 10 . In use
MEMPOOL.sess1.pool 10 . In Pool
MEMPOOL.sess1.sz_wanted 768 . Size requested
MEMPOOL.sess1.sz_actual 736 . Size allocated
MEMPOOL.sess1.allocs 55342 1.25 Allocations
MEMPOOL.sess1.frees 55332 1.25 Frees
MEMPOOL.sess1.recycle 55342 1.25 Recycled from pool
MEMPOOL.sess1.timeout 5882 0.13 Timed out from pool
MEMPOOL.sess1.toosmall 0 0.00 Too small to recycle
MEMPOOL.sess1.surplus 0 0.00 Too many for pool
MEMPOOL.sess1.randry 0 0.00 Pool ran dry
VBE.boot.backend_example1_www.happy 3 . Happy health probes
VBE.boot.backend_example1_www.bereq_hdrbytes 0 0.00 Request header bytes
VBE.boot.backend_example1_www.bereq_bodybytes 0 0.00 Request body bytes
VBE.boot.backend_example1_www.beresp_hdrbytes 0 0.00 Response header bytes
VBE.boot.backend_example1_www.beresp_bodybytes 0 0.00 Response body bytes
VBE.boot.backend_example1_www.pipe_hdrbytes 0 0.00 Pipe request header bytes
VBE.boot.backend_example1_www.pipe_out 0 0.00 Piped bytes to backend
VBE.boot.backend_example1_www.pipe_in 0 0.00 Piped bytes from backend
VBE.boot.backend_example1_www.conn 0 . Concurrent connections to backend
VBE.boot.backend_example1_www.req 0 0.00 Backend requests sent
VBE.boot.backend_example1_www.unhealthy 0 0.00 Fetches not attempted due to backend being unhealthy
VBE.boot.backend_example1_www.busy 0 0.00 Fetches not attempted due to backend being busy
VBE.boot.backend_example1_www.fail 0 0.00 Connections failed
VBE.boot.backend_example1_www.fail_eacces 0 0.00 Connections failed with EACCES or EPERM
VBE.boot.backend_example1_www.fail_eaddrnotavail 0 0.00 Connections failed with EADDRNOTAVAIL
VBE.boot.backend_example1_www.fail_econnrefused 0 0.00 Connections failed with ECONNREFUSED
VBE.boot.backend_example1_www.fail_enetunreach 0 0.00 Connections failed with ENETUNREACH
VBE.boot.backend_example1_www.fail_etimedout 0 0.00 Connections failed ETIMEDOUT
VBE.boot.backend_example1_www.fail_other 0 0.00 Connections failed for other reason
VBE.boot.backend_example1_www.helddown 0 0.00 Connection opens not attempted
VBE.boot.backend_example1_staging_www.happy 3 . Happy health probes
VBE.boot.backend_example1_staging_www.bereq_hdrbytes 0 0.00 Request header bytes
VBE.boot.backend_example1_staging_www.bereq_bodybytes 0 0.00 Request body bytes
VBE.boot.backend_example1_staging_www.beresp_hdrbytes 0 0.00 Response header bytes
VBE.boot.backend_example1_staging_www.beresp_bodybytes 0 0.00 Response body bytes
VBE.boot.backend_example1_staging_www.pipe_hdrbytes 0 0.00 Pipe request header bytes
VBE.boot.backend_example1_staging_www.pipe_out 0 0.00 Piped bytes to backend
VBE.boot.backend_example1_staging_www.pipe_in 0 0.00 Piped bytes from backend
VBE.boot.backend_example1_staging_www.conn 0 . Concurrent connections to backend
VBE.boot.backend_example1_staging_www.req 0 0.00 Backend requests sent
VBE.boot.backend_example1_staging_www.unhealthy 0 0.00 Fetches not attempted due to backend being unhealthy
VBE.boot.backend_example1_staging_www.busy 0 0.00 Fetches not attempted due to backend being busy
VBE.boot.backend_example1_staging_www.fail 0 0.00 Connections failed
VBE.boot.backend_example1_staging_www.fail_eacces 0 0.00 Connections failed with EACCES or EPERM
VBE.boot.backend_example1_staging_www.fail_eaddrnotavail 0 0.00 Connections failed with EADDRNOTAVAIL
VBE.boot.backend_example1_staging_www.fail_econnrefused 0 0.00 Connections failed with ECONNREFUSED
VBE.boot.backend_example1_staging_www.fail_enetunreach 0 0.00 Connections failed with ENETUNREACH
VBE.boot.backend_example1_staging_www.fail_etimedout 0 0.00 Connections failed ETIMEDOUT
VBE.boot.backend_example1_staging_www.fail_other 0 0.00 Connections failed for other reason
VBE.boot.backend_example1_staging_www.helddown 0 0.00 Connection opens not attempted
VBE.reload_20210129_145927_21294.backend_example1_www.happy 3 . Happy health probes
VBE.reload_20210129_145927_21294.backend_example1_www.bereq_hdrbytes 0 0.00 Request header bytes
VBE.reload_20210129_145927_21294.backend_example1_www.bereq_bodybytes 0 0.00 Request body bytes
VBE.reload_20210129_145927_21294.backend_example1_www.beresp_hdrbytes 0 0.00 Response header bytes
VBE.reload_20210129_145927_21294.backend_example1_www.beresp_bodybytes 0 0.00 Response body bytes
VBE.reload_20210129_145927_21294.backend_example1_www.pipe_hdrbytes 0 0.00 Pipe request header bytes
VBE.reload_20210129_145927_21294.backend_example1_www.pipe_out 0 0.00 Piped bytes to backend
VBE.reload_20210129_145927_21294.backend_example1_www.pipe_in 0 0.00 Piped bytes from backend
VBE.reload_20210129_145927_21294.backend_example1_www.conn 0 . Concurrent connections to backend
VBE.reload_20210129_145927_21294.backend_example1_www.req 0 0.00 Backend requests sent
VBE.reload_20210129_145927_21294.backend_example1_www.unhealthy 0 0.00 Fetches not attempted due to backend being unhealthy
VBE.reload_20210129_145927_21294.backend_example1_www.busy 0 0.00 Fetches not attempted due to backend being busy
VBE.reload_20210129_145927_21294.backend_example1_www.fail 0 0.00 Connections failed
VBE.reload_20210129_145927_21294.backend_example1_www.fail_eacces 0 0.00 Connections failed with EACCES or EPERM
VBE.reload_20210129_145927_21294.backend_example1_www.fail_eaddrnotavail 0 0.00 Connections failed with EADDRNOTAVAIL
VBE.reload_20210129_145927_21294.backend_example1_www.fail_econnrefused 0 0.00 Connections failed with ECONNREFUSED
VBE.reload_20210129_145927_21294.backend_example1_www.fail_enetunreach 0 0.00 Connections failed with ENETUNREACH
VBE.reload_20210129_145927_21294.backend_example1_www.fail_etimedout 0 0.00 Connections failed ETIMEDOUT
VBE.reload_20210129_145927_21294.backend_example1_www.fail_other 0 0.00 Connections failed for other reason
VBE.reload_20210129_145927_21294.backend_example1_www.helddown 0 0.00 Connection opens not attempted
VBE.reload_20210129_145927_21294.backend_example1_staging_www.happy 3 . Happy health probes
VBE.reload_20210129_145927_21294.backend_example1_staging_www.bereq_hdrbytes 0 0.00 Request header bytes
VBE.reload_20210129_145927_21294.backend_example1_staging_www.bereq_bodybytes 0 0.00 Request body bytes
VBE.reload_20210129_145927_21294.backend_example1_staging_www.beresp_hdrbytes 0 0.00 Response header bytes
VBE.reload_20210129_145927_21294.backend_example1_staging_www.beresp_bodybytes 0 0.00 Response body bytes
VBE.reload_20210129_145927_21294.backend_example1_staging_www.pipe_hdrbytes 0 0.00 Pipe request header bytes
VBE.reload_20210129_145927_21294.backend_example1_staging_www.pipe_out 0 0.00 Piped bytes to backend
VBE.reload_20210129_145927_21294.backend_example1_staging_www.pipe_in 0 0.00 Piped bytes from backend
VBE.reload_20210129_145927_21294.backend_example1_staging_www.conn 0 . Concurrent connections to backend
VBE.reload_20210129_145927_21294.backend_example1_staging_www.req 0 0.00 Backend requests sent
VBE.reload_20210129_145927_21294.backend_example1_staging_www.unhealthy 0 0.00 Fetches not attempted due to backend being unhealthy
VBE.reload_20210129_145927_21294.backend_example1_staging_www.busy 0 0.00 Fetches not attempted due to backend being busy
VBE.reload_20210129_145927_21294.backend_example1_staging_www.fail 0 0.00 Connections failed
VBE.reload_20210129_145927_21294.backend_example1_staging_www.fail_eacces 0 0.00 Connections failed with EACCES or EPERM
VBE.reload_20210129_145927_21294.backend_example1_staging_www.fail_eaddrnotavail 0 0.00 Connections failed with EADDRNOTAVAIL
VBE.reload_20210129_145927_21294.backend_example1_staging_www.fail_econnrefused 0 0.00 Connections failed with ECONNREFUSED
VBE.reload_20210129_145927_21294.backend_example1_staging_www.fail_enetunreach 0 0.00 Connections failed with ENETUNREACH
VBE.reload_20210129_145927_21294.backend_example1_staging_www.fail_etimedout 0 0.00 Connections failed ETIMEDOUT
VBE.reload_20210129_145927_21294.backend_example1_staging_www.fail_other 0 0.00 Connections failed for other reason
VBE.reload_20210129_145927_21294.backend_example1_staging_www.helddown 0 0.00 Connection opens not attempted
VBE.reload_20210201_103812_901.backend_example1_www.happy 18446744073709551615 . Happy health probes
VBE.reload_20210201_103812_901.backend_example1_www.bereq_hdrbytes 1009663 22.85 Request header bytes
VBE.reload_20210201_103812_901.backend_example1_www.bereq_bodybytes 386 0.01 Request body bytes
VBE.reload_20210201_103812_901.backend_example1_www.beresp_hdrbytes 402996 9.12 Response header bytes
VBE.reload_20210201_103812_901.backend_example1_www.beresp_bodybytes 59710414 1351.50 Response body bytes
VBE.reload_20210201_103812_901.backend_example1_www.pipe_hdrbytes 0 0.00 Pipe request header bytes
VBE.reload_20210201_103812_901.backend_example1_www.pipe_out 0 0.00 Piped bytes to backend
VBE.reload_20210201_103812_901.backend_example1_www.pipe_in 0 0.00 Piped bytes from backend
VBE.reload_20210201_103812_901.backend_example1_www.conn 0 . Concurrent connections to backend
VBE.reload_20210201_103812_901.backend_example1_www.req 1321 0.03 Backend requests sent
VBE.reload_20210201_103812_901.backend_example1_www.unhealthy 0 0.00 Fetches not attempted due to backend being unhealthy
VBE.reload_20210201_103812_901.backend_example1_www.busy 0 0.00 Fetches not attempted due to backend being busy
VBE.reload_20210201_103812_901.backend_example1_www.fail 0 0.00 Connections failed
VBE.reload_20210201_103812_901.backend_example1_www.fail_eacces 0 0.00 Connections failed with EACCES or EPERM
VBE.reload_20210201_103812_901.backend_example1_www.fail_eaddrnotavail 0 0.00 Connections failed with EADDRNOTAVAIL
VBE.reload_20210201_103812_901.backend_example1_www.fail_econnrefused 0 0.00 Connections failed with ECONNREFUSED
VBE.reload_20210201_103812_901.backend_example1_www.fail_enetunreach 0 0.00 Connections failed with ENETUNREACH
VBE.reload_20210201_103812_901.backend_example1_www.fail_etimedout 0 0.00 Connections failed ETIMEDOUT
VBE.reload_20210201_103812_901.backend_example1_www.fail_other 0 0.00 Connections failed for other reason
VBE.reload_20210201_103812_901.backend_example1_www.helddown 0 0.00 Connection opens not attempted
VBE.reload_20210201_103812_901.backend_example1_staging_www.happy 18446744073709551615 . Happy health probes
VBE.reload_20210201_103812_901.backend_example1_staging_www.bereq_hdrbytes 0 0.00 Request header bytes
VBE.reload_20210201_103812_901.backend_example1_staging_www.bereq_bodybytes 0 0.00 Request body bytes
VBE.reload_20210201_103812_901.backend_example1_staging_www.beresp_hdrbytes 0 0.00 Response header bytes
VBE.reload_20210201_103812_901.backend_example1_staging_www.beresp_bodybytes 0 0.00 Response body bytes
VBE.reload_20210201_103812_901.backend_example1_staging_www.pipe_hdrbytes 0 0.00 Pipe request header bytes
VBE.reload_20210201_103812_901.backend_example1_staging_www.pipe_out 0 0.00 Piped bytes to backend
VBE.reload_20210201_103812_901.backend_example1_staging_www.pipe_in 0 0.00 Piped bytes from backend
VBE.reload_20210201_103812_901.backend_example1_staging_www.conn 0 . Concurrent connections to backend
VBE.reload_20210201_103812_901.backend_example1_staging_www.req 0 0.00 Backend requests sent
VBE.reload_20210201_103812_901.backend_example1_staging_www.unhealthy 0 0.00 Fetches not attempted due to backend being unhealthy
VBE.reload_20210201_103812_901.backend_example1_staging_www.busy 0 0.00 Fetches not attempted due to backend being busy
VBE.reload_20210201_103812_901.backend_example1_staging_www.fail 0 0.00 Connections failed
VBE.reload_20210201_103812_901.backend_example1_staging_www.fail_eacces 0 0.00 Connections failed with EACCES or EPERM
VBE.reload_20210201_103812_901.backend_example1_staging_www.fail_eaddrnotavail 0 0.00 Connections failed with EADDRNOTAVAIL
VBE.reload_20210201_103812_901.backend_example1_staging_www.fail_econnrefused 0 0.00 Connections failed with ECONNREFUSED
VBE.reload_20210201_103812_901.backend_example1_staging_www.fail_enetunreach 0 0.00 Connections failed with ENETUNREACH
VBE.reload_20210201_103812_901.backend_example1_staging_www.fail_etimedout 0 0.00 Connections failed ETIMEDOUT
VBE.reload_20210201_103812_901.backend_example1_staging_www.fail_other 0 0.00 Connections failed for other reason
VBE.reload_20210201_103812_901.backend_example1_staging_www.helddown 0 0.00 Connection opens not attempted
I will now try to disable THP and see how it behaves then. In case it's the combination of THP and jemalloc (which seems to be the default allocator for the official packages), wouldn't it make sense to add a warning / hint to the https://varnish-cache.org/docs/trunk/users-guide/storage-backends.html#malloc documentation? Would this be expected behavior, or a bug in jemalloc or THP? Also interesting: does it matter whether it's an AMD or Intel CPU/machine?
Reopen: should we doc this?
Update: I've disabled THP by using "transparent_hugepage=never" as a kernel boot parameter.
# cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
So "never" is selected.
Varnish is configured with "-s malloc,24G". Varnish currently uses about 100G.
varnishstat:
MGT.uptime 76427 1.00 Management process uptime
MGT.child_start 1 0.00 Child process started
MGT.child_exit 0 0.00 Child process normal exit
MGT.child_stop 0 0.00 Child process unexpected exit
MGT.child_died 0 0.00 Child process died (signal)
MGT.child_dump 0 0.00 Child process core dumped
MGT.child_panic 0 0.00 Child process panic
MAIN.summs 742672 9.72 stat summ operations
MAIN.uptime 76428 1.00 Child process uptime
MAIN.sess_conn 279475 3.66 Sessions accepted
MAIN.sess_fail 0 0.00 Session accept failures
MAIN.sess_fail_econnaborted 0 0.00 Session accept failures: connection aborted
MAIN.sess_fail_eintr 0 0.00 Session accept failures: interrupted system call
MAIN.sess_fail_emfile 0 0.00 Session accept failures: too many open files
MAIN.sess_fail_ebadf 0 0.00 Session accept failures: bad file descriptor
MAIN.sess_fail_enomem 0 0.00 Session accept failures: not enough memory
MAIN.sess_fail_other 0 0.00 Session accept failures: other
MAIN.client_req_400 0 0.00 Client requests received, subject to 400 errors
MAIN.client_req_417 0 0.00 Client requests received, subject to 417 errors
MAIN.client_req 279521 3.66 Good client requests received
MAIN.cache_hit 235759 3.08 Cache hits
MAIN.cache_hit_grace 14127 0.18 Cache grace hits
MAIN.cache_hitpass 0 0.00 Cache hits for pass.
MAIN.cache_hitmiss 22505 0.29 Cache hits for miss.
MAIN.cache_miss 43365 0.57 Cache misses
MAIN.backend_conn 2558 0.03 Backend conn. success
MAIN.backend_unhealthy 0 0.00 Backend conn. not attempted
MAIN.backend_busy 0 0.00 Backend conn. too many
MAIN.backend_fail 0 0.00 Backend conn. failures
MAIN.backend_reuse 55196 0.72 Backend conn. reuses
MAIN.backend_recycle 57684 0.75 Backend conn. recycles
MAIN.backend_retry 0 0.00 Backend conn. retry
MAIN.fetch_head 0 0.00 Fetch no body (HEAD)
MAIN.fetch_length 46077 0.60 Fetch with Length
MAIN.fetch_chunked 381 0.00 Fetch chunked
MAIN.fetch_eof 0 0.00 Fetch EOF
MAIN.fetch_bad 0 0.00 Fetch bad T-E
MAIN.fetch_none 1749 0.02 Fetch no body
MAIN.fetch_1xx 0 0.00 Fetch no body (1xx)
MAIN.fetch_204 0 0.00 Fetch no body (204)
MAIN.fetch_304 9547 0.12 Fetch no body (304)
MAIN.fetch_failed 89 0.00 Fetch failed (all causes)
MAIN.fetch_no_thread 0 0.00 Fetch failed (no thread)
MAIN.pools 2 . Number of thread pools
MAIN.threads 200 . Total number of threads
MAIN.threads_limited 0 0.00 Threads hit max
MAIN.threads_created 200 0.00 Threads created
MAIN.threads_destroyed 0 0.00 Threads destroyed
MAIN.threads_failed 0 0.00 Thread creation failed
MAIN.thread_queue_len 0 . Length of session queue
MAIN.busy_sleep 46 0.00 Number of requests sent to sleep on busy objhdr
MAIN.busy_wakeup 46 0.00 Number of requests woken after sleep on busy objhdr
MAIN.busy_killed 0 0.00 Number of requests killed after sleep on busy objhdr
MAIN.sess_queued 0 0.00 Sessions queued for thread
MAIN.sess_dropped 0 0.00 Sessions dropped for thread
MAIN.req_dropped 0 0.00 Requests dropped
MAIN.n_object 26969 . object structs made
MAIN.n_vampireobject 0 . unresurrected objects
MAIN.n_objectcore 27002 . objectcore structs made
MAIN.n_objecthead 12858 . objecthead structs made
MAIN.n_backend 26 . Number of backends
MAIN.n_expired 0 0.00 Number of expired objects
MAIN.n_lru_nuked 5437 0.07 Number of LRU nuked objects
MAIN.n_lru_moved 165399 2.16 Number of LRU moved objects
MAIN.n_lru_limited 89 0.00 Reached nuke_limit
MAIN.losthdr 0 0.00 HTTP header overflows
MAIN.s_sess 279475 3.66 Total sessions seen
MAIN.n_pipe 0 . Number of ongoing pipe sessions
MAIN.pipe_limited 0 0.00 Pipes hit pipe_sess_max
MAIN.s_pipe 0 0.00 Total pipe sessions seen
MAIN.s_pass 351 0.00 Total pass-ed requests seen
MAIN.s_fetch 43716 0.57 Total backend fetches initiated
MAIN.s_synth 0 0.00 Total synthetic responses made
MAIN.s_req_hdrbytes 189075879 2473.91 Request header bytes
MAIN.s_req_bodybytes 36900569 482.81 Request body bytes
MAIN.s_resp_hdrbytes 112573679 1472.94 Response header bytes
MAIN.s_resp_bodybytes 998539651124 13065102.46 Response body bytes
MAIN.s_pipe_hdrbytes 0 0.00 Pipe request header bytes
MAIN.s_pipe_in 0 0.00 Piped bytes from client
MAIN.s_pipe_out 0 0.00 Piped bytes to client
MAIN.sess_closed 279475 3.66 Session Closed
MAIN.sess_closed_err 49037 0.64 Session Closed with error
MAIN.sess_readahead 0 0.00 Session Read Ahead
MAIN.sess_herd 0 0.00 Session herd
MAIN.sc_rem_close 6173 0.08 Session OK REM_CLOSE
MAIN.sc_req_close 223989 2.93 Session OK REQ_CLOSE
MAIN.sc_req_http10 4072 0.05 Session Err REQ_HTTP10
MAIN.sc_rx_bad 0 0.00 Session Err RX_BAD
MAIN.sc_rx_body 0 0.00 Session Err RX_BODY
MAIN.sc_rx_junk 0 0.00 Session Err RX_JUNK
MAIN.sc_rx_overflow 0 0.00 Session Err RX_OVERFLOW
MAIN.sc_rx_timeout 0 0.00 Session Err RX_TIMEOUT
MAIN.sc_rx_close_idle 0 0.00 Session Err RX_CLOSE_IDLE
MAIN.sc_tx_pipe 0 0.00 Session OK TX_PIPE
MAIN.sc_tx_error 44963 0.59 Session Err TX_ERROR
MAIN.sc_tx_eof 253 0.00 Session OK TX_EOF
MAIN.sc_resp_close 0 0.00 Session OK RESP_CLOSE
MAIN.sc_overload 0 0.00 Session Err OVERLOAD
MAIN.sc_pipe_overflow 0 0.00 Session Err PIPE_OVERFLOW
MAIN.sc_range_short 2 0.00 Session Err RANGE_SHORT
MAIN.sc_req_http20 0 0.00 Session Err REQ_HTTP20
MAIN.sc_vcl_failure 0 0.00 Session Err VCL_FAILURE
MAIN.client_resp_500 0 0.00 Delivery failed due to insufficient workspace.
MAIN.ws_backend_overflow 0 0.00 workspace_backend overflows
MAIN.ws_client_overflow 0 0.00 workspace_client overflows
MAIN.ws_thread_overflow 0 0.00 workspace_thread overflows
MAIN.ws_session_overflow 0 0.00 workspace_session overflows
MAIN.shm_records 26237719 343.30 SHM records
MAIN.shm_writes 2173392 28.44 SHM writes
MAIN.shm_flushes 155 0.00 SHM flushes due to overflow
MAIN.shm_cont 6709 0.09 SHM MTX contention
MAIN.shm_cycles 12 0.00 SHM cycles through buffer
MAIN.backend_req 57754 0.76 Backend requests made
MAIN.n_vcl 13 . Number of loaded VCLs in total
MAIN.n_vcl_avail 13 . Number of VCLs available
MAIN.n_vcl_discard 0 . Number of discarded VCLs
MAIN.vcl_fail 0 0.00 VCL failures
MAIN.bans 1 . Count of bans
MAIN.bans_completed 1 . Number of bans marked 'completed'
MAIN.bans_obj 0 . Number of bans using obj.*
MAIN.bans_req 0 . Number of bans using req.*
MAIN.bans_added 1 0.00 Bans added
MAIN.bans_deleted 0 0.00 Bans deleted
MAIN.bans_tested 0 0.00 Bans tested against objects (lookup)
MAIN.bans_obj_killed 0 0.00 Objects killed by bans (lookup)
MAIN.bans_lurker_tested 0 0.00 Bans tested against objects (lurker)
MAIN.bans_tests_tested 0 0.00 Ban tests tested against objects (lookup)
MAIN.bans_lurker_tests_tested 0 0.00 Ban tests tested against objects (lurker)
MAIN.bans_lurker_obj_killed 0 0.00 Objects killed by bans (lurker)
MAIN.bans_lurker_obj_killed_cutoff 0 0.00 Objects killed by bans for cutoff (lurker)
MAIN.bans_dups 0 0.00 Bans superseded by other bans
MAIN.bans_lurker_contention 0 0.00 Lurker gave way for lookup
MAIN.bans_persisted_bytes 16 . Bytes used by the persisted ban lists
MAIN.bans_persisted_fragmentation 0 . Extra bytes in persisted ban lists due to fragmentation
MAIN.n_purges 0 0.00 Number of purge operations executed
MAIN.n_obj_purged 0 0.00 Number of purged objects
MAIN.exp_mailed 87837 1.15 Number of objects mailed to expiry thread
MAIN.exp_received 87837 1.15 Number of objects received by expiry thread
MAIN.hcb_nolock 279124 3.65 HCB Lookups without lock
MAIN.hcb_lock 18157 0.24 HCB Lookups with lock
MAIN.hcb_insert 18157 0.24 HCB Inserts
MAIN.esi_errors 0 0.00 ESI parse errors (unlock)
MAIN.esi_warnings 0 0.00 ESI parse warnings (unlock)
MAIN.vmods 1 . Loaded VMODs
MAIN.n_gzip 0 0.00 Gzip operations
MAIN.n_gunzip 1280 0.02 Gunzip operations
MAIN.n_test_gunzip 21825 0.29 Test gunzip operations
LCK.backend.creat 26 0.00 Created locks
LCK.backend.destroy 0 0.00 Destroyed locks
LCK.backend.locks 115508 1.51 Lock Operations
LCK.backend.dbg_busy 0 0.00 Contended lock operations
LCK.backend.dbg_try_fail 0 0.00 Contended trylock operations
LCK.ban.creat 1 0.00 Created locks
LCK.ban.destroy 0 0.00 Destroyed locks
LCK.ban.locks 90925 1.19 Lock Operations
LCK.ban.dbg_busy 0 0.00 Contended lock operations
LCK.ban.dbg_try_fail 0 0.00 Contended trylock operations
LCK.busyobj.creat 58099 0.76 Created locks
LCK.busyobj.destroy 58067 0.76 Destroyed locks
LCK.busyobj.locks 6509092 85.17 Lock Operations
LCK.busyobj.dbg_busy 0 0.00 Contended lock operations
LCK.busyobj.dbg_try_fail 0 0.00 Contended trylock operations
LCK.cli.creat 1 0.00 Created locks
LCK.cli.destroy 0 0.00 Destroyed locks
LCK.cli.locks 25546 0.33 Lock Operations
LCK.cli.dbg_busy 0 0.00 Contended lock operations
LCK.cli.dbg_try_fail 0 0.00 Contended trylock operations
LCK.exp.creat 1 0.00 Created locks
LCK.exp.destroy 0 0.00 Destroyed locks
LCK.exp.locks 323795 4.24 Lock Operations
LCK.exp.dbg_busy 0 0.00 Contended lock operations
LCK.exp.dbg_try_fail 0 0.00 Contended trylock operations
LCK.hcb.creat 1 0.00 Created locks
LCK.hcb.destroy 0 0.00 Destroyed locks
LCK.hcb.locks 24067 0.31 Lock Operations
LCK.hcb.dbg_busy 0 0.00 Contended lock operations
LCK.hcb.dbg_try_fail 0 0.00 Contended trylock operations
LCK.lru.creat 2 0.00 Created locks
LCK.lru.destroy 0 0.00 Destroyed locks
LCK.lru.locks 258673 3.38 Lock Operations
LCK.lru.dbg_busy 0 0.00 Contended lock operations
LCK.lru.dbg_try_fail 0 0.00 Contended trylock operations
LCK.mempool.creat 5 0.00 Created locks
LCK.mempool.destroy 0 0.00 Destroyed locks
LCK.mempool.locks 1636269 21.41 Lock Operations
LCK.mempool.dbg_busy 0 0.00 Contended lock operations
LCK.mempool.dbg_try_fail 0 0.00 Contended trylock operations
LCK.objhdr.creat 18213 0.24 Created locks
LCK.objhdr.destroy 5355 0.07 Destroyed locks
LCK.objhdr.locks 1198300 15.68 Lock Operations
LCK.objhdr.dbg_busy 0 0.00 Contended lock operations
LCK.objhdr.dbg_try_fail 0 0.00 Contended trylock operations
LCK.perpool.creat 2 0.00 Created locks
LCK.perpool.destroy 0 0.00 Destroyed locks
LCK.perpool.locks 1114732 14.59 Lock Operations
LCK.perpool.dbg_busy 0 0.00 Contended lock operations
LCK.perpool.dbg_try_fail 0 0.00 Contended trylock operations
LCK.pipestat.creat 1 0.00 Created locks
LCK.pipestat.destroy 0 0.00 Destroyed locks
LCK.pipestat.locks 0 0.00 Lock Operations
LCK.pipestat.dbg_busy 0 0.00 Contended lock operations
LCK.pipestat.dbg_try_fail 0 0.00 Contended trylock operations
LCK.probe.creat 1 0.00 Created locks
LCK.probe.destroy 0 0.00 Destroyed locks
LCK.probe.locks 136397 1.78 Lock Operations
LCK.probe.dbg_busy 0 0.00 Contended lock operations
LCK.probe.dbg_try_fail 0 0.00 Contended trylock operations
LCK.sess.creat 279475 3.66 Created locks
LCK.sess.destroy 279472 3.66 Destroyed locks
LCK.sess.locks 395296 5.17 Lock Operations
LCK.sess.dbg_busy 0 0.00 Contended lock operations
LCK.sess.dbg_try_fail 0 0.00 Contended trylock operations
LCK.tcp_pool.creat 3 0.00 Created locks
LCK.tcp_pool.destroy 0 0.00 Destroyed locks
LCK.tcp_pool.locks 228444 2.99 Lock Operations
LCK.tcp_pool.dbg_busy 0 0.00 Contended lock operations
LCK.tcp_pool.dbg_try_fail 0 0.00 Contended trylock operations
LCK.vbe.creat 1 0.00 Created locks
LCK.vbe.destroy 0 0.00 Destroyed locks
LCK.vbe.locks 59699 0.78 Lock Operations
LCK.vbe.dbg_busy 0 0.00 Contended lock operations
LCK.vbe.dbg_try_fail 0 0.00 Contended trylock operations
LCK.vcapace.creat 1 0.00 Created locks
LCK.vcapace.destroy 0 0.00 Destroyed locks
LCK.vcapace.locks 0 0.00 Lock Operations
LCK.vcapace.dbg_busy 0 0.00 Contended lock operations
LCK.vcapace.dbg_try_fail 0 0.00 Contended trylock operations
LCK.vcl.creat 1 0.00 Created locks
LCK.vcl.destroy 0 0.00 Destroyed locks
LCK.vcl.locks 120257 1.57 Lock Operations
LCK.vcl.dbg_busy 0 0.00 Contended lock operations
LCK.vcl.dbg_try_fail 0 0.00 Contended trylock operations
LCK.vxid.creat 1 0.00 Created locks
LCK.vxid.destroy 0 0.00 Destroyed locks
LCK.vxid.locks 59 0.00 Lock Operations
LCK.vxid.dbg_busy 0 0.00 Contended lock operations
LCK.vxid.dbg_try_fail 0 0.00 Contended trylock operations
LCK.waiter.creat 2 0.00 Created locks
LCK.waiter.destroy 0 0.00 Destroyed locks
LCK.waiter.locks 209639 2.74 Lock Operations
LCK.waiter.dbg_busy 0 0.00 Contended lock operations
LCK.waiter.dbg_try_fail 0 0.00 Contended trylock operations
LCK.wq.creat 1 0.00 Created locks
LCK.wq.destroy 0 0.00 Destroyed locks
LCK.wq.locks 110685 1.45 Lock Operations
LCK.wq.dbg_busy 0 0.00 Contended lock operations
LCK.wq.dbg_try_fail 0 0.00 Contended trylock operations
LCK.wstat.creat 1 0.00 Created locks
LCK.wstat.destroy 0 0.00 Destroyed locks
LCK.wstat.locks 447231 5.85 Lock Operations
LCK.wstat.dbg_busy 0 0.00 Contended lock operations
LCK.wstat.dbg_try_fail 0 0.00 Contended trylock operations
MEMPOOL.busyobj.live 0 . In use
MEMPOOL.busyobj.pool 10 . In Pool
MEMPOOL.busyobj.sz_wanted 65536 . Size requested
MEMPOOL.busyobj.sz_actual 65504 . Size allocated
MEMPOOL.busyobj.allocs 57754 0.76 Allocations
MEMPOOL.busyobj.frees 57754 0.76 Frees
MEMPOOL.busyobj.recycle 57754 0.76 Recycled from pool
MEMPOOL.busyobj.timeout 5429 0.07 Timed out from pool
MEMPOOL.busyobj.toosmall 0 0.00 Too small to recycle
MEMPOOL.busyobj.surplus 0 0.00 Too many for pool
MEMPOOL.busyobj.randry 0 0.00 Pool ran dry
MEMPOOL.req0.live 7 . In use
MEMPOOL.req0.pool 10 . In Pool
MEMPOOL.req0.sz_wanted 65536 . Size requested
MEMPOOL.req0.sz_actual 65504 . Size allocated
MEMPOOL.req0.allocs 139735 1.83 Allocations
MEMPOOL.req0.frees 139728 1.83 Frees
MEMPOOL.req0.recycle 139735 1.83 Recycled from pool
MEMPOOL.req0.timeout 15351 0.20 Timed out from pool
MEMPOOL.req0.toosmall 0 0.00 Too small to recycle
MEMPOOL.req0.surplus 0 0.00 Too many for pool
MEMPOOL.req0.randry 0 0.00 Pool ran dry
MEMPOOL.sess0.live 7 . In use
MEMPOOL.sess0.pool 10 . In Pool
MEMPOOL.sess0.sz_wanted 768 . Size requested
MEMPOOL.sess0.sz_actual 736 . Size allocated
MEMPOOL.sess0.allocs 139735 1.83 Allocations
MEMPOOL.sess0.frees 139728 1.83 Frees
MEMPOOL.sess0.recycle 139735 1.83 Recycled from pool
MEMPOOL.sess0.timeout 15545 0.20 Timed out from pool
MEMPOOL.sess0.toosmall 0 0.00 Too small to recycle
MEMPOOL.sess0.surplus 0 0.00 Too many for pool
MEMPOOL.sess0.randry 0 0.00 Pool ran dry
LCK.sma.creat 2 0.00 Created locks
LCK.sma.destroy 0 0.00 Destroyed locks
LCK.sma.locks 419436 5.49 Lock Operations
LCK.sma.dbg_busy 0 0.00 Contended lock operations
LCK.sma.dbg_try_fail 0 0.00 Contended trylock operations
SMA.s0.c_req 247646 3.24 Allocator requests
SMA.s0.c_fail 184958 2.42 Allocator failures
SMA.s0.c_bytes 277654852719 3632894.39 Bytes allocated
SMA.s0.c_freed 252830291671 3308084.62 Bytes freed
SMA.s0.g_alloc 7705 . Allocations outstanding
SMA.s0.g_bytes 24824561048 . Bytes outstanding
SMA.s0.g_space 945242728 . Bytes available
MEMPOOL.req1.live 3 . In use
MEMPOOL.req1.pool 10 . In Pool
MEMPOOL.req1.sz_wanted 65536 . Size requested
MEMPOOL.req1.sz_actual 65504 . Size allocated
MEMPOOL.req1.allocs 139750 1.83 Allocations
MEMPOOL.req1.frees 139747 1.83 Frees
MEMPOOL.req1.recycle 139750 1.83 Recycled from pool
MEMPOOL.req1.timeout 15237 0.20 Timed out from pool
MEMPOOL.req1.toosmall 0 0.00 Too small to recycle
MEMPOOL.req1.surplus 0 0.00 Too many for pool
MEMPOOL.req1.randry 0 0.00 Pool ran dry
SMA.Transient.c_req 70289 0.92 Allocator requests
SMA.Transient.c_fail 0 0.00 Allocator failures
SMA.Transient.c_bytes 495325314 6480.94 Bytes allocated
SMA.Transient.c_freed 484306754 6336.77 Bytes freed
SMA.Transient.g_alloc 23771 . Allocations outstanding
SMA.Transient.g_bytes 11018560 . Bytes outstanding
SMA.Transient.g_space 0 . Bytes available
MEMPOOL.sess1.live 3 . In use
MEMPOOL.sess1.pool 10 . In Pool
MEMPOOL.sess1.sz_wanted 768 . Size requested
MEMPOOL.sess1.sz_actual 736 . Size allocated
MEMPOOL.sess1.allocs 139750 1.83 Allocations
MEMPOOL.sess1.frees 139747 1.83 Frees
MEMPOOL.sess1.recycle 139750 1.83 Recycled from pool
MEMPOOL.sess1.timeout 15520 0.20 Timed out from pool
MEMPOOL.sess1.toosmall 0 0.00 Too small to recycle
MEMPOOL.sess1.surplus 0 0.00 Too many for pool
MEMPOOL.sess1.randry 0 0.00 Pool ran dry
VBE.reload_20210201_164825_46758.backend_example1_www.happy 18446744073709551615 . Happy health probes
VBE.reload_20210201_164825_46758.backend_example1_www.bereq_hdrbytes 22529513 294.78 Request header bytes
VBE.reload_20210201_164825_46758.backend_example1_www.bereq_bodybytes 15432 0.20 Request body bytes
VBE.reload_20210201_164825_46758.backend_example1_www.beresp_hdrbytes 12154726 159.03 Response header bytes
VBE.reload_20210201_164825_46758.backend_example1_www.beresp_bodybytes 102136624051 1336377.04 Response body bytes
VBE.reload_20210201_164825_46758.backend_example1_www.pipe_hdrbytes 0 0.00 Pipe request header bytes
VBE.reload_20210201_164825_46758.backend_example1_www.pipe_out 0 0.00 Piped bytes to backend
VBE.reload_20210201_164825_46758.backend_example1_www.pipe_in 0 0.00 Piped bytes from backend
VBE.reload_20210201_164825_46758.backend_example1_www.conn 0 . Concurrent connections to backend
VBE.reload_20210201_164825_46758.backend_example1_www.req 35534 0.46 Backend requests sent
VBE.reload_20210201_164825_46758.backend_example1_www.unhealthy 0 0.00 Fetches not attempted due to backend being unhealthy
VBE.reload_20210201_164825_46758.backend_example1_www.busy 0 0.00 Fetches not attempted due to backend being busy
VBE.reload_20210201_164825_46758.backend_example1_www.fail 0 0.00 Connections failed
VBE.reload_20210201_164825_46758.backend_example1_www.fail_eacces 0 0.00 Connections failed with EACCES or EPERM
VBE.reload_20210201_164825_46758.backend_example1_www.fail_eaddrnotavail 0 0.00 Connections failed with EADDRNOTAVAIL
VBE.reload_20210201_164825_46758.backend_example1_www.fail_econnrefused 0 0.00 Connections failed with ECONNREFUSED
VBE.reload_20210201_164825_46758.backend_example1_www.fail_enetunreach 0 0.00 Connections failed with ENETUNREACH
VBE.reload_20210201_164825_46758.backend_example1_www.fail_etimedout 0 0.00 Connections failed ETIMEDOUT
VBE.reload_20210201_164825_46758.backend_example1_www.fail_other 0 0.00 Connections failed for other reason
VBE.reload_20210201_164825_46758.backend_example1_www.helddown 0 0.00 Connection opens not attempted
VBE.reload_20210201_164825_46758.backend_example1_staging_www.happy 18446744073709551615 . Happy health probes
VBE.reload_20210201_164825_46758.backend_example1_staging_www.bereq_hdrbytes 0 0.00 Request header bytes
VBE.reload_20210201_164825_46758.backend_example1_staging_www.bereq_bodybytes 0 0.00 Request body bytes
VBE.reload_20210201_164825_46758.backend_example1_staging_www.beresp_hdrbytes 0 0.00 Response header bytes
VBE.reload_20210201_164825_46758.backend_example1_staging_www.beresp_bodybytes 0 0.00 Response body bytes
VBE.reload_20210201_164825_46758.backend_example1_staging_www.pipe_hdrbytes 0 0.00 Pipe request header bytes
VBE.reload_20210201_164825_46758.backend_example1_staging_www.pipe_out 0 0.00 Piped bytes to backend
VBE.reload_20210201_164825_46758.backend_example1_staging_www.pipe_in 0 0.00 Piped bytes from backend
VBE.reload_20210201_164825_46758.backend_example1_staging_www.conn 0 . Concurrent connections to backend
VBE.reload_20210201_164825_46758.backend_example1_staging_www.req 0 0.00 Backend requests sent
VBE.reload_20210201_164825_46758.backend_example1_staging_www.unhealthy 0 0.00 Fetches not attempted due to backend being unhealthy
VBE.reload_20210201_164825_46758.backend_example1_staging_www.busy 0 0.00 Fetches not attempted due to backend being busy
VBE.reload_20210201_164825_46758.backend_example1_staging_www.fail 0 0.00 Connections failed
VBE.reload_20210201_164825_46758.backend_example1_staging_www.fail_eacces 0 0.00 Connections failed with EACCES or EPERM
VBE.reload_20210201_164825_46758.backend_example1_staging_www.fail_eaddrnotavail 0 0.00 Connections failed with EADDRNOTAVAIL
VBE.reload_20210201_164825_46758.backend_example1_staging_www.fail_econnrefused 0 0.00 Connections failed with ECONNREFUSED
VBE.reload_20210201_164825_46758.backend_example1_staging_www.fail_enetunreach 0 0.00 Connections failed with ENETUNREACH
VBE.reload_20210201_164825_46758.backend_example1_staging_www.fail_etimedout 0 0.00 Connections failed ETIMEDOUT
VBE.reload_20210201_164825_46758.backend_example1_staging_www.fail_other 0 0.00 Connections failed for other reason
VBE.reload_20210201_164825_46758.backend_example1_staging_www.helddown 0 0.00 Connection opens not attempted
Can you confirm that I properly disabled THP, and that this huge memory consumption thus seems unrelated to THP? Is this unexpected, a bug? Do you think it could matter whether it's an AMD or an Intel host?
@idl0r This is at least very interesting and, based on what I said before, obviously a scenario I have not seen yet.
I have to say that for our clients, globally disabling THP had not been an option, so the cases I have seen so far were solved by using MALLOC_CONF=thp:never
(set as an environment variable when starting varnish or exported in a shell script containing the varnish command line). So maybe I was wrong assuming that globally disabling thp should have the same effect.
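A minimal sketch of that environment-variable route (listen address and VCL path are placeholders):
export MALLOC_CONF=thp:never
varnishd -a :6081 -f /etc/varnish/example1.vcl -s malloc,24G
For a systemd service, the equivalent would be Environment="MALLOC_CONF=thp:never" in a drop-in unit.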
That said, could you run
pmap -XX $PID_OF_VARNISH_WORKER
here? $PID_OF_VARNISH_WORKER would, of the two varnishd processes you see, be the pid of the one having the other as the ppid. The worker pid which we need is usually the higher of the two.
Also, please check whether MALLOC_CONF=thp:never works for you instead of globally disabling thp. IIUC this only works for libjemalloc from 5.2.x.
Oh, and one other aspect: Which object sizes do you serve? Do they vary a lot?
We're on the stock Debian libjemalloc2 5.1.0-3. I'll try to figure out whether that setting already works with it or not.
pgrep -f example1.vcl
2334
3355
grep -i ppid /proc/3355/status
PPid: 2334
pmap -XX 3355 &> pmap.txt
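The same lookup as a one-liner (a sketch assuming a single child process; pgrep -o picks the oldest match, i.e. the management process, and pgrep -P lists its children, i.e. the worker):
pmap -XX "$(pgrep -P "$(pgrep -o -f example1.vcl)")" > pmap.txt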
I've also added an Intel machine there, to see how it behaves, just to rule that out.
The object sizes: I've parsed yesterday's log and picked the top 50 sizes (rounded up). They are mostly <5MB, with some >100MB ones; the largest of the top 50 is ~326MB. After the top 50 requests, the mean sizes grow. I can provide more details if needed.
Thank you for the details @idl0r. Right now I need to prioritize other work, but I will look later. The object sizes you report actually do vary quite a lot, so it could be we see "first order" fragmentation (while the thp/jemalloc suspicion would be "second order", outside varnish). To check whether that is the case, you could try setting fetch_maxchunksize considerably smaller than the default, for example to 8MB (to accommodate the "mostly <5MB" objects without additional impact).
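For anyone trying this, the parameter can be changed on a running instance via the management CLI (a sketch; 8M matches the suggestion above):
varnishadm param.set fetch_maxchunksize 8M
At startup it is the -p fetch_maxchunksize=8M flag, as used later in this thread.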
I believe you need to downgrade to jemalloc 3.6.0.
You can find the sources here: https://github.com/jemalloc/jemalloc/tags
After you compile, you can preload it in like so:
export LD_PRELOAD=/home/user/jemalloc-3.6.0/lib/libjemalloc.so
You can use this vmod to confirm you are using 3.6.0: https://github.com/rezan/libvmod-jemalloc
Also, if you use this vmod with 5.1, it will provide some metrics which should point to what part of jemalloc is wasting so much memory. In the past, it's been the thread caches.
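A build sketch for the proposed downgrade (jemalloc uses a standard autoconf build; download the 3.6.0 source tarball from the tags page linked above first, paths are placeholders):
tar xjf jemalloc-3.6.0.tar.bz2
cd jemalloc-3.6.0
./configure && make
export LD_PRELOAD=$PWD/lib/libjemalloc.so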
FYI: Intel vs. AMD doesn't matter. Both are affected.
Currently I'm trying -p fetch_maxchunksize=8M with THP enabled (the default) and no special jemalloc option.
I also noticed that MALLOC_CONF=thp:never doesn't throw a warning or error, but changing it to something invalid does. So it might already work with 5.1.0.
Quick update:
The -p fetch_maxchunksize=8M alone doesn't seem to help. It's still growing, currently at ~60G (-s malloc,24G).
I'll try the MALLOC_CONF=thp:never tomorrow morning.
# grep -aPo 'MALLOC_CONF[^\0]+' /proc/685/environ
MALLOC_CONF=thp:never
varnish is still running with -s malloc,24G, without the maxchunksize parameter.
It's currently at about 70G and it seems to grow further.
I am not yet sure, however, whether jemalloc really does what we expect. I just verified it by doing this:
# MALLOC_CONF=thp:never LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 /bin/echo test
test
# MALLOC_CONF=thp:wrong LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 /bin/echo test
<jemalloc>: Invalid conf value: thp:wrong
test
According to https://github.com/jemalloc/jemalloc/commit/e4f090e8df5adf180662c5eeac2af214f9594de4 it's in 5.1.0 as well. I also looked into the tarball and it does indeed seem to be present there. So thp:never should work.
I prepared one host to use jemalloc 3.6.0, as @rezan proposed, to see how that behaves. Data should be available tomorrow morning (CET).
Ok, jemalloc 3.6.0 works perfectly fine, just as you would expect. It was slightly above the 24G limit at peak, which is probably the overhead, but totally fine. Currently it's below 24G, at ~22G. The other host is again at ~86G at the moment.
So there seems to be something seriously wrong with jemalloc, I'd say, or with how Varnish uses it. I dunno. What do you think?
This partially confirms my expectation; the news is that globally disabling thp did not work for you.
Regarding MALLOC_CONF=thp:never, please note that I wrote previously:
IIUC this only works for libjemalloc from 5.2.x
So if it did not work for you with 5.1.x, I'd take that as confirmation.
At least it was implemented in 5.1.0 already; see the commit I linked. I'll build a 5.2.x and try it with and without the MALLOC_CONF.
@idl0r If you can install the jemalloc vmod and capture the stats when you are at ~90GB of resident memory, that would be a very helpful next step.
Thanks guys, I think the update alone solves the problem.
jemalloc 3.6.0 was almost perfect, as there wasn't much overhead; 5.2.1 also solves the problem, though it isn't perfect.
Both hosts I prepared are around 30G, with and without MALLOC_CONF=thp:never.
Would it make sense to add a warning / note somewhere about jemalloc >3.6.0 and <5.2.1? And since Varnish has official repos / packages, would it make sense to provide an up-to-date jemalloc package as well?
What was your final result with 5.2.1? How much did you configure in Varnish and how much was actually allocated?
-s malloc,24G. It ended up with something between 30 and 32GB.
Ok, that's about ~30% waste; not great, but better than 300%.
With 3.6.0 it was even somewhat below the 24G limit.
There might be some tunables in 5.2 to get that number lower. You would have to start with that stats dump. The internals are completely different from 3.6.0, so this is uncharted territory for me.
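Possible starting points for such tuning (a sketch; these are standard jemalloc 5.x MALLOC_CONF options, and the values are illustrative, untested in this thread):
export MALLOC_CONF="thp:never,tcache:false,dirty_decay_ms:1000,muzzy_decay_ms:0"
tcache:false disables the thread caches mentioned earlier; the decay options control how quickly unused dirty pages are returned to the OS.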
Timing this out.
Hi,
we're running 6.4.0-1~buster (Debian 10) from https://packagecloud.io/varnishcache and the system has 128G RAM.
varnishd, or rather the vcache user's threads, are very hungry and took ~92% of the system memory at the time of writing this issue.
[atop screenshot]
It also looks like varnish kills or restarts the threads at some point: https://ibb.co/nCygwX0
Right now, due to a limited event, we're caching quite a lot of .mp4 files, some of which may be around or above 1G.