varnish / varnish-modules

Collection of Varnish Cache modules (vmods) by Varnish Software
Other
182 stars 86 forks source link

Varnish child process restarting with saintmode active #54

Closed lifaguiar closed 7 years ago

lifaguiar commented 7 years ago

I have the following configuration: -Varnish 5.0.0 -CentOS Linux release 7.3.1611 (Core) -Varnish Modules: latest from github.

VCL

default.vcl
vcl 4.0;
import std;
import directors;
import saintmode;
import cookie;
include "./backends.vcl"; //<-- Here I defined the backends in director modules.
sub vcl_recv {
}
(...)
sub vcl_fetch {
       (...)
        //Saint Mode for 10 seconds
        if (beresp.status >= 500) {
                saintmode.blacklist(10s);
                return (retry);
        }
}

I realized that althought Varnish main cache process stills keeps running fine. The child process is restarted on every saintmode retry. If I comment the saint mode retry code, the child process stops to crash.

VCL panic from varnishadmin:

Panic at: Thu, 26 Jan 2017 00:26:07 GMT
"Assert error in vmod_blacklist(), vmod_saintmode.c line 107:
  Condition(((sm_objs)) != 0) not true.
thread = (cache-worker)
version = varnish-5.0.0 revision 99d036f
ident = Linux,3.10.0-514.2.2.el7.x86_64,x86_64,-jnone,-smalloc,-smalloc,-hcritbit,epoll
Backtrace:
  0x434d86: varnishd() [0x434d86]
  0x7f878d3e9763: libvmod_saintmode.so(vmod_blacklist+0x313) [0x7f878d3e9763]
  0x7f879c2ad32d: vgc.so(VGC_function_vcl_backend_response+0x25d) [0x7f879c2ad32d]
  0x441c4b: varnishd() [0x441c4b]
  0x44357d: varnishd(VCL_backend_response_method+0x5d) [0x44357d]
  0x4247e5: varnishd() [0x4247e5]
  0x44c631: varnishd() [0x44c631]
  0x44ca7b: varnishd() [0x44ca7b]
  0x7f87abd9ddc5: libpthread.so.0(+0x7dc5) [0x7f87abd9ddc5]
  0x7f87abacc73d: libc.so.6(clone+0x6d) [0x7f87abacc73d]
busyobj = 0x7f8778080960 {
  ws = 0x7f87780809e0 {
    id = \"bo\",
    {s, f, r, e} = {0x7f8778082898, +3304, (nil), +57504},
  },
  retries = 0, failed = 0, flags = {do_gzip, do_stream},
  http_conn = 0x7f8778082c40 {
    fd = 37,
    doclose = REQ_CLOSE,
    ws = 0x7f87780809e0,
    {rxbuf_b, rxbuf_e} = {0x7f8778082ce8, 0x7f8778082dc7},
    {pipeline_b, pipeline_e} = {0x7f8778082dc7, 0x7f87780834ef},
    content_length = 1832,
    body_status = length,
    first_byte_timeout = 60.000000,
    between_bytes_timeout = 60.000000,
  },
  director_req = 0x104ed00 {
    vcl_name = agile_80,
    type = round-robin {
    },
  },
  director_resp = 0x104cb78 {
    vcl_name = agile_80_backend_rbs_aaa_bb_80,
    type = backend {
      display_name = boot.agile_80_backend_rbs_aaa_bb_80,
      ipv4 = 10.240.64.74,
      port = 80,
      hosthdr = backend.rbs.aaa.bb,
      health = healthy,
      admin_health = probe, changed = 1485390360.440884,
      n_conn = 1,
    },
  },
  http[bereq] = 0x7f8778080fa8 {
    ws[bo] = 0x7f87780809e0,
    hdrs {
      \"GET\",
      \"/su/commands/getComments\",
      \"HTTP/1.1\",
      \"Host: aovivo.clicrbs.com.br\",
      \"Accept: */*\",
      \"X-Requested-With: XMLHttpRequest\",
      \"User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36\",
      \"Referer: http://myhost.domain.aaa.bb/uri/\",
      \"Accept-Language: pt-BR,pt;q=0.8,en-US;q=0.6,en;q=0.4\",
      \"Cookie: JSESSIONID=0A3F9AF4D3EAF80FD50C6903A4782CF1; __mdplayer=0; __agileplayers=%5B%5D;\",
      \"X-Forwarded-For: 177.66.155.44\",
      \"ClientIP: 177.66.155.44\",
      \"X-UA-Device: pc\",
      \"X-Paywall: none\",
      \"X-Vary-Cookie: none\",
      \"X-Group: AGILE\",
      \"X-Backend-Found: true\",
      \"X-Varnish: 360461\",
      \"Accept-Encoding: identity\",
    },
  },
  http[beresp] = 0x7f8778081420 {
    ws[bo] = 0x7f87780809e0,
    hdrs {
      \"HTTP/1.1\",
      \"500\",
      \"Internal Server Error\",
      \"Date: Thu, 26 Jan 2017 00:26:06 GMT\",
      \"Server: Apache\",
      \"Origin-Server: AINTREE\",
      \"Vary: Accept-Encoding\",
      \"Content-Length: 1832\",
      \"Connection: close\",
      \"Content-Type: text/html;charset=ISO-8859-1\",
      \"X-URL: /su/commands/getComments\",
      \"X-Host: myhost\",
      \"X-Group: AGILE\",
      \"Cache-Control: max-age=30\",
    },
  },
  objcore[fetch] = 0x7f872c04c4b0 {
    refcnt = 2,
    flags = {busy},
    exp_flags = {},
    boc = 0x7f872c047fe0 {
      refcnt = 2,
      state = req_done,
      vary = (nil),
      stevedore_priv = (nil),
    },
    exp = {1485390366.174212, 30.000000, 1800.000000, 0.000000}
    objhead = 0x7f872c048050,
    stevedore = (nil),
  },
  vcl = {
    name = \"boot\"
    busy = 25
    discard = 0,
    state = auto,
    temp = warm,
    conf = {
      srcname = {
        \"/etc/varnish/default.vcl\",
        \"Builtin\",
        \"/etc/varnish/acl.vcl\",
        \"/etc/varnish/mobile.vcl\",
        \"/etc/varnish/backends.vcl\",
        \"/etc/varnish/sites.vcl\",
        \"/etc/varnish/groups/group-AGILE-config.vcl\",
      },
    },
  },
},
gquintard commented 7 years ago

That's a bug for sure, but I'm guessing your VCL doesn't create any saintmode director, so it doesn't make sense to blacklist.

lifaguiar commented 7 years ago

Thank you @gquintard ! You were absolutely right! See this:

new image_7778 = directors.round_robin(); image_7778.add_backend(image_7778_camus_rbs_com_br_7778); (...)

new agencia_7778 = directors.hash(); agencia_7778.add_backend(agencia_7778_enzo_rbs_com_br_7778, 1); (...)

gquintard commented 7 years ago

Glad I could help :-)

It's still a bug, vmod shouldn't crash Varnish, I'm reopening it.