haproxytech / spoa-mirror

Mirror HTTP requests using the HAProxy SPOP
GNU Lesser General Public License v2.1

spoa-mirror sanity check #26

Closed by ripesensor 3 years ago

ripesensor commented 3 years ago

I've found that killing the spoa-mirror binary takes my test setup down: HAProxy stops serving traffic for the site we'd like to mirror. Is this behavior expected for my config, or for the program in general?

UPDATE: Traffic does come back after a long wait (~30s), so a timeout somewhere seems to be blocking "normal" traffic. I will dig a bit more here.

HAProxy version 2.4.3-4dd5a5a 2021/08/17 - https://haproxy.org/
Status: long-term supported branch - will stop receiving fixes around Q2 2026.
Known bugs: http://www.haproxy.org/bugs/bugs-2.4.3.html
Running on: Linux 5.8.0-1041-aws #43~20.04.1-Ubuntu SMP Thu Jul 15 11:07:29 UTC 2021 x86_64

program mirror
    command spoa-mirror --runtime 0 --mirror-url http://mirror:7999 --logfile w:/var/log/spoa-mirror.log

# Frontend Production
listen frontend_production
    bind :7999
    balance roundrobin
    log  global
    # mirror
    filter spoe engine traffic-mirror config /etc/haproxy/mirror.conf
    option http-buffer-request
    # unicorn host
    server localhost localhost:8000 maxconn 96 weight 96 check inter 1000 fall 1 rise 1

# Mirror agents for split-traffic real-world tests
backend spoe-traffic-mirror
    mode tcp
    balance roundrobin
    timeout connect 5s
    timeout server 5s
    server spoa1 127.0.0.1:12345

zaga00 commented 3 years ago

Hello @ripesensor,

There is probably a timeout in play; I don't know of any other cause (except for a bug in HAProxy related to SPOE/SPOP, but that is less likely). Of course, a bug in the spoa-mirror program is also possible, but that should not affect the operation of HAProxy.

I'm sorry, but I have very little time at the moment, so I can’t address this problem right away. In any case, I am interested if you discover something new. Thank you for reporting the problem.
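
For readers looking for where such a timeout might live: the SPOE agent section of the file referenced by the filter line (/etc/haproxy/mirror.conf here) accepts its own timeouts. The sketch below is only an assumption about what that file could look like, since its contents are not shown in this thread. The agent name (spoe-mirror), the message name (mirror), and all timeout values are hypothetical; only the scope name and backend name are taken from the config above. The spoe-message section is omitted and assumed unchanged.

```haproxy
# /etc/haproxy/mirror.conf (hypothetical contents, not taken from the report)
[traffic-mirror]
spoe-agent spoe-mirror                 # agent name is an assumption
    log global
    messages mirror                    # message name is an assumption
    use-backend spoe-traffic-mirror    # backend defined in haproxy.cfg above
    # Illustrative values only: these bound how long HAProxy waits on the agent.
    timeout hello      500ms           # max wait for the agent's HELLO reply
    timeout idle       5s              # max idle time of an agent connection
    timeout processing 1s              # max time the client stream waits on the agent
```

Of the three, timeout processing applies to the stream that handles the client request, so a short value there may limit how long a request is held when the agent does not answer.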

ripesensor commented 3 years ago

Looks like I was just following the documentation a little too closely. The issue seems to be the missing check on the agent's server line in the spoe-traffic-mirror backend, which should read:

    server spoa1 127.0.0.1:12345 check

Without the check, HAProxy blindly assumed the server was up and running until we hit our server timeout in the frontend.
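
Putting the fix in context, the agent backend from the original report might look like the sketch below with health checking enabled. The inter, fall, and rise values are assumptions borrowed from the style of the frontend's server line, not values confirmed in this thread.

```haproxy
# Mirror agents for split-traffic real-world tests
backend spoe-traffic-mirror
    mode tcp
    balance roundrobin
    timeout connect 5s
    timeout server 5s
    # "check" makes HAProxy probe the agent port, so a killed spoa-mirror
    # is marked DOWN instead of being assumed healthy.
    # inter/fall/rise are illustrative: probe every second, one failure
    # marks the server down, one success brings it back.
    server spoa1 127.0.0.1:12345 check inter 1s fall 1 rise 1
```

With values like these, a dead agent should be detected within roughly a second, rather than each request discovering it by running into a timeout.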