haproxytech / dataplaneapi

HAProxy Data Plane API
https://www.haproxy.com/documentation/dataplaneapi/
Apache License 2.0

Empty HAPROXY_CFGFILES environment variable causing managed dataplaneapi process to exit #318

Open TonyNguyenAdNet opened 9 months ago

TonyNguyenAdNet commented 9 months ago

Hi Everyone, hoping for some help!

High level goal:

Configure HAProxy via split config files, using -f conf.d/, similar to the *.d/ pattern used by OS service configurations, while using dataplaneapi (latest 2.8 version) for basic tasks such as draining/enabling a backend server via remote scripting [1]. We are currently running into an issue where dataplaneapi, invoked via program/command, errors out due to what it believes is a missing configuration file:

time="2023-11-25T17:19:21-08:00" level=fatal msg="The configuration file is not declared in the HAPROXY_CFGFILES environment variable, cannot start."

... however, the initial HAProxy config file is specified within the .yml file like so:

haproxy:
  config_file: /f/haproxy-h3-9200/conf.d/00-haproxy.cfg

We have also tried the command-line argument, which results in the same error: --config-file /f/haproxy-h3-9200/conf.d/00-haproxy.cfg
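
For completeness, the program section that spawns dataplaneapi looks roughly like this (a sketch; the yml path is illustrative and points at the file containing the haproxy: block above):

program api
  # dataplaneapi reads its own settings, including haproxy.config_file
  # shown above, from this yml file
  command dataplaneapi -f /etc/haproxy/dataplaneapi.yml
  no option start-on-reload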

Note that even though we are specifying the first/primary config file (00-haproxy.cfg) to dataplaneapi, when starting dataplaneapi independently of HAProxy (i.e., not via program+command), the desired management functionality works for the backend specified in the second file (01-haproxy.cfg). It's only when HAProxy manages the dataplaneapi process that we see the above error.
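
For reference, the standalone invocation that works is simply the same command run by hand in a shell, outside the master process (yml path illustrative, as above):

# run manually rather than under HAProxy's master process: no fatal error,
# and the backend from 01-haproxy.cfg is manageable via the API
dataplaneapi -f /etc/haproxy/dataplaneapi.yml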

Doing some testing, it appears that HAProxy is not setting a value for this environment variable when spawning dataplaneapi via the program directive. As a simple test, I replaced dataplaneapi with a simple echo command, as shown in the examples below. I'm testing both our current HAProxy production version 2.2.14 and the latest 2.8.4, and as the output shows, the value of HAPROXY_CFGFILES is empty when using the echo command. Interestingly, switching to printenv shows this variable being set! So perhaps the "${VAR}" references in the command line are expanded by HAProxy's config parser at parse time, before HAPROXY_CFGFILES has been exported, while printenv reads the environment the spawned process actually inherits at runtime.
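
One way to check that parse-time hypothesis (a sketch, assuming HAProxy's usual quoting rules: variables inside double quotes are expanded by the config parser, while single-quoted strings are passed through untouched) is to defer the expansion to a shell at runtime:

program echo-runtime
  # single quotes keep HAProxy's parser from expanding the variable at
  # parse time; the shell expands it when the program actually runs
  command /bin/sh -c 'echo "HAPROXY_CFGFILES: ${HAPROXY_CFGFILES}"'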

Configuration Overview:

conf.d/00-haproxy.cfg and conf.d/01-haproxy.cfg together hold the contents of a previously working single configuration file, with 00-haproxy.cfg containing the global/defaults sections and 01-haproxy.cfg containing a single listen section. Setting dataplaneapi aside, HAProxy itself works fine with this configuration: the single listen section forwards incoming requests to the appropriate backend servers as expected.

program config:

command echo "HAPROXY_CFGFILES: ${HAPROXY_CFGFILES}, PATH: ${PATH}, HAPROXY_SERVER_NAME: ${HAPROXY_SERVER_NAME}, HAPROXY_LOCALPEER: ${HAPROXY_LOCALPEER}"

With 2.2.14:

HAProxy invoke command1 (multiple config files specified):

/f/haproxy-9200/bin/haproxy-9200 -Ws -f /f/haproxy-9200/conf.d/00-haproxy.cfg -f /f/haproxy-9200/conf.d/01-haproxy.cfg -p /run/haproxy-9200.pid -S /run/haproxy-9200-master.sock

HAProxy invoke command2 (config directory specified):

/f/haproxy-9200/bin/haproxy-9200 -Ws -f /f/haproxy-9200/conf.d -p /run/haproxy-9200.pid -S /run/haproxy-9200-master.sock

Output is the same for both invocations:

HAPROXY_CFGFILES: , PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/f/haproxy-9200/bin:/f/haproxy-9201/bin:/f/haproxy-adserver/bin:/opt/puppetlabs/bin:/root/bin, HAPROXY_SERVER_NAME: , HAPROXY_LOCALPEER: haproxy.my.server

With 2.8.4:

HAProxy invoke command1 (multiple config files specified):

/f/haproxy-h3-9200/bin/haproxy-h3-9200 -Ws -f /f/haproxy-h3-9200/conf.d/00-haproxy.cfg -f /f/haproxy-h3-9200/conf.d/01-haproxy.cfg -p /run/haproxy-h3-9200.pid -S /run/haproxy-h3-9200-master.sock

HAProxy invoke command2 (config directory specified):

/f/haproxy-h3-9200/bin/haproxy-h3-9200 -Ws -f /f/haproxy-h3-9200/conf.d -p /run/haproxy-h3-9200.pid -S /run/haproxy-h3-9200-master.sock

Output is the same for both invocations:

HAPROXY_CFGFILES: , PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/f/haproxy-h3-9200/bin:/root/bin, HAPROXY_SERVER_NAME: , HAPROXY_LOCALPEER: haproxy.my.server

Finally, with command printenv I see the following output on both HAProxy versions (the variable is set):

LS_COLORS=<redacted>
LANG=en_US.UTF-8
HISTCONTROL=ignoredups
HOSTNAME=<redacted>
XDG_SESSION_ID=762
USER=root
PWD=/f/haproxy-9200
HOME=/root
SSH_CLIENT=<redacted>
SSH_TTY=/dev/pts/0
MAIL=/var/spool/mail/root
TERM=xterm-256color
SHELL=/bin/bash
SHLVL=1
LOGNAME=root
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/0/bus
XDG_RUNTIME_DIR=/run/user/0
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/f/haproxy-9200/bin:/f/haproxy-9201/bin:/f/haproxy-adserver/bin:/opt/puppetlabs/bin:/root/bin
HISTSIZE=1000
LESSOPEN=||/usr/bin/lesspipe.sh %s
_=/f/haproxy-9200/bin/haproxy-9200
OLDPWD=/root
HAPROXY_LOCALPEER=<redacted>
HAPROXY_CFGFILES=/f/haproxy-h3-9200/conf.d/00-haproxy.cfg;/f/haproxy-h3-9200/conf.d/01-haproxy.cfg                              
HAPROXY_MWORKER=1
HAPROXY_CLI=unix@/f/haproxy-9200/haproxy-9200.sock;sockpair@7
HAPROXY_MASTER_CLI=unix@/run/haproxy-9200-master.sock

Any insight here is appreciated! We are very close to having our preferred configuration setup that is a balance between previous management styles and utilizing the latest dataplaneapi functionality.

[1] It is understood that the developers of dataplaneapi want to move administration completely under dataplaneapi's control, i.e., having it rewrite the configuration files; however, our production practices need to move at a slower pace, keeping our current Ansible/Puppet management style, which relies on those tools to manage the config files. We will ultimately embrace dataplaneapi for management completely, but now is not that time.

KiyoIchikawa commented 1 month ago

@TonyNguyenAdNet, we recently experienced the same issues running HAProxy v2.8.5 and Dataplaneapi v2.8.7 on OEL9. It looks like adding --disable-inotify to the command in the program section of your HAProxy config might resolve the issue. So, ours looks something like this:

program api
  command dataplaneapi -f /etc/haproxy/dataplaneapi.yml --disable-inotify
  no option start-on-reload

We no longer get the hash at the top of the HAProxy config file, and it did not re-arrange anything.

EDIT: This can also be set as a config option in the yml: disable_inotify: true
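
For reference, in the yml it sits under the dataplaneapi section, e.g. (host/port values illustrative):

dataplaneapi:
  disable_inotify: true
  host: 0.0.0.0
  port: 5555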

tommyjcarpenter commented 1 month ago

I have this same issue. The --disable-inotify solution above did not work.

I'm on haproxy 3.0.3

I'm following the exact instructions from here, in a Kubernetes pod: https://www.haproxy.com/documentation/haproxy-data-plane-api/installation/install-on-haproxy/#run-the-api-in-the-current-terminal-session

I'm mounting in a file:


---
apiVersion: v1
kind: ConfigMap
metadata:
  name: dataplane-conf
  annotations:
    argocd.argoproj.io/sync-wave: '-1'
data:
  dataplaneapi.yaml: |
    dataplaneapi:
      disable_inotify: true
      host: 0.0.0.0
      port: 5555
      transaction:
        transaction_dir: /tmp/haproxy
      user:
      - insecure: false
        password: xxx
        name: admin
    haproxy:
      config_file: /usr/local/etc/haproxy/haproxy.cfg
      haproxy_bin: /usr/sbin/haproxy
      reload:
        reload_delay: 5
        reload_cmd: service haproxy reload
        restart_cmd: service haproxy restart

and in the main haproxy.cfg I have:

program api
      command /opt/bitnami/haproxy-dataplaneapi/bin/dataplaneapi -f /usr/local/etc/haproxy-dataplane/dataplaneapi.yaml --disable-inotify

but:

[NOTICE]   (1) : New program 'api' (8) forked
[NOTICE]   (1) : New worker (9) forked
[NOTICE]   (1) : Loading success.
time="2024-07-30T11:10:51Z" level=fatal msg="The configuration file is not declared in the HAPROXY_CFGFILES environment variable, cannot start."
[NOTICE]   (1) : haproxy version is 3.0.3-95a607c
[NOTICE]   (1) : path to executable is /opt/bitnami/haproxy/sbin/haproxy
[ALERT]    (1) : Current program 'api' (8) exited with code 1 (Exit)
[ALERT]    (1) : exit-on-failure: killing every processes with SIGTERM
[ALERT]    (1) : Current worker (9) exited with code 143 (Terminated)
[WARNING]  (1) : All workers exited. Exiting... (1)

I'm going to try setting this ENV, but as the posters said above,

haproxy:
      config_file: /usr/local/etc/haproxy/haproxy.cfg

seems to just not work.

KiyoIchikawa commented 1 month ago

@tommyjcarpenter my situation involved pointing dataplaneapi back at the actual HAProxy configuration file, as well as setting disable_inotify to true.
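
In yml terms, something like this (paths illustrative):

dataplaneapi:
  disable_inotify: true
haproxy:
  config_file: /etc/haproxy/haproxy.cfg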

tommyjcarpenter commented 1 month ago

@KiyoIchikawa can you please elaborate on the first part? I tried disable_inotify and it didn't work.

I'm doing this in Kubernetes, for which there are no instructions, but I thought it should work the same using the program/master-worker approach above.

I'm currently giving the sidecar approach a shot instead.

tommyjcarpenter commented 1 month ago

So, I've only now realized that while my sidecar loads and handles reads fine, it can't restart the haproxy process, because it's in a separate container 😓 This should have been obvious... 20/20 hindsight.

So I'd like to circle back here, because I have tried a million ways to solve this and I keep hitting the same issue.

I'm trying to run dataplaneapi in the same container as haproxy.

In my haproxy config, I have


    program echo1
      command cat /etc/haproxy/dataplaneapi.yaml

    program echo2
      command cat /etc/haproxy/haproxy.cfg

    program api
      command dataplaneapi -f /etc/haproxy/dataplaneapi.yaml
      no option start-on-reload

Both of those echo programs (for debugging) print out fine.

My dataplaneapi.yaml looks like:

    dataplaneapi:
      log_level: debug
      host: 0.0.0.0
      port: 5555
      transaction:
        transaction_dir: /tmp/haproxy
      user:
      - insecure: false
        password: $5$qyiFlUI2hW4JRlpv$gPAZo98q.t0ItrYZ6wyFJxfpw0n6k2VUcBUVgwt3Vj1
        name: admin
    haproxy:
      haproxy_bin: /opt/bitnami/haproxy/sbin/haproxy
      # config_file is the default 
      reload:
        reload_delay: 5
        reload_cmd: service haproxy reload
        restart_cmd: service haproxy restart

I have tried /usr/etc/local/haproxy etc., I've tried specifying the config explicitly (I'm using the default now), and I've tried explicitly setting HAPROXY_CFGFILES=..., but all to no avail.
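
For concreteness, the explicit-set attempt takes a form like this wrapper, invoked from the program section instead of dataplaneapi directly (a sketch; paths match the container layout above, and the exact value I used is elided in the line above):

#!/bin/sh
# export the variable dataplaneapi checks, then exec the real binary
export HAPROXY_CFGFILES=/etc/haproxy/haproxy.cfg
exec dataplaneapi -f /etc/haproxy/dataplaneapi.yaml

It still ends the same way: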

[NOTICE]   (1) : haproxy version is 3.0.3-95a607c
[NOTICE]   (1) : path to executable is /opt/bitnami/haproxy/sbin/haproxy
[ALERT]    (1) : Current program 'echo1' (8) exited with code 0 (Exit)
[ALERT]    (1) : Current program 'echo2' (9) exited with code 0 (Exit)
time="2024-07-31T19:36:57Z" level=fatal msg="The configuration file is not declared in the HAPROXY_CFGFILES environment variable, cannot start."
[ALERT]    (1) : Current program 'api' (10) exited with code 1 (Exit)
[ALERT]    (1) : exit-on-failure: killing every processes with SIGTERM
[ALERT]    (1) : Current worker (11) exited with code 143 (Terminated)
[WARNING]  (1) : All workers exited. Exiting... (1)

D1StrX commented 1 month ago

Maybe related, but HAProxy is mixing things up. When executing dataplaneapi version, I get:

configuration file /etc/haproxy/dataplaneapi.yaml does not exists, creating one
configuration error: the custom reload strategy requires these options to be set: ReloadCmd, RestartCmd 

and /etc/haproxy/dataplaneapi.yaml never gets created.

Thing is, the dataplaneapi config is stored in /etc/haproxy/dataplaneapi.yml, not /etc/haproxy/dataplaneapi.yaml, and it should be stored in /etc/haproxy/dataplaneapi.yml according to their own documentation.