jcmoraisjr / haproxy-ingress

HAProxy Ingress
https://haproxy-ingress.github.io
Apache License 2.0
1.02k stars 272 forks source link

Logging #110

Open dobesv opened 6 years ago

dobesv commented 6 years ago

There's some documentation on how to tell this to log to syslog, but that isn't the way logging seems to be done normally on kubernetes. Instead you log to stdout / stderr and it is collected by the logging handler and passed along to your log collector of choice (e.g. fluentd).

I did try to tell this to send logs to fluentd using the syslog protocol without much success, I'm not sure where I messed up the configuration.

I'm not a fan of that approach because it is not as reliable - the kubernetes log collector will buffer logs to a file if fluentd isn't running or falls behind, but fluentd listening for syslog messages will not.

What are you using to collect access logs?

I'd also like to be able to include some headers in the access logs, such as user-agent and some custom app headers, and format it all in JSON for easy parsing by fluentd.

jcmoraisjr commented 6 years ago

Hi, I'm using a Graylog input to collect HAProxy logs. Note that HAProxy uses UDP.

HAProxy process is chrooted before start the event loop, because of that it doesn't have permission to log to stdout and a syslog endpoint is used instead.

A JSON log format would be very nice but I didn't play with yet.

dobesv commented 6 years ago

Okay. I did eventually get the logs into fluentd. Perhaps I'll try and post the configuration here when I get back to it.

On Sat, Mar 10, 2018, 11:53 AM Joao Morais notifications@github.com wrote:

Hi, I'm using a Graylog input to collect HAProxy logs. Note that HAProxy uses UDP.

HAProxy process is chrooted before start the event loop, because of that it doesn't have permission to log to stdout and a syslog endpoint is used instead.

A JSON log format would be very nice but I didn't play with yet.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/jcmoraisjr/haproxy-ingress/issues/110#issuecomment-372060557, or mute the thread https://github.com/notifications/unsubscribe-auth/AAUAmTX9RqiKSc33tWYOMEuojpP9TP1Rks5tdC8lgaJpZM4SjmPI .

zapman449 commented 6 years ago

we strapped an rsyslog sidecar to the ingress pod, and have that logging to stdout.

dobesv commented 6 years ago

Here's the fluentd config I used:

  haproxy-syslog.input.conf: |-
    # Receive logs from HAProxy over syslog protocol
    <source>
      @id haproxy
      @type syslog
      port 5140
      bind 0.0.0.0
      tag haproxy
      <parse>
        @type syslog
        message_format rfc5424
        rfc5424_time_format %Y-%m-%dT%H:%M:%S%z
      </parse>
    </source>
    <filter haproxy.**>
      @type parser
      key_name message
      reserve_data true
      reserve_time true
      emit_invalid_record_to_error false
      <parse>
        @type multi_format
        <pattern>
          format regexp
          # Examples:
          # 64.180.112.143:51066 [09/Mar/2018:19:28:09.404] httpfront-staging.goformative.com~ formative-meteor-80/server0001 0/0/0/1/1 200 368 - - --NI 83/41/0/1/0 0/0 "GET /sockjs/info HTTP/1.1"
          # 64.180.112.143:58523 [09/Mar/2018:21:15:46.366] httpfront-dashboard.staging.kops.goformative.com~ kube-system-kubernetes-dashboard-80/server0000 0/0/0/1/1 200 178 - - ---- 3/1/0/1/0 0/0 "GET /api/appConfig.json HTTP/1.1"
          expression /^(?<remoteAddress>[\w\.]+:\d+) \[(?<requestDate>[^\]]*)\] httpfront-(?<domain>[\w-.]+)~ (?<namespace>kube-system|[\w]+)-(?<service>[\w-]+)(-[\d]+)?\/[\w-]+ (?<requestReadTime>\d+)\/(?<waitTime>\d+)\/(?<backendConnectTime>\d+)\/(?<backendResponseTime>\d+)\/(?<responseTime>\d+) (?<statusCode>\d+) (?<responseBytes>\d+) (?<reqCookie>[\w-]+) (?<resCookie>[\w-]+) (?<terminationState>[\w-]+) (?<actconn>\d+)\/(?<feconn>\d+)\/(?<beconn>\d+)\/(?<srvconn>\d+)\/(?<retries>\d+) (?<srvqueue>\d+)\/(?<backendQueue>\d+) "(?<method>[A-Z]+) (?<url>[^ ]+) (?<httpVersion>[^ ]+)"$/
        </pattern>
        <pattern>
          format regexp
          # Examples:
          # Connect from 172.20.59.142:13201 to 172.20.59.142:31916 (httpfront/HTTP)
          # Connect from 10.0.1.2:33312 to 10.0.3.31:8012 (www/HTTP)
          expression /^Connect from (?<remoteAddress>[\w\.]+:\d+) to (?<backendAddress>[\w\.]+:\d+) \((?<frontend>[\w]+)\/(?<mode>[\w]+)\)$/
        </pattern>
        <pattern>
          format regexp
          # Examples:
          # Server kube-system-fluentd-http-http-input/server0002 is going DOWN for maintenance. 3 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
          # Server kube-system-fluentd-http-http-input/server0001 is going DOWN for maintenance. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
          expression /^Server (?<namespace>kube-system|[\w]+)-(?<service>[\w-]+)\/[\w-]+ is going DOWN for maintenance. (?<remainingActive>\d+) active and (?<remainingBackup>\d+) backup servers left. (?<activeSessions>\d+) sessions active, (?<requeued>\d+) requeued, (?<remainingInQueue>\d+) remaining in queue.$/
        </pattern>
        <pattern>
          format regexp
          # Examples:
          # Server kube-system-fluentd-http-http-input/server0003 is DOWN, reason: Layer4 connection problem, info: \"Connection refused\", check duration: 0ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
          # Server kube-system-fluentd-http-http-input/server0003 is UP, reason: Layer4 check passed, check duration: 0ms. 3 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
          expression /^Server (?<namespace>kube-system|[\w]+)-(?<service>[\w-]+)\/[\w-]+ is (?<status>[\w]+), reason: (?<reason>[^,]+), (info: "(?<info>[^"]+)", )?check duration: (?<checkDuration>[^.]+). (?<remainingActive>\d+) active and (?<remainingBackup>\d+) backup servers (left|online). ((?<activeSessions>\d+) sessions active, )?(?<requeued>\d+) (sessions )?requeued, (?<remainingInQueue>\d+) (remaining|total) in queue.$/
        </pattern>
      </parse>
    </filter>

Then you have to expose that UDP port, like:

        - name: haproxy-syslog
          containerPort: 5140
          protocol: UDP

And create a service:

apiVersion: v1
kind: Service
metadata:
  name: haproxy-syslog
  namespace: kube-system
spec:
  ports:
  - name: haproxy-syslog
    port: 5140
    protocol: UDP
    targetPort: haproxy-syslog
  selector:
    k8s-app: fluentd-es

And point haproxy at it in the config map:

  syslog-endpoint: haproxy-syslog.kube-system:5140

Not sure if this should be dumped into the docs somewhere.

jcmoraisjr commented 6 years ago

I'll leave this open in order to remember to update the examples doc. Thanks for sharing.

styk-tv commented 6 years ago

@dobesv thanks for posting fluentd example. worth opening a separate doc chapter for that.

itsmedivs commented 6 years ago

@jcmoraisjr Can you please document how you used Graylog to collect HAProxy logs

jcmoraisjr commented 6 years ago

Hi, it's pretty straighforward: create a syslog udp input, choose a port, start the input and use ip(graylog host):port(from input) on syslog-endpoint.

shubhasish commented 5 years ago

Access-logs Logging can also be achieved using a netcat sidecar container. Make the sidecar container to listen on 514 port and then configure syslog-endpoint as localhost:514. Now the access logs can be seen using command kubectl logs <ha-proxy ingress pod> access-los

Here are my configurations: Sidecar

      - name: access-log
        image: appropriate/nc
        ports:
        - name: udp
          containerPort: 514
          protocol: UDP
        args: ["-ul","514"]

Configmap


kind: ConfigMap
metadata:
   name: haproxy-configmap
   namespace: ingress-controller
data:
   syslog-endpoint: "localhost:514"```
dcowden commented 5 years ago

@shubhasish Thanks for your tip! I'm using your approach to try to get logs-- i've got something but its not the right output. Would you mind sharing your configuration?

I have a sidecar defined like this:

      - name: access-log
        image: appropriate/nc
        ports:
        - name: udp
          containerPort: 514
          protocol: UDP
        args: ["-ul","514"]  

And i have my haproxy-ingress configmap like this:

apiVersion: v1
kind: ConfigMap
metadata:
  name: haproxy-ingress
  namespace: ops
data:
  balance-algorithm: "leastconn"
  drain-support: "true"
  use-proxy-protcol: "true"
  proxy-body-size: "100m"
  ssl-redirect: "true"
  timeout-connect: "5s"
  timeout-http-request: "300s"
  timeout-queue: "300s"
  timeout-server: "300s"
  timeout-server-fin: "300s"
  syslog-endpoint: "localhost:514"
  http-log-format: default
  https-log-format: default

With this configuration, i get output like this:

[ingress-lb-9vk6v] <134>1 2018-07-13T13:13:45+00:00 ip-172-25-83-6.colinx.com ingress 21 - - default It doesn't appear to have the fields I am expecting.

I'd love to see a working configuration I could use-- that'd be really helpful!

dcowden commented 5 years ago

update: i can see the logs, but I'm finding that they appear only for a short while-- then they stop working.

shubhasish commented 5 years ago

@dcowden this never happened with me. And we can change the log-format using key http-log-format & https-log-format

dcowden commented 5 years ago

This ended up having to do with using g the netcat container. When I changed to using an rsyalig container the problem disappeared.

git001 commented 5 years ago

I use for this http://smarden.org/socklog/ in my Dockerfile https://gitlab.com/aleks001/haproxy18-centos/blob/master/Dockerfile#L31-36 and configure haproxy to log there.

  && make -C /usr/src/lua linux test install \
  && mkdir /package && cd /package \
  && curl -sSO http://smarden.org/socklog/socklog-2.1.0.tar.gz \
  && tar xfvz socklog-2.1.0.tar.gz \
  && rm socklog-2.1.0.tar.gz \
  && cd admin/socklog-2.1.0 \
  && package/install \

https://gitlab.com/aleks001/haproxy18-centos/blob/master/containerfiles/container-entrypoint.sh#L92-93

echo "starting socklog"
/usr/local/bin/socklog unix /tmp/haproxy_syslog &

This makes the logs command usable for k8s and OpenShift and it prevents the requirement to install rsyslog with all the dependencies.

AlexProfi commented 5 years ago

Hello I have tis config for fluentd ( to collect logs from haproxy-ingress)

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
  namespace: kube-system
data:
  haproxy-syslog.input.conf: |-
    # Receive logs from HAProxy over syslog protocol
    <source>
      @id haproxy
      @type syslog
      port 5140
      bind 0.0.0.0
      tag haproxy
      <parse>
        @type syslog
        message_format rfc5424
        rfc5424_time_format %Y-%m-%dT%H:%M:%S%z
      </parse>
    </source>
    <filter haproxy.**>
      @type parser
      key_name message
      reserve_data true
      reserve_time true
      emit_invalid_record_to_error false
      <parse>
        @type multi_format
        <pattern>
          format regexp
          # Examples:
          # 64.180.112.143:51066 [09/Mar/2018:19:28:09.404] httpfront-staging.goformative.com~ formative-meteor-80/server0001 0/0/0/1/1 200 368 - - --NI 83/41/0/1/0 0/0 "GET /sockjs/info HTTP/1.1"
          # 64.180.112.143:58523 [09/Mar/2018:21:15:46.366] httpfront-dashboard.staging.kops.goformative.com~ kube-system-kubernetes-dashboard-80/server0000 0/0/0/1/1 200 178 - - ---- 3/1/0/1/0 0/0 "GET /api/appConfig.json HTTP/1.1"
          expression /^(?<remoteAddress>[\w\.]+:\d+) \[(?<requestDate>[^\]]*)\] httpfront-(?<domain>[\w-.]+)~ (?<namespace>kube-system|[\w]+)-(?<service>[\w-]+)(-[\d]+)?\/[\w-]+ (?<requestReadTime>\d+)\/(?<waitTime>\d+)\/(?<backendConnectTime>\d+)\/(?<backendResponseTime>\d+)\/(?<responseTime>\d+) (?<statusCode>\d+) (?<responseBytes>\d+) (?<reqCookie>[\w-]+) (?<resCookie>[\w-]+) (?<terminationState>[\w-]+) (?<actconn>\d+)\/(?<feconn>\d+)\/(?<beconn>\d+)\/(?<srvconn>\d+)\/(?<retries>\d+) (?<srvqueue>\d+)\/(?<backendQueue>\d+) "(?<method>[A-Z]+) (?<url>[^ ]+) (?<httpVersion>[^ ]+)"$/
        </pattern>
        <pattern>
          format regexp
          # Examples:
          # Connect from 172.20.59.142:13201 to 172.20.59.142:31916 (httpfront/HTTP)
          # Connect from 10.0.1.2:33312 to 10.0.3.31:8012 (www/HTTP)
          expression /^Connect from (?<remoteAddress>[\w\.]+:\d+) to (?<backendAddress>[\w\.]+:\d+) \((?<frontend>[\w]+)\/(?<mode>[\w]+)\)$/
        </pattern>
        <pattern>
          format regexp
          # Examples:
          # Server kube-system-fluentd-http-http-input/server0002 is going DOWN for maintenance. 3 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
          # Server kube-system-fluentd-http-http-input/server0001 is going DOWN for maintenance. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
          expression /^Server (?<namespace>kube-system|[\w]+)-(?<service>[\w-]+)\/[\w-]+ is going DOWN for maintenance. (?<remainingActive>\d+) active and (?<remainingBackup>\d+) backup servers left. (?<activeSessions>\d+) sessions active, (?<requeued>\d+) requeued, (?<remainingInQueue>\d+) remaining in queue.$/
        </pattern>
        <pattern>
          format regexp
          # Examples:
          # Server kube-system-fluentd-http-http-input/server0003 is DOWN, reason: Layer4 connection problem, info: \"Connection refused\", check duration: 0ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
          # Server kube-system-fluentd-http-http-input/server0003 is UP, reason: Layer4 check passed, check duration: 0ms. 3 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
          expression /^Server (?<namespace>kube-system|[\w]+)-(?<service>[\w-]+)\/[\w-]+ is (?<status>[\w]+), reason: (?<reason>[^,]+), (info: "(?<info>[^"]+)", )?check duration: (?<checkDuration>[^.]+). (?<remainingActive>\d+) active and (?<remainingBackup>\d+) backup servers (left|online). ((?<activeSessions>\d+) sessions active, )?(?<requeued>\d+) (sessions )?requeued, (?<remainingInQueue>\d+) (remaining|total) in queue.$/
        </pattern>
      </parse>
    </filter>

---

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
    version: v1
    kubernetes.io/cluster-service: "true"
spec:
  template:
    metadata:
      labels:
        k8s-app: fluentd-logging
        version: v1
        kubernetes.io/cluster-service: "true"
    spec:
      hostNetwork: true
      nodeSelector:
        role: edge-router
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd
        ports:
        - name: haproxy-syslog
          containerPort: 5140
          protocol: UDP
        image: fluent/fluentd-kubernetes-daemonset:v1.3-debian-syslog
        env:
          - name:  SYSLOG_HOST
            value: "sysloghost"
          - name:  SYSLOG_PORT
            value: "514"
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi

        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true

      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers

---

apiVersion: v1
kind: Service
metadata:
  name: haproxy-syslog
  namespace: kube-system
spec:
  ports:
  - name: haproxy-syslog
    port: 5140
    protocol: UDP
    targetPort: haproxy-syslog
  selector:
    k8s-app: fluentd-logging

And the error is:

[info]: parsing config file is succeeded path="/fluentd/etc/fluent.conf"
2019-01-25 15:38:23 +0000 [error]: config error file="/fluentd/etc/fluent.conf" error_class=Fluent::ConfigError error="Invalid Kubernetes API v1 endpoint https://10.96.0.1:443/api:
 Unauthorized"

How to fix?

dobesv commented 5 years ago

Your error isn't really related to haproxy, it's your fluentd config that is messed up.

AlexProfi commented 5 years ago

I know it. I ask man that post configs for fluentd

dobesv commented 4 years ago

Here's updated regexp with a little fix:

    <pattern>
      format regexp
      expression /^(?<remoteAddress>[\w\.]+:\d+) \[(?<requestDate>[^\]]*)\] (?<frontend>[\w.-]+)~ (?<backend>(?:(?<namespace>[^\/_]+)_)?(?<service>[^\/_]+)?(?:_(?<servicePort>[^\/_]+))?|[^\/ ]*)(?:\/(?<slot>[^ ]+)) (?<requestReadTime>\d+|-1)\/(?<waitTime>\d+|-1)\/(?<backendConnectTime>\d+|-1)\/(?<backendResponseTime>\d+|-1)\/(?<responseTime>\d+|-1) (?<statusCode>\d+|-1) (?<responseBytes>\d+|-1) (?<reqCookie>[^ ]+) (?<resCookie>[^ ]+) (?<terminationState>[^ ]+) (?<actconn>\d+)\/(?<feconn>\d+)\/(?<beconn>\d+)\/(?<srvconn>\d+)\/(?<retries>\d+) (?<srvqueue>\d+)\/(?<backendQueue>\d+) "(?<method>[A-Z]+) (?<url>[^ ]+) (?<httpVersion>[^ ]+)"$/
      types requestReadTime:integer,waitTime:integer,backendConnectTime:integer,backendResponseTime:integer,responseTime:integer,statusCode:integer,responseBytes:integer,actconn:integer,feconn:integer,beconn:integer,srvconn:integer,retries:integer,srvqueue:integer,backendQueue:integer
    </pattern>
LeeHampton commented 3 years ago

Could someone provide me the simplest way to get meaningful HA Proxy debug logging from this controller without external dependencies like fluentd? I have a simple POC I want to test without having to spin up a full observability stack.

dobesv commented 3 years ago

@LeeHampton Maybe https://github.com/jcmoraisjr/haproxy-ingress/issues/110#issuecomment-372081711 would be the way to go.

marjorg commented 3 years ago

@jcmoraisjr @dobesv Would this solution output the same info as providing controller.logs.enabled=true to Helm install/upgrade?

dxlr8r commented 3 months ago

@dcowden This ended up having to do with using g the netcat container. When I changed to using an rsyalig container the problem disappeared.

I had the same issue. But I have been using netcat for syslog -> stdout before without issues, so I investigated further. First, nc needs the -k flag, to keep listening, but even then most entries where dropped. So I figured it was the netcat in the image itself, and when I replaced it with a more current image it worked.

Here is what I did.

  1. add to your Helm values.yaml under controller.config: syslog-endpoint: "127.0.0.1:514"
  2. create postrender.sh
#!/bin/sh
export chart=$(cat)
export access_log_sc='{"name":"access-log","image":"chainguard/netcat","ports":[{"name":"udp","containerPort":514,"protocol":"UDP"}],"args":["-u", "-l","514", "-k"]}'
echo "$chart" | yq '. | select(.kind == "DaemonSet").spec.template.spec.containers[.containers | length] |= env(access_log_sc)' -
  1. make it executable chmod +x postrender.sh
  2. install yq if you do not have it
  3. when using helm install or helm upgrade --install add --post-renderer ./postrender.sh

I find this much better than the official sidecar that require root (uid 0) to run.

jcmoraisjr commented 1 month ago

@dobesv just added a few lines of doc regarding options to log to stdout, see #1138 . This can be done either placing haproxy as a sidecar, or configuring haproxy to log along with controller on a single container topology. No need for a special syslog sidecar anymore. Can we consider this issue fixed, or maybe do you have any improvement suggestion for the doc?

dobesv commented 1 month ago

Yes I think that addresses it.

Does that mean the controller is receiving the syslog messages and then printing them to stdout?

jcmoraisjr commented 4 weeks ago

When running in master-worker mode, we have a chance to configure the haproxy process to send logs to the same output the controller does, so in the end both controller and haproxy shares stdout and stderr of the controller container.