manicole commented 5 years ago

In Kubernetes, I'd like to create a new metric, using @type prometheus, parsed from Fluentd pod logs.

Expected Behavior

Gather logs from Fluentd pod deployed in namespace opa: kubectl logs <fluentd-pod> -n opa
Parse and edit values via a ConfigMap to build metrics. These metrics are then exposed at the service endpoint, i.e. curl https://<fluentd-service-IP>:<fluentd-service-port>/metrics

Actual Behavior

After deploying the Fluentd container in a pod (as a sidecar of OPA), installing fluent-plugin-prometheus in this container, and deploying the custom ConfigMap (default endpoint configuration),

curl https://<fluentd-service-IP>:<fluentd-service-port>/metrics shows nothing. I can't figure out what I'm missing ...

Steps to Reproduce the Problem

Install Fluentd in a pod as a sidecar container of an OPA container in namespace opa :

    - name: fluentd
      image: fluent/fluentd
      resources:
        limits:
          memory: 200Mi
        requests:
          cpu: 100m
          memory: 200Mi
      env:
      - name: FLUENT_UID
        value: "0"
      volumeMounts:
      - name: varlog
        mountPath: /var/log
      - name: varlibdockercontainers
        mountPath: /var/lib/docker/containers
        readOnly: true
      - name: fluentd-opa
        mountPath: /fluentd/etc/

Install fluent-plugin-prometheus in Fluentd container :

kubectl exec -it <pod-name> -n opa -c fluentd -- /bin/sh
/  # gem install fluent-plugin-prometheus

Create configMap containing Fluentd configuration in opa namespace

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-opa
  namespace: opa
data:
  fluent.conf: |
    <source>
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.log.pos
      time_format %Y-%m-%dT%H:%M:%S.%NZ
      tag kubernetes.*
      format json
      read_from_head true
    </source>

    <filter kubernetes.var.log.containers.**opa**.log>
      @type prometheus
      <metric>
        name resp_status_counter
        type counter
        desc bla
        key $.log.resp_status
      </metric>
    </filter>

    <match kubernetes.var.log.containers.**opa**.log>
      @type prometheus
    </match>

💡 The key is $.log.resp_status because the logs Fluentd has to parse (which are actually forwarded OPA logs) have the form:

{
"log":"{
    "client_addr":"10.233.64.1:38700",
    "level":"info",
    "msg":"Sent response.",
    "req_id":12697,
    "req_method":"POST",
    "req_path":"/",
    "resp_body":"{
        "apiVersion":"admission.k8s.io/v1beta1",
        "kind":"AdmissionReview",
        "response":{
            "allowed":true
        }
    }",
    "resp_bytes":94,
    "resp_duration":3.421389,
    "resp_status":200,
    "time":"2019-09-20T13:40:11Z"
}",
"stream":"stderr"
}
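
Note that in a record of this shape the log value is a JSON-encoded string rather than a nested object, so a path such as $.log.resp_status has nothing to descend into until that string is decoded. A minimal Python sketch of the two-stage decoding, assuming a shortened, hypothetical record shaped like the one above (it only illustrates the data shape, not how Fluentd works internally):

```python
import json

# Hypothetical, shortened container log line: "log" holds a JSON *string*,
# not a nested object.
inner = {"level": "info", "msg": "Sent response.", "resp_status": 200}
line = json.dumps({"log": json.dumps(inner), "stream": "stderr"})

# First decode: the container log wrapper.
record = json.loads(line)
log_is_string = isinstance(record["log"], str)

# Second decode: the OPA log entry stored inside "log".
decoded = json.loads(record["log"])
resp_status = decoded["resp_status"]

print(log_is_string, resp_status)
```

Only after the second decode does resp_status become reachable as a field of its own.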

Curl the Fluentd service endpoint to see the supposedly newly created metric: curl https://<fluentd-service-IP>:<fluentd-service-port>/metrics (shows nothing).

Additional Info

Kubernetes v1.14, fluent-plugin-prometheus v1.6.0, Fluentd v1.3.2, prometheus-client v0.9.0

Ideas to solve the issue

Could it be a namespace problem? The Fluentd container is deployed in a pod in the opa namespace; the Fluentd service is deployed in the opa namespace; the Fluentd ServiceMonitor (CRD from the Prometheus operator) is deployed in the monitoring namespace and exposes the Fluentd service to Prometheus; the Fluentd ConfigMap is deployed in the opa namespace.
Could it be a mount problem? Running kubectl get configmaps -n opa fluentd-opa -o yaml

gives, among other information:

    openpolicyagent.org/policy-status: '{"status":"error","error":{"code":"invalid_parameter","message":"error(s)
      occurred while compiling module(s)","errors":[{"code":"rego_parse_error","message":"no
      match found","location":{"file":"opa/fluentd-opa/fluent.conf","row":14,"col":1},"details":{}}]}}'

Any help appreciated ! Thanks

manicole commented 5 years ago

Solutions found

After investigating again and again, here is what I understood (in case it helps anyone). If anything is wrong, please point it out.

The OPA error displayed when getting the ConfigMap (see above) does not seem to have any impact: the configuration is correctly taken into account after deploying the ConfigMap and redeploying OPA and its sidecars.
In a Fluentd conf, you usually specify the source within the <source> tags, the log filtering within the <filter> tags and the output within the ... (there is a trick) ... <match> tags.

With fluent-plugin-prometheus, the tags and types to use are different:

Specify @type prometheus within the filter tags.
Output the filtered result with a match tag of @type copy; inside it, create <store> tags of @type prometheus.
Expose the result on an HTTP address using the ... source tag :/

Here is an example :

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-opa
  namespace: opa
data:
  fluent.conf: |

    # get logs from /var/log/containers/
    <source>
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.log.pos
      time_format %Y-%m-%dT%H:%M:%S.%NZ
      tag kubernetes.*
      format json
      read_from_head true
    </source>

    # filter plugin for prometheus type
    # instruments metrics from records
    # does not modify the records themselves
    <filter kubernetes.var.log.containers.**opa**.log>
      @type prometheus
      <metric>
        name log_counter
        type counter
        desc The total number of logs
        key $.log
      </metric>
    </filter>

    # output plugin for prometheus type
    <match kubernetes.var.log.containers.**opa**.log>
      @type copy
      <store>
        @type prometheus
        <metric>
          name log_counter
          type counter
          desc The total number of logs
        </metric>
      </store>
    </match>

    # provides a metrics HTTP endpoint to be scraped by a Prometheus server
    # exposes custom and default metrics on all interfaces of the container (bind 0.0.0.0)
    <source>
      @type prometheus
      bind 0.0.0.0
      port 24224
      metrics_path /metrics
    </source>
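
When reading what this endpoint returns, it can help to separate metrics that are only declared (a # HELP line and nothing else) from metrics that have actually recorded samples. A minimal Python sketch, assuming the standard Prometheus text exposition format; the function name and the sample text are hypothetical and written by hand for illustration, not taken from a real scrape:

```python
def metrics_without_samples(exposition_text):
    """Return metric names that have a # HELP line but no sample line.

    Simplification: histogram and summary samples carry suffixes such as
    _sum and _count, which this sketch does not map back to the base name.
    """
    declared, sampled = set(), set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("# HELP "):
            # "# HELP <name> <description...>"
            declared.add(line.split()[2])
        elif not line.startswith("#"):
            # Sample line: "<name>{labels} <value>" or "<name> <value>"
            sampled.add(line.split("{")[0].split()[0])
    return sorted(declared - sampled)


# Hypothetical scrape output, for illustration only.
sample = """\
# TYPE log_counter counter
# HELP log_counter The total number of logs
# TYPE other_counter counter
# HELP other_counter Another metric
other_counter 3.0
"""

missing = metrics_without_samples(sample)
print(missing)
```

A name reported here is registered but has not been incremented or observed yet.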

Remaining questions

It doesn't seem to work quite the same way depending on the metric type (counter, gauge, histogram or summary). Within the store tags, when the key argument is repeated, the metric value is not displayed at the /metrics address (the description is displayed, but not the value). With the gauge or histogram types, the key argument is mandatory, but I never succeeded in making the value appear ... Is there any documentation available about this, or can anyone provide information?
How can I extract metrics from the logs? E.g. I have a response status as an HTTP code and I want to count all non-200 responses to compute their ratio to the total number of responses. How can I ask Fluentd, via its conf, to add 1 to a counter if the response code is not 200? Using Ruby? I tried record_transformer without success:
```
<filter kubernetes.var.log.containers.**opa**.log>
  @type record_transformer
  enable_ruby
  <record>
    log $.log
  </record>
  <record>
    resp_status $.log.resp_status
  </record>
  <record>
    err_percent ${record["resp_status"]=="200"?0:1 / record["log"]}
  </record>
</filter>
```
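
Independently of the Fluentd syntax, note that in Ruby the ternary operator binds more loosely than /, so in the expression above the division applies only to the 1 branch, and an integer status of 200 is not equal to the string "200". One way to think about the non-200 ratio is to count every response once, keyed by its status, and derive the ratio afterwards instead of computing it per record. A minimal Python sketch of that arithmetic; error_ratio and the sample statuses are hypothetical and only illustrate the counting, not Fluentd or Prometheus behaviour:

```python
from collections import Counter


def error_ratio(statuses):
    """Share of responses whose status is not 200 (0.0 when there are none)."""
    # One count per status value, similar in spirit to a counter with a
    # "status" label. int() normalises values that arrive as strings.
    by_status = Counter(int(s) for s in statuses)
    total = sum(by_status.values())
    errors = total - by_status[200]
    return errors / total if total else 0.0


# Hypothetical status codes, not real data.
sample = [200, 200, "500", 200, 403]
ratio = error_ratio(sample)
print(ratio)
```

With counts kept per status, the ratio can be computed wherever the counters are read, and the records themselves stay unchanged.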

Thank you !

manicole commented 4 years ago

After further investigation, I came up with this configuration, in case it helps anyone:

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-opa
  namespace: opa
data:
  fluent.conf: |

    # Get logs from /var/log/containers/
    <source>
      @type tail
      path /var/log/containers/*.log
      format json
      read_from_head false
      tag kubernetes.*
    </source>

    # Parse log entry to put everything under level "log" on top
    <filter kubernetes.var.log.containers.**opa_opa**.log>
      @type parser
      key_name log
      reserve_data false
      remove_key_name_field true
      ignore_key_not_exist true
      suppress_parse_error_log true
      <parse>
        @type json
      </parse>
    </filter>

    # Keep decision log only
    <filter kubernetes.var.log.containers.**opa_opa**.log>
      @type grep
      <and>
        <regexp>
          key $.req_method
          pattern /POST/
        </regexp>
        <regexp>
          key $.req_path
          pattern /\//
        </regexp>
        <regexp>
          key $.resp_status
          pattern \d+
        </regexp>
      </and>
    </filter>

    # Parse resp_body entry to put everything under level "resp_body" on top
    <filter kubernetes.var.log.containers.**opa_opa**.log>
      @type parser
      key_name resp_body
      reserve_data true
      remove_key_name_field true
      ignore_key_not_exist true
      <parse>
        @type json
      </parse>
    </filter>

    <filter kubernetes.var.log.containers.**opa_opa**.log>
      @type prometheus
      <metric>
        name opa_decisions_total
        type counter
        desc The total number of OPA decisions.
        # No key means increment counter for each record
      </metric>
      <metric>
        name opa_decisions_duration
        type summary
        desc The total number of OPA decisions
        key $.resp_duration
      </metric>
      <labels>
        status $.resp_status
      </labels>
    </filter>

    <match kubernetes.var.log.containers.**opa_opa**.log>
      @type copy
      <store>
        @type prometheus
        <metric>
          name opa_decisions_total
          type counter
          desc The total number of OPA decisions.
          # No key means increment counter for each record
        </metric>
        <metric>
          name opa_decisions_duration
          type summary
          desc The total number of OPA decisions
          key $.resp_duration
        </metric>
      </store>
      <store>
        @type stdout
      </store>
    </match>

    <source>
      @type prometheus
      bind 0.0.0.0
      port 24224
      metrics_path /metrics
    </source>
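
One detail worth keeping in mind about the grep filter in this configuration: its patterns are unanchored regular expressions, so /\// keeps any req_path that contains a slash, and \d+ keeps any resp_status that contains a digit. A small Python sketch of unanchored versus anchored matching; Python's re module is used here as a stand-in for Ruby's regex engine (the two agree on patterns this simple), and both the sample paths and the anchored variant are hypothetical:

```python
import re

unanchored = re.compile(r"/")   # same idea as the pattern /\// above
anchored = re.compile(r"^/$")   # hypothetical stricter variant: exactly "/"

# Hypothetical req_path values.
paths = ["/", "/health", "/v1/data"]

kept_unanchored = [p for p in paths if unanchored.search(p)]
kept_anchored = [p for p in paths if anchored.search(p)]

print(kept_unanchored)
print(kept_anchored)
```

If only requests to the root path are meant to count as decisions, an anchored pattern may be closer to the intent; if every POST should count, the unanchored form is sufficient.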

This configuration gave me the following metrics:

fluent / fluent-plugin-prometheus

[Kubernetes] Create new Prometheus-friendly metric from Fluentd pod logs #117