fluent / fluent-bit

Fast and Lightweight Logs and Metrics processor for Linux, BSD, OSX and Windows
https://fluentbit.io
Apache License 2.0
YAML configuration should be idiomatic #7593

Open stevehipwell opened 1 year ago

stevehipwell commented 1 year ago

Is your feature request related to a problem? Please describe. I'd like the Fluent Bit YAML configuration to be idiomatic as otherwise it's just another DSL.

Describe the solution you'd like In addition to the YAML format needing to support all of the potential configurations, I'd like the following points addressed.

  1. Make sure the YAML configuration is valid YAML (potentially StrictYAML)
  2. Use the correct types (boolean types should only accept valid YAML booleans)
  3. Use idiomatic camelcase keys

This is an example configuration using idiomatic YAML (there could be more nesting and the keys could be optimised e.g. parsersFile -> parsers).

env: []

service:
  daemon: false
  httpServer: true
  httpPort: 2020
  httpListen: 0.0.0.0

  parsersFile:
    - /fluent-bit/etc/parsers.conf
    - /fluent-bit/etc/extra-parsers.conf

  storage:
    path: /fluent-bit/data

inputs:
  - name: forward
    listen: 0.0.0.0
    port: 24224
    storage:
      type: filesystem

filters:
  - name: record_modifier
    match: syslog
    record:
      - powered_by calyptia

outputs:
  - name: stdout
    match: "*"
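Point 2 in the list above (boolean types should only accept valid YAML booleans) can be sketched as a small check. This is only an illustration in Python, not Fluent Bit code: `strict_yaml_bool` is a hypothetical helper, and the accepted spellings are those of the YAML 1.2 core schema. Values such as `on` or `off`, which often appear in classic-format configs, are rejected.

```python
# Spellings the YAML 1.2 core schema resolves to a boolean.
YAML_12_BOOLEANS = {
    "true": True, "True": True, "TRUE": True,
    "false": False, "False": False, "FALSE": False,
}


def strict_yaml_bool(scalar: str) -> bool:
    """Return the boolean for a YAML 1.2 boolean scalar, or raise ValueError."""
    try:
        return YAML_12_BOOLEANS[scalar]
    except KeyError:
        # Anything else (on/off, yes/no, 1/0, ...) is treated as a type error.
        raise ValueError(f"not a YAML boolean: {scalar!r}") from None
```

With a check like this, `daemon: false` is accepted while `daemon: off` fails early instead of being silently coerced.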

Describe alternatives you've considered We don't currently use YAML config due to this and use the classic version.

Additional context Having an idiomatic YAML configuration format will make Helm chart and general Kubernetes integration much simpler and safer.

edsiper commented 1 year ago

hi @stevehipwell , thanks for raising this issue.

Excuse my ignorance, but I definitely want to learn more about the technical blockers around idiomatic vs. non-idiomatic YAML. Is this just something around Helm and Kubernetes, or is it also a blocker in other types of environments that use YAML as their central config format?

Just to double check: if somebody writes HTTPListen, is that invalid, with httpListen being the right one?

stevehipwell commented 1 year ago

@edsiper YAML is case sensitive and idiomatically uses camelcase keys. If you're advertising YAML then it needs to actually be YAML and not a DSL which looks like YAML but isn't (the uncanny valley isn't a great place to be).

pwhelan commented 1 year ago

Adding support for property keys in camelCase alongside snake_case should be more than doable. Making it strictly compliant not so much.

At the moment the YAML parser uses an event-based API which constructs the internal configuration structures using the same API as the classic format. Validation is only done when the engine is finally invoked, so enforcing the use of the specific boolean values 'true' and 'false' would require quite a bit of work.
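As a rough illustration of accepting camelCase keys alongside snake_case ones, a key could be normalized before it is handed to the existing configuration API. This is a sketch in Python under that assumption, not the actual Fluent Bit implementation; `camel_to_snake` is a hypothetical name.

```python
import re

# Zero-width boundary between a lowercase letter or digit and an uppercase letter.
_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def camel_to_snake(key: str) -> str:
    """Convert a camelCase key to snake_case; snake_case keys pass through unchanged."""
    return _BOUNDARY.sub("_", key).lower()


print(camel_to_snake("httpServer"))  # http_server
print(camel_to_snake("keyValEq"))    # key_val_eq
print(camel_to_snake("log_level"))   # log_level
```

Under this particular rule `HTTPListen` becomes `httplisten`, not `http_listen`, so only the `httpListen` spelling would map onto the existing key.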

stevehipwell commented 1 year ago

@pwhelan supporting and only documenting idiomatic YAML would be "good enough".

However, is there no way of running YAML validation on the config and only triggering the next stage if it's valid?

pwhelan commented 1 year ago

"However is there no way of running YAML validation on the config and only triggering the next stage if it's valid?"

At the moment not really, at least not at that level. We do some checking of the syntax, it has to be valid YAML syntax and it has to have the general structure we are looking for, but checking the types for properties would require checking against the config_maps for each plugin. Without that step there is no way to differentiate between a string or a boolean for each plugin property.
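The type check described above might look something like the following Python sketch. `CONFIG_MAPS` is a hand-written, hypothetical stand-in for the per-plugin config_maps; the property names come from the example config earlier in this issue, and the types assigned to them are assumptions.

```python
# Hypothetical stand-in for per-plugin config_maps: property name -> expected type.
CONFIG_MAPS = {
    "forward": {"listen": str, "port": int},
}


def check_types(plugin: str, properties: dict) -> list:
    """Return a list of error strings for properties whose type does not match."""
    errors = []
    expected = CONFIG_MAPS.get(plugin, {})
    for key, value in properties.items():
        want = expected.get(key)
        if want is None:
            errors.append(f"{plugin}.{key}: unknown property")
        # Exact type comparison, so a bool is not accepted where an int is expected.
        elif type(value) is not want:
            errors.append(
                f"{plugin}.{key}: expected {want.__name__}, got {type(value).__name__}"
            )
    return errors
```

Without the map there is nothing to compare against, which is why a plain YAML syntax check cannot tell a string from a boolean for a given plugin property.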

patrick-stephens commented 1 year ago

One other thing to capture is that some of the other config files (e.g. parser configuration) also need a YAML format.

edsiper commented 1 year ago

FYI: #7879 is coming with the camelCase solution

stevehipwell commented 1 year ago

@edsiper did you add support for nesting keys like storage.metrics and setting duplicate keys as a list (see below)?

service:
  parsersFile:
    - /fluent-bit/etc/parsers.conf
    - /fluent-bit/etc/conf/custom-parsers.conf
  storage:
    metrics: true
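
One way to picture what is being asked for here: nested maps become dotted keys such as storage.metrics, and a list becomes the same key repeated once per item. The Python sketch below assumes that flat, classic-style keys are the target; `flatten` is a hypothetical helper and the thread does not confirm that Fluent Bit works this way.

```python
def flatten(section: dict, prefix: str = "") -> list:
    """Return (key, value) pairs: nested maps become dotted keys, lists repeat the key."""
    pairs = []
    for key, value in section.items():
        full = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse with the parent key as a dotted prefix.
            pairs.extend(flatten(value, prefix=f"{full}."))
        elif isinstance(value, list):
            # One pair per list item, all sharing the same key.
            pairs.extend((full, item) for item in value)
        else:
            pairs.append((full, value))
    return pairs


service = {
    "parsers_file": [
        "/fluent-bit/etc/parsers.conf",
        "/fluent-bit/etc/conf/custom-parsers.conf",
    ],
    "storage": {"metrics": True},
}
for key, value in flatten(service):
    print(key, value)
```

For the config above this yields two `parsers_file` entries followed by one `storage.metrics` entry.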

stevehipwell commented 7 months ago

@edsiper @patrick-stephens this issue needs reopening, as even the camelCase to snake_case support is broken. The config below is based on the smoke test, with minimal changes to just the keys and boolean values (which are the only part that works). I've tested it against the standard OCI image for v2 and v3.

---
# Translated according to https://github.com/fluent/fluent-bit/pull/4621
service:
    httpServer: true
    healthCheck: true
    logLevel: debug

pipeline:
    inputs:
        - name: random
          tag: test
          samples: 10

    filters:
        - name: lua
          match: test
          call: append_tag
          code: |
              function append_tag(tag, timestamp, record)
                 new_record = record
                 new_record["tag"] = tag
                 return 1, timestamp, new_record
              end

        - name: expect
          match: test
          keyExists: tag
          keyValEq: tag test
          action: exit

    outputs:
        - name: stdout
          match: test

Also none of the YAML native formatting works such as env: [] and nested objects.

Plus there is currently no support for configuring parsers in YAML according to the docs.

patrick-stephens commented 7 months ago

I think @pwhelan was looking at the parser support

stevehipwell commented 7 months ago

I'm not sure what happened with #7879?

github-actions[bot] commented 4 months ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

stevehipwell commented 4 months ago

@patrick-stephens could you remove the stale label?

stevehipwell commented 1 day ago

@edsiper bumping this as the 3.2.0 release claims "Complete YAML Support" which, while technically true, isn't much use when the idioms are so far away from the community idioms expected in the Kubernetes space. FYI the current Helm charts have "Complete YAML Support" for the FB classic configuration, because YAML can pretty much represent anything; it's the idioms which add the value and reduce cognitive load.

  1. Make sure the YAML configuration is valid YAML (potentially StrictYAML)
  2. Use the correct types (boolean types should only accept valid YAML booleans)
  3. Use idiomatic camelcase keys

The 3 items in the original issue above are still not resolved and I'd suggest that a fourth item should be added to include a JSON schema for the configuration to check for correctness.
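To make the schema suggestion concrete, here is a minimal sketch in Python. The schema fragment is purely illustrative (the property names are taken from the examples in this issue, their types are assumptions, and no official schema is implied), and `validate` is a hypothetical helper that handles only a tiny subset of JSON Schema; a real setup would use a full JSON Schema validator.

```python
# Illustrative schema fragment for the service section; not an official schema.
SERVICE_SCHEMA = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "httpServer": {"type": "boolean"},
        "httpPort": {"type": "integer"},
        "logLevel": {"type": "string"},
        "parsersFile": {"type": "array", "items": {"type": "string"}},
    },
}

_TYPES = {"object": dict, "array": list, "string": str, "boolean": bool, "integer": int}


def validate(value, schema, path="service"):
    """Check value against a small subset of JSON Schema; return error strings."""
    kind = schema.get("type")
    if kind is not None:
        ok = isinstance(value, _TYPES[kind])
        # bool is a subclass of int in Python, so exclude it for integers.
        if kind == "integer" and isinstance(value, bool):
            ok = False
        if not ok:
            return [f"{path}: expected {kind}"]
    errors = []
    if isinstance(value, dict):
        props = schema.get("properties", {})
        for key, item in value.items():
            if key in props:
                errors += validate(item, props[key], f"{path}.{key}")
            elif schema.get("additionalProperties") is False:
                errors.append(f"{path}.{key}: unexpected key")
    if isinstance(value, list) and "items" in schema:
        for i, item in enumerate(value):
            errors += validate(item, schema["items"], f"{path}[{i}]")
    return errors
```

With such a schema, `httpServer: "on"` and a misspelt key like `http_server` would both be reported before the config ever reaches the engine.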