Support out-of-band image updates for fluentd and fluentbit

fluent / fluent-operator

Operate Fluent Bit and Fluentd in the Kubernetes way - Previously known as FluentBit Operator

Apache License 2.0

Support out-of-band image updates for fluentd and fluentbit #1241

Closed joshuabaird closed 2 months ago

joshuabaird commented 2 months ago

Is your feature request related to a problem? Please describe.

We currently build and release fluentbit and fluentd images which track their upstream versions. For example, our ghcr.io/fluent/fluent-operator/fluent-bit:v3.10 image is based off of the fluent/fluent-bit:3.1.2 image.

Our images often contain customizations; for example, our fluentd image installs several fluentd plugins.

Presently, if we want to change these custom artifacts (e.g., update a plugin), we have to build a new image that overwrites an existing tag. This is troublesome for a few reasons; mainly, our users don't expect potentially breaking changes between images when they are pinned to a specific tag (e.g., ghcr.io/fluent/fluent-operator/fluent-bit:v3.10).

Describe the solution you'd like

We need a way to build and release new images without overwriting our existing tags. One option would be to introduce a "patch version" in our versioning (e.g., fluent-bit:v3.10.1). In this case, the image would still be based off of fluent/fluent-bit:3.10, but would include any custom changes (e.g., plugin upgrades) that we wanted to include in the image.

@benjaminhuo @wenchajun Do you have any suggestions or opinions on this?

Additional context

No response

benjaminhuo commented 2 months ago

I agree. A few comments:

- ghcr.io/fluent/fluent-operator/fluent-bit:v3.1.2 (not v3.10) is based off fluent/fluent-bit:3.1.2
- Let's say fluentbit has v3.1.10; then fluent operator's custom image should be called v3.1.10.20240506, v3.1.10.xx, v3.1.10+20240506, or v3.1.10-20240506

This requires corresponding changes to the GitHub workflow that builds the images.

cc @wanjunlei

joshuabaird commented 2 months ago

Ah yes - sorry, the tag was a typo. ghcr.io/fluent/fluent-operator/fluent-bit:v3.1.2 is based off of fluent/fluent-bit:3.1.2.

I agree about the versioning pattern. I'll try to get started on a PR soon to address this.

joshuabaird commented 2 months ago

As for the tagging logic, here is what I'm thinking:

Maintainer specifies 3.1.3 as the docker_tag_version in the workflow
If ghcr.io/fluent/fluent-operator/fluent-bit:3.1.3 already exists
- Do not tag 3.1.3 (we don't want to overwrite the existing image)
- Tag 3.1.3-XX (auto-incrementing patch version)
- Tag 3.1
- Tag latest
If ghcr.io/fluent/fluent-operator/fluent-bit:3.1.3 does not already exist
- Tag 3.1.3
- Tag 3.1.3-01
- Tag 3.1
- Tag latest

Any thoughts on this logic? Need to look into how we may implement auto-incrementing the patch tag when necessary.
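
The tagging logic above, including the auto-incrementing suffix, can be sketched roughly as follows. This is only an illustration, not the workflow's actual implementation: the function name `compute_tags` and the `existing_tags` argument are hypothetical, and it assumes the set of tags already published in the registry has been fetched by some other step (how to list them is not covered here).

```python
import re


def compute_tags(version: str, existing_tags: set[str]) -> list[str]:
    """Return the tags to push for `version` without overwriting `version` itself.

    `existing_tags` is assumed to hold every tag already published for the image.
    """
    # "3.1.3" -> "3.1"
    major_minor = ".".join(version.split(".")[:2])

    # Find suffixes already used for this exact version, e.g. "3.1.3-01" -> 1.
    pattern = re.compile(rf"^{re.escape(version)}-(\d+)$")
    used = [int(m.group(1)) for t in existing_tags if (m := pattern.match(t))]
    next_suffix = max(used, default=0) + 1

    tags = []
    if version not in existing_tags:
        # First build for this upstream version: safe to publish the bare tag.
        tags.append(version)
    # Always publish a new, unique suffixed tag (zero-padded to two digits).
    tags.append(f"{version}-{next_suffix:02d}")
    # The floating tags are moved on every build.
    tags.append(major_minor)
    tags.append("latest")
    return tags


if __name__ == "__main__":
    print(compute_tags("3.1.3", set()))
    print(compute_tags("3.1.3", {"3.1.3", "3.1.3-01", "3.1", "latest"}))
```

With an empty registry this yields 3.1.3, 3.1.3-01, 3.1 and latest; once 3.1.3 and 3.1.3-01 exist, it yields 3.1.3-02, 3.1 and latest, leaving the existing 3.1.3 image untouched.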

benjaminhuo commented 2 months ago

cc @wenchajun @wanjunlei @sarathchandra24 @markusthoemmes what do you think?

joshuabaird commented 2 months ago

PR to implement the logic described above: https://github.com/fluent/fluent-operator/pull/1246

sarathchandra24 commented 2 months ago

LGTM. I thought the x in 3.1.x represented our patch version; I didn't realize it is there to stay in sync with upstream. I agree with the approach, since we want to maintain that sync.