jaegertracing / jaeger-clickhouse

Jaeger ClickHouse storage plugin implementation
Apache License 2.0

Dockerizing proposal #102

Open levonet opened 2 years ago

levonet commented 2 years ago

I have a number of proposals that I can make to this project:

  1. Two-stage build in Docker. This gives us a build in a reproducible environment.
  2. Optimize linking as much as possible. The image is intended for production use, and at high loads even minor optimizations save resources.
  3. Build the plugin along with the Jaeger source code. This lets us influence how the Jaeger build itself is optimized. We can use the build cache or saved Docker layers to speed up the build.
  4. Use Debian releases instead of the Alpine distribution. One of the optimizations is linking with system libraries. Alpine has limited multithreading functionality because it uses musl instead of glibc. But there is no problem supporting both distributions.
  5. Image versioning that includes the Jaeger version, the plugin version, and a label indicating that the container includes the plugin. The same approach is used by snyk. For example, the image will have the following tags:
    • ghcr.io/jaegertracing/jaeger-collector:1.29.0-clickhouse-0.8.0-stretch
    • ghcr.io/jaegertracing/jaeger-collector:1.29.0-clickhouse-0.8.0
    • ghcr.io/jaegertracing/jaeger-collector:1.29.0-clickhouse
    • ghcr.io/jaegertracing/jaeger-collector:clickhouse-0.8.0-stretch
    • ghcr.io/jaegertracing/jaeger-collector:clickhouse-0.8.0
    • ghcr.io/jaegertracing/jaeger-collector:clickhouse
  6. Provide a complete set of our own images: all-in-one, jaeger-agent, jaeger-collector, jaeger-ingester, jaeger-query.
  7. Run E2E tests using docker-compose (example).
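
The tagging scheme in item 5 can be sketched as a small Go helper. This is only an illustration of how the six tags relate to each other; the function name `imageTags` and its parameters are hypothetical and not part of the plugin or of any existing build tooling.

```go
package main

import "fmt"

// imageTags is a hypothetical helper that lists the tags proposed in item 5,
// from most specific to least specific, for one Jaeger version, one plugin
// version, and one base distribution name.
func imageTags(jaegerVer, pluginVer, distro string) []string {
	const label = "clickhouse" // marks the image as containing the plugin
	return []string{
		fmt.Sprintf("%s-%s-%s-%s", jaegerVer, label, pluginVer, distro),
		fmt.Sprintf("%s-%s-%s", jaegerVer, label, pluginVer),
		fmt.Sprintf("%s-%s", jaegerVer, label),
		fmt.Sprintf("%s-%s-%s", label, pluginVer, distro),
		fmt.Sprintf("%s-%s", label, pluginVer),
		label,
	}
}

func main() {
	// Repository name taken from the example tags in the proposal.
	const repo = "ghcr.io/jaegertracing/jaeger-collector"
	for _, tag := range imageTags("1.29.0", "0.8.0", "stretch") {
		fmt.Printf("%s:%s\n", repo, tag)
	}
}
```

The less specific tags (for example `clickhouse`) would presumably move with each release, while the fully qualified tag stays fixed.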

Part of the above is already implemented in this project: https://github.com/levonet/docker-jaeger. I'm ready to move this infrastructure over, and my team can support it for as long as we use Jaeger.

nickbp commented 2 years ago

For 3/5/6, the jaeger-clickhouse plugin is a standalone binary that Jaeger executes as a subprocess and then communicates with over gRPC. As a result of this design, the docker image really only needs to hold the jaeger-clickhouse binary itself, which can then be mounted into the Jaeger container's filesystem. For example, see the handling in jaeger-operator that sets up access to the binary using an initContainer. Switching to a model that includes both Jaeger and the plugin in the same image would require first updating this existing gRPC handling in jaeger-operator to support the combined image structure. There would also be a loss of independent versioning between the plugin itself and Jaeger in a production deployment, e.g. when testing changes to the plugin I am able to pair an arbitrary plugin version with an arbitrary Jaeger version.
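
To illustrate the subprocess model described above, here is a minimal Go sketch that assembles the command line a Jaeger 1.x binary would be started with when the plugin binary has been mounted into its filesystem. The `SPAN_STORAGE_TYPE=grpc-plugin` setting and the two `--grpc-storage-plugin.*` flags are the ones used for Jaeger's gRPC storage plugins as far as I know; the helper name `collectorCommand` and all file paths are assumptions made up for the example.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// collectorCommand is a hypothetical helper that builds (but does not start)
// a Jaeger process configured to launch the plugin binary as a subprocess.
func collectorCommand(jaegerBin, pluginBin, pluginConfig string) *exec.Cmd {
	cmd := exec.Command(jaegerBin,
		"--grpc-storage-plugin.binary="+pluginBin,
		"--grpc-storage-plugin.configuration-file="+pluginConfig,
	)
	// Select the gRPC plugin storage backend via the environment.
	cmd.Env = append(os.Environ(), "SPAN_STORAGE_TYPE=grpc-plugin")
	return cmd
}

func main() {
	// All paths are illustrative; in the initContainer setup the plugin path
	// would be wherever the shared volume is mounted.
	cmd := collectorCommand(
		"/opt/jaeger/jaeger-collector",
		"/plugin/jaeger-clickhouse",
		"/plugin/config.yaml",
	)
	fmt.Println(cmd.Args)
}
```

Because Jaeger only needs a path to the plugin binary, the plugin version can be swapped by changing what is mounted at that path, without rebuilding the Jaeger image.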

But there are still a couple of other issues in the existing structure that you point out:

  1. The current Dockerfile copies in a built binary that was created externally on the host, rather than via a controlled builder image. As you point out in item (1), this makes it harder to get reproducible builds.
  2. As you pointed out in item (4), once the binary is being built internally, it should probably use Debian rather than Alpine for the base images. The current use of Alpine doesn't matter much, since the binary is neither built nor run in that Alpine environment. If we were going to extremes, the main stage's base image could technically be scratch, except that it's often useful to keep the Linux utilities available for debugging etc. But once a builder stage has been added, we'd probably want to make sure that there isn't a risk of weird glibc vs musl conflicts when the imported plugin binary is being run from a Jaeger image.

So I think the main changes to address items 1 and 4 would be:

Does this make sense?