Closed freef4ll closed 1 year ago
Makes sense, but I don't think we're going to fix that. You have a manual workaround of changing permissions on the existing files to non-root user (see https://github.com/jaegertracing/jaeger/issues/4906#issuecomment-1849559579).
To play devil's advocate, how would you expect Jaeger to handle that? Changing to non-root user was the right choice from security perspective.
The other thing is we should call this out in the release notes as a breaking change and suggest remediation steps, like running chown - would you like to submit a PR for that?
The other thing is we should call this out in the release notes as a breaking change and suggest remediation steps, like running chown
Yes please. This is a breaking change, it should be in https://github.com/jaegertracing/jaeger/releases/tag/v1.50.0. I would submit a PR but i don't know what command i should run.
I think if the directory is added into the container and have permissions created for it like this:
FROM alpine
ARG USER_UID=10001
RUN mkdir /badger && chown ${USER_UID} /badger
USER ${USER_UID}
CMD ["/bin/ls", "-l", "/badger'"]
Then it works fine:
$ docker run -v badger-storage:/badger -ti test /bin/sh
~ $ cd /badger
/badger $ touch a
/badger $ ls -l
total 0
-rw-r--r-- 1 10001 root 0 Oct 31 09:10 a
Minimal PR created https://github.com/jaegertracing/jaeger/pull/4910; if you're happy with this then can update all other Dockerfiles.
updated changelog and release notes
thanks for reporting!
I'm not a Docker expert, how would I prevent this problem from occurring with my existing Docker-compose file?
services:
jaeger:
image: jaegertracing/all-in-one:latest
container_name: jaeger-demo
command: [ "--query.max-clock-skew-adjustment", "500ms" ]
environment:
- COLLECTOR_OTLP_ENABLED=true
- SPAN_STORAGE_TYPE=badger
- BADGER_EPHEMERAL=false
- BADGER_DIRECTORY_VALUE=/badger/data
- BADGER_DIRECTORY_KEY=/badger/key
volumes:
- jaeger_data:/badger
ports:
- "16686:16686" # UI
networks:
- default
volumes:
jaeger_data:
@miparnisari Have you found a way to fix this directly in the docker-compose file? I'm facing the same problem.
@t-pohl, no :( I just pinned the version of Jaeger.
@miparnisari I managed to upgrade using the following instructions:
Since version 1.50.0
of Jaeger (see https://github.com/jaegertracing/jaeger/releases/tag/v1.50.0) there is a breaking change which requires the external volume to be owned by a different user than root (which is the docker default). The user id is 10001. This requires to change the ownership of the volume folder on the host to the user 10001:
Example:
Find out the location of the jaeger volume on the host using docker volume inspect
docker volume ls
# Find the name of the volume
docker volume inspect traefik_jaeger_data
Start a root shell:
sudo bash
Change the volume ownwership:
cd /var/lib/docker/volumes/traefik_jaeger_data/ # Change to directory of volume on host
chown -R 10001:10001 _data/
Afterwards start jaeger and check the logs.
Maybe this can help you too.
Turns out you can set user: root
in docker compose to get it permissions working without having to chown
every time with an external script. I'm pretty sure that's not the best practice but for local development, that's dandy
I just tried an experiment, on MacOS with Docker Desktop (results could be different on different OS)
$ docker run \
-e SPAN_STORAGE_TYPE=badger \
-e BADGER_EPHEMERAL=false \
-e BADGER_DIRECTORY_VALUE=/badger/data \
-e BADGER_DIRECTORY_KEY=/badger/key \
-v $HOME/dev/badger-storage:/badger \
-p 16686:16686 \
jaegertracing/all-in-one:1.49
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
30eb40642743 jaegertracing/all-in-one:1.49 "/go/bin/all-in-one-…" 17 seconds ago Up 16 seconds 5775/udp, 5778/tcp, 14250/tcp, 6831-6832/udp, 14268/tcp, 0.0.0.0:16686->16686/tcp vigilant_brown
$ docker exec -it 30eb40642743 /bin/sh
/ # ps
PID USER TIME COMMAND
1 root 0:01 /go/bin/all-in-one-linux
16 root 0:00 /bin/sh
22 root 0:00 ps
/ # ls -lF /badger
total 0
drwx------ 5 root root 160 Feb 18 22:49 data/
drwx------ 6 root root 192 Feb 18 22:49 key/
/ # ls -lF /badger/data
total 1044
-rw-r--r-- 1 root root 2147483646 Feb 18 22:49 000001.vlog
-rw-r--r-- 1 root root 1048576 Feb 18 22:49 DISCARD
-rw-r--r-- 1 root root 2 Feb 18 22:49 LOCK
Here the container is running as root
and files are owned by root
$ docker run \
-e SPAN_STORAGE_TYPE=badger \
-e BADGER_EPHEMERAL=false \
-e BADGER_DIRECTORY_VALUE=/badger/data \
-e BADGER_DIRECTORY_KEY=/badger/key \
-v $HOME/dev/badger-storage:/badger \
-p 16686:16686 \
jaegertracing/all-in-one:1.53
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
422d701c2cc5 jaegertracing/all-in-one:1.53 "/go/bin/all-in-one-…" 3 seconds ago Up 2 seconds 4317-4318/tcp, 5775/udp, 5778/tcp, 9411/tcp, 14250/tcp, 14268/tcp, 6831-6832/udp, 0.0.0.0:16686->16686/tcp crazy_leakey
$ docker exec -it 422d701c2cc5 /bin/sh
/ $ ps
PID USER TIME COMMAND
1 10001 0:02 /go/bin/all-in-one-linux
16 10001 0:00 /bin/sh
22 10001 0:00 ps
/ $ ls -lF /badger
total 0
drwx------ 6 10001 root 192 Feb 18 22:51 data/
drwx------ 7 10001 root 224 Feb 18 22:51 key/
/ $ ls -lF /badger/data
total 1048
-rw-r--r-- 1 10001 root 20 Feb 18 22:51 000001.vlog
-rw-r--r-- 1 10001 root 2147483646 Feb 18 22:51 000002.vlog
-rw-r--r-- 1 10001 root 1048576 Feb 18 22:49 DISCARD
-rw-r--r-- 1 10001 root 2 Feb 18 22:51 LOCK
/ $ whoami
whoami: unknown uid 10001
/ $ touch /badger/data/test-file
/ $ ls -lF /badger/data/test-file
-rw-r--r-- 1 10001 root 0 Feb 18 22:53 /badger/data/test-file
Here the Jaeger process is running as 10001 and files are owned by 10001 (I am not sure how the latter happens). The directory is writable by that UID.
@yurishkuro
I enter the docker container(on rocky linux):
docker run -it -e SPAN_STORAGE_TYPE=badger -e BADGER_EPHEMERAL=false -e BADGER_DIRECTORY_VALUE=/badger/data -e BADGER_DIRECTORY_KEY=/badger/key -v /mnt/badger:/badger -p 16686:16686 --entrypoint "" jaegertracing/all-in-one:1.54 /bin/sh
all files' owned by root:root
. but whoami
show 10001.
@ivanallen did you try to chown it to the new user?
@yurishkuro
No, I didn't do anything. I executed the command in a completely new environment (rocky Linux). I tried it on my macOS and it was working fine. The problem may depend on the environment.
I temporarily rolled back to jaegertracing/all-in-one:1.49, which works fine in rocky Linux.
While you can chown existing dirs from earlier installations, how would you do that if you start from scratch? I may be missing something, but isn't that some kind of a hen-and-egg problem? Shouldn't there be a /bagder/data
directory which is already owned by 10001 in the container image from the beginning?
@MarkusGeigerDev if you are starting fresh without previous data there should be no issues at all. If you are still getting permission issues it's likely the target directory does not allow writes by the container irrespective of which user id is used inside the container.
Please provide reproducible example if you think otherwise, ie which OS, which containers runtime, which user it runs as, container command with volume mounting, and existing permissions on the parent directory on the host.
@yurishkuro Thanks for getting back to me! This is the relevant snippet from my docker-compose (running in WSL2/Windows)
jaeger:
image: jaegertracing/all-in-one:latest
command:
- "--badger.ephemeral=false"
- "--badger.directory-key=/badger/keys"
- "--badger.directory-value=/badger/values"
- "--badger.span-store-ttl=72h0m0s" # limit storage to 72hrs
environment:
- SPAN_STORAGE_TYPE=badger
volumes:
- jaeger_badger_data:/badger
ports:
- "16686:16686"
- "14250"
- "4317"
# user: root
volumes:
jaeger_badger_data:
The problem seems to be that the user 10001 can't even create the internal directories /badger/keys and /badger/values on startup:
Failed to init storage factory","error":"Error Creating Dir: \"/badger/keys\" error: mkdir /badger/keys: permission denied
When I set --badger.ephemeral=true
the service starts up (obviously) and I am able to docker exec -it
into it.
I looked around if there is any data path owned by 10001 already set up but I neither found one, nor was I able to create one.
When I remove the #-comment from the user: root
line, everything works fine - but of course that was only to prove my point, I really don't want to revert the otherwise good decision of running jaeger as non-root.
Did you create jaeger_badger_data
before running? What permissions does it have?
I would recommend running a shell inside the container and trying to create files (perhaps using different users), to see what's causing the permissions issue.
NB: this could be Windows specific.
# user: root
What if you try some other user that is palatable to Windows? The default 10001 in Jaeger Dockerfile is just "a suggestion", you don't have to run with it. But it would be nice to get to the bottom of this and add documentation.
@yurishkuro: I think I found a solution:
I added another service to my docker-compose that sets up everything before jaeger starts up:
version: "3.9"
services:
[...]
jaeger:
image: jaegertracing/all-in-one:latest
command:
- "--badger.ephemeral=false"
- "--badger.directory-key=/badger/data/keys"
- "--badger.directory-value=/badger/data/values"
- "--badger.span-store-ttl=72h0m0s" # limit storage to 72hrs
environment:
- SPAN_STORAGE_TYPE=badger
volumes:
- jaeger_badger_data:/badger
ports:
- "16686:16686"
- "14250"
- "4317"
depends_on:
init-jaeger:
condition: service_completed_successfully
init-jaeger:
user: root
image: busybox
command: "/bin/sh -c 'mkdir -p /badger/data && touch /badger/data/.initialized && chown -R 10001:10001 /badger/data'"
volumes:
- jaeger_badger_data:/badger
volumes:
jaeger_badger_data:
The init-jaeger service still runs as root (to be able to chown
something), but since it will be already terminated when jaeger starts up, it shouldn't add anything to the attack surface of the system. If you like the solution, feel free to add it to the docs. Perhaps you may want to exchange busybox for whatever operating system base layer jaeger is using to avoid the additional download.(reuse of the already pulled layer), but on the other hand, busybox isn't that big either.
@MarkusGeigerDev thanks! I suspect you can use the jaeger image itself, it has shell in it. Also, did you try to create just the /badger dir? I think it should be sufficient and less error prone (your solution seems to rely on some understanding of internal organization of badger sub-dirs)
@yurishkuro
I suspect you can use the jaeger image itself, it has shell in it.
Haha, you're right, this works as well :)
init-jaeger:
user: root
image: jaegertracing/all-in-one:latest
entrypoint: "/bin/sh -c 'mkdir -p /badger/data && touch /badger/data/.initialized && chown -R 10001:10001 /badger/data'"
volumes:
- jaeger_badger_data:/badger
Also, did you try to create just the /badger dir? I think it should be sufficient and less error prone (your solution seems to rely on some understanding of internal organization of badger sub-dirs)
/badger
is really just the mount point. It corresponds to the root dir in the volume, which I didn't want to change.
I have no further knowledge about badger, besides what I took from the jaeger CLI docs which seem to suggest that badger needs two directories, one for the keys and one for the values. Since I was too lazy to chown
those two individually, I introduced /data/
as a common root for both, keys
and values
, and used the --badger.directory-key
and --badger.directory-value
CLI switches to push badger that one level down.
EDIT:
Oh, BTW, I totally forgot to answer your other suggestions - sorry for that:
What if you try some other user that is palatable to Windows? The default 10001 in Jaeger Dockerfile is just "a suggestion", you don't have to run with it. But it would be nice to get to the bottom of this and add documentation.
You cannot easily map users from a Linux container to Windows users. Windows uses a very different technical scheme for users and groups called SIDs. For example, the "root" group ("Administrators", actually) is S-1-5-32-544
.
But in the end, all that doesn't matter, since we use the Linux dockerd
anyways. All containers physically run in the Windows Subsystem for Linux, on a real Linux kernel and with an EXT4 file system. That's why in my solution the chown -R 10001:10001
works.
Documented in #5282
@MarkusGeigerDev @yurishkuro I am getting the same error when doing the install using helm.
This is my values.yaml file -
query:
initContainers:
name: init-jaeger
image: jaegertracing/all-in-one:latest
command: ["/bin/sh", "-c"]
args: ["mkdir -p /badger/data && touch /badger/data/.initialized && chown -R 10001:10001 /badger/data"]
volumeMounts:
- name: jaeger-badger-data
mountPath: /badger
securityContext:
fsGroup: 10001
volumes:
- name: jaeger-badger-data
emptyDir: {}`
helm chart - https://raw.githubusercontent.com/hansehe/jaeger-all-in-one/master/helm/charts version - 0.1.12
I have tried using the root user as well and nothing changed.
Error details -
{"level":"fatal","ts":1715098835.9268656,"caller":"all-in-one/main.go:114","msg":"Failed to init storage factory","error":"Error Creating Dir: \"/badger/key\" error: mkdir /badger/key: permission denied","stacktrace":"main.main.func1\n\tgithub.com/jaegertracing/jaeger/cmd/all-in-one/main.go:114\ngithub.com/spf13/cobra.(*Command).execute\n\tgithub.com/spf13/cobra@v1.8.0/command.go:983\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\tgithub.com/spf13/cobra@v1.8.0/command.go:1115\ngithub.com/spf13/cobra.(*Command).Execute\n\tgithub.com/spf13/cobra@v1.8.0/command.go:1039\nmain.main\n\tgithub.com/jaegertracing/jaeger/cmd/all-
in-one/main.go:249\nruntime.main\n\truntime/proc.go:271"}`
What is the issue here?
What happened?
With the release of 1.50, documented badger default path is no longer writable as it is owned by root after the changes in https://github.com/jaegertracing/jaeger/pull/4783
Steps to reproduce
Expected behavior
Jaeger to start up
Relevant log output
No response
Screenshot
No response
Additional context
No response
Jaeger backend version
1.50
SDK
No response
Pipeline
No response
Stogage backend
No response
Operating system
No response
Deployment model
No response
Deployment configs
No response