influxdata / influxdata-docker

Official docker images for the influxdata stack

Permission denied mounting telegraf.conf when upgrading past 1.29.4 #758

Closed: guitarpicker closed this issue 4 months ago

guitarpicker commented 4 months ago

I encountered an error when updating my Docker containers for telegraf to the latest images (1.31.0): "error loading config file /etc/telegraf/telegraf.conf: open /etc/telegraf/telegraf.conf: permission denied"

After some tests, I determined that the problem first appeared in 1.29.5. It doesn't matter if I run regular or alpine.

Here is some sample output with a minimal config file: 1.29.4 works, while 1.29.5 and the latest 1.31.0 both fail. Other versions after 1.29.4 give similar results.

# cat telegraf.conf.minimal
[[inputs.cpu]]
[[outputs.file]]
# docker run --rm -v ./telegraf.conf.minimal:/etc/telegraf/telegraf.conf telegraf:1.29.4
2024-07-01T22:33:27Z I! Loading config: /etc/telegraf/telegraf.conf
2024-07-01T22:33:27Z I! Starting Telegraf 1.29.4 brought to you by InfluxData the makers of InfluxDB
2024-07-01T22:33:27Z I! Available plugins: 241 inputs, 9 aggregators, 30 processors, 24 parsers, 60 outputs, 6 secret-stores
2024-07-01T22:33:27Z I! Loaded inputs: cpu
2024-07-01T22:33:27Z I! Loaded aggregators:
2024-07-01T22:33:27Z I! Loaded processors:
2024-07-01T22:33:27Z I! Loaded secretstores:
2024-07-01T22:33:27Z I! Loaded outputs: file
2024-07-01T22:33:27Z I! Tags enabled: host=74e1cec815ed
2024-07-01T22:33:27Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"74e1cec815ed", Flush Interval:10s
^C2024-07-01T22:33:34Z I! [agent] Hang on, flushing any cached metrics before shutdown
2024-07-01T22:33:34Z I! [agent] Stopping running outputs

# docker run --rm -v ./telegraf.conf.minimal:/etc/telegraf/telegraf.conf telegraf:1.29.5
2024-07-01T22:33:41Z I! Loading config: /etc/telegraf/telegraf.conf
2024-07-01T22:33:41Z E! error loading config file /etc/telegraf/telegraf.conf: open /etc/telegraf/telegraf.conf: permission denied

# docker run --rm -v ./telegraf.conf.minimal:/etc/telegraf/telegraf.conf telegraf:latest
2024-07-01T22:40:04Z I! Loading config: /etc/telegraf/telegraf.conf
2024-07-01T22:40:04Z E! error loading config file /etc/telegraf/telegraf.conf: open /etc/telegraf/telegraf.conf: permission denied

I'm running on RHEL8 using Docker from the EPEL repos. I have tried with SELinux off (setenforce 0) and see no difference. I've tried with and without the :ro mount flags as well.

Version info:

# docker images telegraf:1.29.[45]
REPOSITORY   TAG       IMAGE ID       CREATED        SIZE
telegraf     1.29.5    b4c163b7f1bd   3 weeks ago    455MB
telegraf     1.29.4    25bc88f8a418   4 months ago   452MB
# docker images telegraf:latest
REPOSITORY   TAG       IMAGE ID       CREATED       SIZE
telegraf     latest    055d4d585a83   3 weeks ago   467MB

# docker version
Client: Docker Engine - Community
 Version:           26.1.3
 API version:       1.45
 Go version:        go1.21.10
 Git commit:        b72abbb
 Built:             Thu May 16 08:34:39 2024
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          26.1.3
  API version:      1.45 (minimum version 1.24)
  Go version:       go1.21.10
  Git commit:       8e96db1
  Built:            Thu May 16 08:33:34 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.32
  GitCommit:        8b3b7ca2e5ce38e8f31a34f35b2b68ceb8470d89
 runc:
  Version:          1.1.12
  GitCommit:        v1.1.12-0-g51d5e94
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

powersj commented 4 months ago

Can you run stat telegraf.conf.minimal please?

guitarpicker commented 4 months ago

# stat telegraf.conf.minimal
  File: telegraf.conf.minimal
  Size: 32              Blocks: 8          IO Block: 4096   regular file
Device: fd00h/64768d    Inode: 67521547    Links: 1
Access: (0640/-rw-r-----)  Uid: (    0/    root)   Gid: (    0/    root)
Context: unconfined_u:object_r:admin_home_t:s0
Access: 2024-07-01 18:28:55.114532672 -0400
Modify: 2024-07-01 18:28:52.031495085 -0400
Change: 2024-07-01 18:28:52.034495122 -0400
 Birth: 2024-07-01 18:28:52.031495085 -0400

powersj commented 4 months ago

(0640/-rw-r-----) Uid: ( 0/ root) Gid: ( 0/ root)

I would not expect this to ever work ;) You are mounting this file into a container where Telegraf runs as the telegraf:telegraf user/group by default. The file is owned by root:root and "other" has no permissions, so the telegraf user cannot read it.

Other needs read permission (mode 644).
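
A minimal sketch of checking for this before starting the container. The helper name `other_can_read` is hypothetical, and it assumes GNU `stat` (the `-c` format option), as found on RHEL and most Linux distributions.

```shell
# Hypothetical helper: succeeds if the "other" permission class can read the file.
# Assumes GNU stat, where `stat -c %a FILE` prints the octal mode (e.g. 640).
other_can_read() {
    mode=$(stat -c '%a' "$1") || return 2
    other=$((mode % 10))        # last octal digit holds the "other" permissions
    [ $((other & 4)) -ne 0 ]    # value 4 is the read bit
}

# Example use (not run here):
#   other_can_read telegraf.conf.minimal || chmod o+r telegraf.conf.minimal
```

With the 0640 file from the report, the check fails and `chmod o+r` would bring the file to 0644.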

❯ cat config.toml 
[agent]
  debug = true
  omit_hostname = true

[[outputs.file]]
[[inputs.cpu]]

❯ chmod o-r config.toml
❯ stat config.toml 
  File: config.toml
  Size: 80          Blocks: 8          IO Block: 4096   regular file
Device: 0,46    Inode: 10289       Links: 1
Access: (0640/-rw-r-----)  Uid: ( 1000/ powersj)   Gid: ( 1000/ powersj)
Access: 2024-07-02 08:18:22.433456589 -0600
Modify: 2024-06-28 10:41:46.634869628 -0600
Change: 2024-07-02 08:18:30.043498474 -0600
 Birth: 2024-03-30 09:03:12.658034043 -0600
❯ docker run -it --rm -v ./config.toml:/etc/telegraf/telegraf.conf telegraf:latest
Unable to find image 'telegraf:latest' locally
latest: Pulling from library/telegraf
e9aef93137af: Pull complete 
58b365fa3e8d: Pull complete 
73640a629d5c: Pull complete 
7b734f825ca4: Pull complete 
5592f5a587d7: Pull complete 
b1345e4c69b0: Pull complete 
Digest: sha256:cb10bf03a9f426cf9e34dfb9d4c3bf2b0a7b2cc66b5a32267423ce69a8f96ac1
Status: Downloaded newer image for telegraf:latest
2024-07-02T14:19:13Z I! Loading config: /etc/telegraf/telegraf.conf
2024-07-02T14:19:13Z E! error loading config file /etc/telegraf/telegraf.conf: open /etc/telegraf/telegraf.conf: permission denied
❯ chmod o+r config.toml
❯ stat config.toml 
  File: config.toml
  Size: 80          Blocks: 8          IO Block: 4096   regular file
Device: 0,46    Inode: 10289       Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/ powersj)   Gid: ( 1000/ powersj)
Access: 2024-07-02 08:18:22.433456589 -0600
Modify: 2024-06-28 10:41:46.634869628 -0600
Change: 2024-07-02 08:19:23.427114538 -0600
 Birth: 2024-03-30 09:03:12.658034043 -0600
❯ docker run -it --rm -v ./config.toml:/etc/telegraf/telegraf.conf telegraf:latest
2024-07-02T14:19:28Z I! Loading config: /etc/telegraf/telegraf.conf
2024-07-02T14:19:28Z I! Starting Telegraf 1.31.1 brought to you by InfluxData the makers of InfluxDB
2024-07-02T14:19:28Z I! Available plugins: 234 inputs, 9 aggregators, 32 processors, 26 parsers, 60 outputs, 6 secret-stores
2024-07-02T14:19:28Z I! Loaded inputs: cpu
2024-07-02T14:19:28Z I! Loaded aggregators: 
2024-07-02T14:19:28Z I! Loaded processors: 
2024-07-02T14:19:28Z I! Loaded secretstores: 
2024-07-02T14:19:28Z I! Loaded outputs: file
2024-07-02T14:19:28Z I! Tags enabled: 
2024-07-02T14:19:28Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"", Flush Interval:10s
2024-07-02T14:19:28Z D! [agent] Initializing plugins
2024-07-02T14:19:28Z D! [agent] Connecting outputs
2024-07-02T14:19:28Z D! [agent] Attempting connection to [outputs.file]
2024-07-02T14:19:28Z D! [agent] Successfully connected to outputs.file
2024-07-02T14:19:28Z D! [agent] Starting service inputs
^C2024-07-02T14:19:29Z D! [agent] Stopping service inputs
2024-07-02T14:19:29Z D! [agent] Input channel closed
2024-07-02T14:19:29Z I! [agent] Hang on, flushing any cached metrics before shutdown
2024-07-02T14:19:29Z D! [outputs.file]  Buffer fullness: 0 / 10000 metrics
2024-07-02T14:19:29Z I! [agent] Stopping running outputs
2024-07-02T14:19:29Z D! [agent] Stopped Successfully

I also tried 1.29.4 with the file set back to 0640 and got the same error:

❯ stat config.toml 
  File: config.toml
  Size: 80          Blocks: 8          IO Block: 4096   regular file
Device: 0,46    Inode: 10289       Links: 1
Access: (0640/-rw-r-----)  Uid: ( 1000/ powersj)   Gid: ( 1000/ powersj)
Access: 2024-07-02 08:19:28.500473813 -0600
Modify: 2024-06-28 10:41:46.634869628 -0600
Change: 2024-07-02 08:21:57.587847554 -0600
 Birth: 2024-03-30 09:03:12.658034043 -0600
❯ docker run -it --rm -v ./config.toml:/etc/telegraf/telegraf.conf telegraf:1.29.4
Unable to find image 'telegraf:1.29.4' locally
1.29.4: Pulling from library/telegraf
7bb465c29149: Pull complete 
2b9b41aaa3c5: Pull complete 
c7c71dd3592a: Pull complete 
9140cc5510d6: Pull complete 
aab5bc94bab0: Pull complete 
6396348f0ac2: Pull complete 
Digest: sha256:d883b097fbbb1ed1db5fb1430a2d767ab72b423cf3cbb065bb274ff030d6311d
Status: Downloaded newer image for telegraf:1.29.4
2024-07-02T14:22:13Z I! Loading config: /etc/telegraf/telegraf.conf
2024-07-02T14:22:13Z E! error loading config file /etc/telegraf/telegraf.conf: open /etc/telegraf/telegraf.conf: permission denied

guitarpicker commented 4 months ago

Thanks. I'm not sure why these file permissions worked on my system in prior versions, but chmod 644 telegraf.conf.minimal got it working on all versions and, more importantly, on the actual config file I'm dealing with.

I guess I need to brush up on Docker's file abstraction and permissions between the host and guest. I'm glad it was something simple, although I'm still a bit confused about what triggered the change, as I would have expected this to break a couple of years ago.

Thanks for the quick response and resolution.
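
On the host/guest permissions question: absent user-namespace remapping, a bind mount shows the host file's numeric owner, group, and mode unchanged inside the container, so the ordinary Unix permission check applies to whatever uid/gid the container process runs as. A rough sketch of that check follows. The helper `can_read_as` is hypothetical, assumes GNU `stat`, and is deliberately simplified: it ignores supplementary groups, ACLs, and SELinux, and treats uid 0 as always allowed.

```shell
# Hypothetical helper: would a process with this numeric uid/gid be allowed to read FILE?
# Simplified model of the classic owner/group/other check (no ACLs, no
# supplementary groups, no SELinux; uid 0 is assumed to bypass the check).
can_read_as() {
    uid=$1 gid=$2 file=$3
    info=$(stat -c '%u %g %a' "$file") || return 2
    set -- $info
    owner=$1 group=$2 mode=$3

    [ "$uid" -eq 0 ] && return 0

    # Exactly one permission class applies: owner first, then group, then other.
    if [ "$uid" -eq "$owner" ]; then
        digit=$(( (mode / 100) % 10 ))
    elif [ "$gid" -eq "$group" ]; then
        digit=$(( (mode / 10) % 10 ))
    else
        digit=$(( mode % 10 ))
    fi
    [ $((digit & 4)) -ne 0 ]    # value 4 is the read bit
}
```

For a root:root file with mode 0640, this check passes for a process with gid 0 (group read) but fails for a process whose uid and gid are both non-root, which matches the "permission denied" seen for the telegraf user.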