docker-archive / infra-container_exporter

Prometheus exporter exposing container metrics

No stats on centos 7 #1

Open LordFPL opened 9 years ago

LordFPL commented 9 years ago

Hello,

I have a problem on CentOS 7: all stats are null. I tried launching with "-v /sys/fs/cgroup:/cgroup" and with no parent. Here are two tests:

./container-exporter -parent='/inexistent_path'
2015/02/02 10:22:44 Starting Server: :8080

No problem... all containers are displayed on :8080... but no values... (all are "0")

And now without a leading "/":

./container-exporter -parent='inexistent_path'
2015/02/02 10:26:40 Starting Server: :8080
2015/02/02 10:26:42 Error reading container stats: open /cgroup/cpuacct/system.slice/docker-7543eec9fafc0ba3bd1892b684ddab2af0e760debb893b61d9da1e3736e8f2af.scope/inexistent_path/7543eec9fafc0ba3bd1892b684ddab2af0e760debb893b61d9da1e3736e8f2af/cpuacct.stat: no such file or directory

Why is the container ID appended again to "/cgroup/cpuacct/system.slice/docker-7543eec9fafc0ba3bd1892b684ddab2af0e760debb893b61d9da1e3736e8f2af.scope/{$parent}/"?

Thx in advance for any clue/answer :)

dwitzig commented 9 years ago

Hi, did you have any luck with this? I get the same on both Ubuntu and CoreOS, Docker version 1.4.1.

My docker run command is:

docker run -p 9104:9104 -v /sys/fs/cgroup:/cgroup -v /var/run/docker.sock:/var/run/docker.sock prom/container-exporter

When I check :9104/metrics, all values are 0. For example:

# TYPE container_cpu_throttled_time_seconds_total counter
container_cpu_throttled_time_seconds_total{id="644d52456c158ded25f20407902b7af8ac26d3e3ee116eb2a53485911a62cce9",image="prom/container-exporter:latest",name="pensive_newton"} 0
container_cpu_throttled_time_seconds_total{id="bad19b4be14026219169a9ddc242dc4cd67a9a2f79676bfc7fe984b2ea2fc4e7",image="sameersbn/redmine:2.5.3",name="redmine"} 0
container_cpu_throttled_time_seconds_total{id="d783dc9d5bbb1e8be0c88efcf45867e528b1655ababfebfc55ce768e6adbf4c9",image="sameersbn/mysql:latest",name="mysql"} 0
# HELP container_cpu_usage_seconds_total Total seconds of cpu time consumed.
# TYPE container_cpu_usage_seconds_total counter
container_cpu_usage_seconds_total{id="644d52456c158ded25f20407902b7af8ac26d3e3ee116eb2a53485911a62cce9",image="prom/container-exporter:latest",name="pensive_newton",type="kernel"} 0
container_cpu_usage_seconds_total{id="644d52456c158ded25f20407902b7af8ac26d3e3ee116eb2a53485911a62cce9",image="prom/container-exporter:latest",name="pensive_newton",type="user"} 0
container_cpu_usage_seconds_total{id="bad19b4be14026219169a9ddc242dc4cd67a9a2f79676bfc7fe984b2ea2fc4e7",image="sameersbn/redmine:2.5.3",name="redmine",type="kernel"} 0
container_cpu_usage_seconds_total{id="bad19b4be14026219169a9ddc242dc4cd67a9a2f79676bfc7fe984b2ea2fc4e7",image="sameersbn/redmine:2.5.3",name="redmine",type="user"} 0
container_cpu_usage_seconds_total{id="d783dc9d5bbb1e8be0c88efcf45867e528b1655ababfebfc55ce768e6adbf4c9",image="sameersbn/mysql:latest",name="mysql",type="kernel"} 0
container_cpu_usage_seconds_total{id="d783dc9d5bbb1e8be0c88efcf45867e528b1655ababfebfc55ce768e6adbf4c9",image="sameersbn/mysql:latest",name="mysql",type="user"} 0

LordFPL commented 9 years ago

No luck, sorry... but I haven't had enough time to try to understand the code :(

discordianfish commented 9 years ago

Hi @LordFPL and @dwitzig,

I've just updated the container_exporter to the latest libcontainer version. Could you check whether you can still reproduce this issue? It's probably due to systemd causing Docker to use a different cgroup path.
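Roughly, the two layouts look like this (illustrative paths, with <id> standing in for the full container ID):

without systemd: /sys/fs/cgroup/cpuacct/docker/<id>/cpuacct.stat
with systemd: /sys/fs/cgroup/cpuacct/system.slice/docker-<id>.scope/cpuacct.stat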

dwitzig commented 9 years ago

Hi discordianfish, thanks for the update. Unfortunately it did not fix the issue; most container_* metrics still show as 0. You can see the results at http://104.236.149.72:9104/metrics. I made sure to run "docker pull prom/container-exporter" first. Tested on both CoreOS & Ubuntu.

skelethan commented 9 years ago

Hi, I have the same issue on CentOS 6.6, Docker 1.4.1

A couple interesting data points:

  1. cgroup is located at /cgroup instead of /sys/fs/cgroup. Easily adjusted, but you mentioned the cgroup location in another comment:
docker run -d --name pce -p 10.10.10.42:9104:9104 -v /cgroup:/cgroup -v /var/run/docker.sock:/var/run/docker.sock prom/container-exporter
  2. The exporter is aware of which containers are running, but all metrics are still 0. I.e. I have prom/prometheus, prom/container-exporter & tomcat:8.0 all running and all show up at 10.10.10.42:9104/metrics (see below).
  3. Finally, the container_last_seen metric has a strange value for all 3 containers:
container_last_seen{id="6ad10c950a875852c939485056de994642325ff6a1f60b987b7d0833bd0c1d7d",image="prom/prometheus:latest",name="prometheus"} 1.424300325e+09
container_last_seen{id="83d949b970de905518c1a0a8240d85192878a2d2e03dc7d5131b49b04e2fa711",image="prom/container-exporter:latest",name="pce"} 1.424300325e+09
container_last_seen{id="bb4df633645e5c0572f797c59451e399a51aadd50edd25544f3716a7fd6b16fc",image="tomcat:8.0",name="tomcat"} 1.424300325e+09

Let me know what I can do to help investigate. Cheers!

discordianfish commented 9 years ago

I bet you all are using systemd, right? Docker uses different cgroup path prefixes in that case. We could filter them as well, but it's probably better to use the newly introduced stats available via the API instead. I just don't have time to implement that right now, but PRs are welcome. Otherwise I'll probably get to it sometime next week.
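For reference, a quick way to sanity-check those API stats by hand (a sketch, assuming Docker >= 1.5 and substituting a real container name for <container-name>):

docker stats <container-name>

That streams CPU, memory and network usage; the raw endpoint behind it is GET /containers/<id>/stats on the Docker socket.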

skelethan commented 9 years ago

Interesting. I don't see systemd in the process tree, but perhaps Docker uses it in some way I don't understand.

I assume you mean the API introduced in Docker 1.5, which seems like a good forward-facing enhancement! That will not solve the issue for those stuck on prior versions (e.g. RHEL is very slow at certifying & releasing updates - still stuck on 1.3.2). It may not be a material demographic, but it's worth mentioning.

I wish I could help with a PR but I am not yet fluent in Go. Will look forward to an update when you get to it. Thanks.

discordianfish commented 9 years ago

Hi everyone,

Can you try again? I just added something to 'guess' the cgroup paths correctly; hopefully that fixes the issue.

If not, please check where cgroups are mounted on your system (mount|grep cgroup), then do a find /path/to/cgroup and paste the results somewhere.

skelethan commented 9 years ago

Hi, I tried again, but I see the same results. I am not seeing the update on Docker Hub or GitHub - am I missing something?

Docker Hub = updated 3 days, 21 hours ago; GitHub = authored 4 days ago

skelethan commented 9 years ago

A couple of interesting data points:

Docker 1.4.1 on CentOS 6.6 - strangely, no cgroups are mounted:

[root@localhost vagrant]# mount|grep cgroup
[root@localhost vagrant]# find / -type d -name "cgroup"
/usr/src/kernels/2.6.32-504.8.1.el6.x86_64/include/config/cgroup
/cgroup

Docker 1.5 on Ubuntu 14.04 - same issue of no metrics, but cgroups are mounted at /sys/fs/cgroup as expected:

root@vagrant-ubuntu-trusty-64:~# mount|grep cgroup
none on /sys/fs/cgroup type tmpfs (rw)
systemd on /sys/fs/cgroup/systemd type cgroup (rw,noexec,nosuid,nodev,none,name=systemd)
root@vagrant-ubuntu-trusty-64:~# find / -type d -name "cgroup"
/usr/src/linux-headers-3.13.0-40-generic/include/config/cgroup
/usr/src/linux-headers-3.13.0-40/tools/cgroup
/sys/fs/cgroup
/cgroup
/var/lib/docker/aufs/diff/48aff7db7ae258d93771cbc363325d5755dcdde8912eb979371f7b2f6dc84ff4/cgroup
/var/lib/docker/aufs/mnt/48aff7db7ae258d93771cbc363325d5755dcdde8912eb979371f7b2f6dc84ff4/cgroup

discordianfish commented 9 years ago

@jonathanmhamilton Sorry, forgot to push the changes. Try again.

Ubuntu looks good but CentOS should have cgroups mounted somewhere as well. You're sure that docker even works there? ;)

If things still don't work, send me the output of 'find /sys/fs/cgroup' from the Ubuntu system. As for the CentOS system, I have no clue why it's missing the mount.

skelethan commented 9 years ago

Ubuntu looks good but CentOS should have cgroups mounted somewhere as well. You're sure that docker even works there? ;)

I know! I don't like to see "magic" in technology either =) While I don't know how Docker works there, I do know that it does appear to be working...

Pulled the project & tried to run docker build from my Ubuntu VM so that I could test the fix. It's complaining about some sort of dependency. Is there anything that I need to pull or set locally (other than the master branch, of course) before running this particular Dockerfile?

Step 6 : RUN go get -d && go build
 ---> Running in 9ee5cded063e
package github.com/docker/libcontainer/cgroups/fs
        imports github.com/docker/docker/pkg/mount
        imports github.com/fsouza/go-dockerclient
        imports github.com/Sirupsen/logrus
        imports github.com/golang/glog
        imports github.com/google/cadvisor/container/libcontainer
        imports github.com/coreos/go-systemd/dbus
        imports github.com/godbus/dbus
        imports github.com/syndtr/gocapability/capability
        imports github.com/docker/libcontainer/network
        imports github.com/docker/libcontainer/network
        imports github.com/docker/libcontainer/network: cannot find package "github.com/docker/libcontainer/network" in any of:
        /usr/src/go/src/github.com/docker/libcontainer/network (from $GOROOT)
        /gopath/src/github.com/docker/libcontainer/network (from $GOPATH)
        /gopath/src/github.com/docker-infra/container-exporter/_vendor/src/github.com/docker/libcontainer/network
INFO[0143] The command [/bin/sh -c go get -d && go build] returned a non-zero code: 1

dwitzig commented 9 years ago

I also ran into an issue when trying to build.

Step 6 : RUN go get -u -d && go build
 ---> Running in 23d3ed1ed0f5
package github.com/docker-infra/container-exporter: github.com/docker-infra/container-exporter is a custom import path for https://github.com/docker-infra/container-exporter, but /go/src/github.com/docker-infra/container-exporter is checked out from https://github.com/docker-infra/container_exporter.git
INFO[0000] The command [/bin/sh -c go get -u -d && go build] returned a non-zero code: 1

If I build directly from the Dockerfile on GitHub, I get this error:

Step 6 : RUN go get -u -d && go build
 ---> Running in 80366c2b06d5
package github.com/docker-infra/container-exporter: directory "/go/src/github.com/docker-infra/container-exporter" is not using a known version control system
INFO[0004] The command [/bin/sh -c go get -u -d && go build] returned a non-zero code: 1

dwitzig commented 9 years ago

The build issue is caused by incorrect paths in docker and cadvisor. I have created issues in both repos: https://github.com/docker/docker/issues/10976 and https://github.com/google/cadvisor/issues/533

discordianfish commented 9 years ago

@dwitzig Actually it's caused by me importing two different libcontainer versions by accident. Looks like my vim-go added this. Anyway, in the meantime there have again been breaking changes, and I don't have time to keep up with them right now. I'll try to allocate some time in the next few days to convert all this.

discordianfish commented 9 years ago

Ok, I've updated it to build against the most recent libcontainer version. Can you please try to reproduce your problems?

skelethan commented 9 years ago

Just pulled down the new containers & restarted but am seeing the same issue =( Ubuntu 14.04 / Docker 1.5

Are there any other logs or data points that I can provide? Is there any "trigger" that needs to be hit for the container metrics to be registered? It's still strange to me that "container last seen" is picking up info but none of the other metrics are. Even if I start up a new container after container-exporter is running, it picks up that container right away in "container last seen".

# HELP container_cpu_throttled_periods_total Number of periods with throttling.
# TYPE container_cpu_throttled_periods_total counter
container_cpu_throttled_periods_total{id="3a4040b6b7214fd0dbe67d6ddbe208323e0b9f0dc94621c935346649ae18ba2e",image="prom/container-exporter:latest",name="pce",state="throttled"} 0
container_cpu_throttled_periods_total{id="3a4040b6b7214fd0dbe67d6ddbe208323e0b9f0dc94621c935346649ae18ba2e",image="prom/container-exporter:latest",name="pce",state="total"} 0
container_cpu_throttled_periods_total{id="71d166899660fc2ec7d48ae444fb8f7a4d4496fabe9708ca88ee87b577c13818",image="prom/promdash:latest",name="promdash",state="throttled"} 0
container_cpu_throttled_periods_total{id="71d166899660fc2ec7d48ae444fb8f7a4d4496fabe9708ca88ee87b577c13818",image="prom/promdash:latest",name="promdash",state="total"} 0
container_cpu_throttled_periods_total{id="bc62a18f03c726b4b26f4a666a9592f160a26829be3ae184e73a5b75efe59ee0",image="mysql:5.7",name="pd_mysql",state="throttled"} 0
container_cpu_throttled_periods_total{id="bc62a18f03c726b4b26f4a666a9592f160a26829be3ae184e73a5b75efe59ee0",image="mysql:5.7",name="pd_mysql",state="total"} 0
container_cpu_throttled_periods_total{id="de1f3c07e52b71eba5a70b51dfb80ee670b0e37a0debb36a3d5c474de2517720",image="busybox:latest",name="focused_blackwell",state="throttled"} 0
container_cpu_throttled_periods_total{id="de1f3c07e52b71eba5a70b51dfb80ee670b0e37a0debb36a3d5c474de2517720",image="busybox:latest",name="focused_blackwell",state="total"} 0
container_cpu_throttled_periods_total{id="e42698653ef69273860bad9925e89400f0e14a61481b35a71edb7843472273dd",image="prom/prometheus:latest",name="prometheus",state="throttled"} 0
container_cpu_throttled_periods_total{id="e42698653ef69273860bad9925e89400f0e14a61481b35a71edb7843472273dd",image="prom/prometheus:latest",name="prometheus",state="total"} 0
# HELP container_cpu_throttled_time_seconds_total Aggregate time the container was throttled for in seconds.
# TYPE container_cpu_throttled_time_seconds_total counter
container_cpu_throttled_time_seconds_total{id="3a4040b6b7214fd0dbe67d6ddbe208323e0b9f0dc94621c935346649ae18ba2e",image="prom/container-exporter:latest",name="pce"} 0
container_cpu_throttled_time_seconds_total{id="71d166899660fc2ec7d48ae444fb8f7a4d4496fabe9708ca88ee87b577c13818",image="prom/promdash:latest",name="promdash"} 0
container_cpu_throttled_time_seconds_total{id="bc62a18f03c726b4b26f4a666a9592f160a26829be3ae184e73a5b75efe59ee0",image="mysql:5.7",name="pd_mysql"} 0
container_cpu_throttled_time_seconds_total{id="de1f3c07e52b71eba5a70b51dfb80ee670b0e37a0debb36a3d5c474de2517720",image="busybox:latest",name="focused_blackwell"} 0
container_cpu_throttled_time_seconds_total{id="e42698653ef69273860bad9925e89400f0e14a61481b35a71edb7843472273dd",image="prom/prometheus:latest",name="prometheus"} 0
# HELP container_cpu_usage_seconds_total Total seconds of cpu time consumed.
# TYPE container_cpu_usage_seconds_total counter
container_cpu_usage_seconds_total{id="3a4040b6b7214fd0dbe67d6ddbe208323e0b9f0dc94621c935346649ae18ba2e",image="prom/container-exporter:latest",name="pce",type="kernel"} 0
container_cpu_usage_seconds_total{id="3a4040b6b7214fd0dbe67d6ddbe208323e0b9f0dc94621c935346649ae18ba2e",image="prom/container-exporter:latest",name="pce",type="user"} 0
container_cpu_usage_seconds_total{id="71d166899660fc2ec7d48ae444fb8f7a4d4496fabe9708ca88ee87b577c13818",image="prom/promdash:latest",name="promdash",type="kernel"} 0
container_cpu_usage_seconds_total{id="71d166899660fc2ec7d48ae444fb8f7a4d4496fabe9708ca88ee87b577c13818",image="prom/promdash:latest",name="promdash",type="user"} 0
container_cpu_usage_seconds_total{id="bc62a18f03c726b4b26f4a666a9592f160a26829be3ae184e73a5b75efe59ee0",image="mysql:5.7",name="pd_mysql",type="kernel"} 0
container_cpu_usage_seconds_total{id="bc62a18f03c726b4b26f4a666a9592f160a26829be3ae184e73a5b75efe59ee0",image="mysql:5.7",name="pd_mysql",type="user"} 0
container_cpu_usage_seconds_total{id="de1f3c07e52b71eba5a70b51dfb80ee670b0e37a0debb36a3d5c474de2517720",image="busybox:latest",name="focused_blackwell",type="kernel"} 0
container_cpu_usage_seconds_total{id="de1f3c07e52b71eba5a70b51dfb80ee670b0e37a0debb36a3d5c474de2517720",image="busybox:latest",name="focused_blackwell",type="user"} 0
container_cpu_usage_seconds_total{id="e42698653ef69273860bad9925e89400f0e14a61481b35a71edb7843472273dd",image="prom/prometheus:latest",name="prometheus",type="kernel"} 0
container_cpu_usage_seconds_total{id="e42698653ef69273860bad9925e89400f0e14a61481b35a71edb7843472273dd",image="prom/prometheus:latest",name="prometheus",type="user"} 0
# HELP container_last_seen Last time a container was seen by the exporter
# TYPE container_last_seen counter
container_last_seen{id="3a4040b6b7214fd0dbe67d6ddbe208323e0b9f0dc94621c935346649ae18ba2e",image="prom/container-exporter:latest",name="pce"} 1.424964082e+09
container_last_seen{id="71d166899660fc2ec7d48ae444fb8f7a4d4496fabe9708ca88ee87b577c13818",image="prom/promdash:latest",name="promdash"} 1.424964082e+09
container_last_seen{id="bc62a18f03c726b4b26f4a666a9592f160a26829be3ae184e73a5b75efe59ee0",image="mysql:5.7",name="pd_mysql"} 1.424964082e+09
container_last_seen{id="de1f3c07e52b71eba5a70b51dfb80ee670b0e37a0debb36a3d5c474de2517720",image="busybox:latest",name="focused_blackwell"} 1.424964082e+09
container_last_seen{id="e42698653ef69273860bad9925e89400f0e14a61481b35a71edb7843472273dd",image="prom/prometheus:latest",name="prometheus"} 1.424964082e+09

discordianfish commented 9 years ago

@jonathanmhamilton Can you please send me the following outputs:

skelethan commented 9 years ago

docker info

root@vagrant-ubuntu-trusty-64:~# docker info
Containers: 5
Images: 108
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 118
Execution Driver: native-0.2
Kernel Version: 3.13.0-40-generic
Operating System: Ubuntu 14.04.1 LTS
CPUs: 1
Total Memory: 994 MiB
Name: vagrant-ubuntu-trusty-64
ID: BL6Z:NXJV:UUN2:4HZH:O2LZ:L5SF:FC34:TOG3:FBQA:RI5D:F4HZ:YL7P
WARNING: No swap limit support

find /sys/fs/cgroup is 1K+ lines, so I posted it at the gist below: https://gist.github.com/jonathanmhamilton/1d6cdf33d10961b86f40

discordianfish commented 9 years ago

Ok, that looks like I would expect. Can you paste me the content of these files?

/sys/fs/cgroup/cpuacct/docker/71d166899660fc2ec7d48ae444fb8f7a4d4496fabe9708ca88ee87b577c13818/cpuacct.stat
/sys/fs/cgroup/cpuacct/docker/71d166899660fc2ec7d48ae444fb8f7a4d4496fabe9708ca88ee87b577c13818/cpuacct.usage_percpu
/sys/fs/cgroup/cpuacct/docker/71d166899660fc2ec7d48ae444fb8f7a4d4496fabe9708ca88ee87b577c13818/cpuacct.usage 

Or the same files from any other container should do it.

skelethan commented 9 years ago

root@vagrant-ubuntu-trusty-64:~# cat /sys/fs/cgroup/cpuacct/docker/71d166899660fc2ec7d48ae444fb8f7a4d4496fabe9708ca88ee87b577c13818/cpuacct.stat
user 254
system 146
root@vagrant-ubuntu-trusty-64:~# cat /sys/fs/cgroup/cpuacct/docker/71d166899660fc2ec7d48ae444fb8f7a4d4496fabe9708ca88ee87b577c13818/cpuacct.usage_percpu
5690512537
root@vagrant-ubuntu-trusty-64:~# cat /sys/fs/cgroup/cpuacct/docker/71d166899660fc2ec7d48ae444fb8f7a4d4496fabe9708ca88ee87b577c13818/cpuacct.usage
5695649837

discordianfish commented 9 years ago

@jonathanmhamilton Strange.. And you're sure that the container-exporter isn't printing any errors/warnings?

skelethan commented 9 years ago

Indeed strange! Don't see any error messages in container-exporter logs or Prometheus. Is there somewhere else that I should look?

root@vagrant-ubuntu-trusty-64:~# docker logs pce
+ exec app ''
2015/02/26 14:32:27 Starting Server: :9104

ashishjain14 commented 9 years ago

Thanks a lot for fixing this up... it works fine on an Ubuntu VM.

dwitzig commented 9 years ago

Still the same for me on CoreOS. The same on Ubuntu as well: all container stats are 0. I will check the cgroup path later today.

dwitzig commented 9 years ago

On CoreOS the path looks like this:

/sys/fs/cgroup/cpuacct/system.slice/docker-094c63915d3d594727ecc4061b55451784141c8f7101ad1db5f6e3c29c121c3a.scope/cpuacct.usage

I can get results from both the host and the container:

cat /sys/fs/cgroup/cpuacct/system.slice/docker-094c63915d3d594727ecc4061b55451784141c8f7101ad1db5f6e3c29c121c3a.scope/cpuacct.usage
242753876

docker exec evil_poitras cat /cgroup/cpuacct/system.slice/docker-094c63915d3d594727ecc4061b55451784141c8f7101ad1db5f6e3c29c121c3a.scope/cpuacct.usage
241811678

jimlar commented 9 years ago

If it helps: I have the same issue on both CoreOS (607) and Fedora (21). My cgroup paths look just like the ones dwitzig has, for both OSes (both also using systemd). container-exporter only prints "Starting Server: :9104".

discordianfish commented 9 years ago

Hi everyone,

I'm currently working on a PR to add prometheus support to cadvisor. That would provide even more metrics and I can leave it to google to keep up with libcontainer changes :) Would that work for you all as well? Then I will focus on getting that PR merged. If you can compile it yourself, you can give it a try if you want: https://github.com/google/cadvisor/pull/545

skelethan commented 9 years ago

Works for me! Agreed on the benefits of having more metrics from cAdvisor, long-term Google support & the possibility of cAdvisor supporting other container types. Nice work pushing this forward!

I will try to compile & run next week...

jimlar commented 9 years ago

Sounds like a good plan to me!

jimlar commented 9 years ago

FWIW I managed to build your cAdvisor PR and I'm getting stats in Fedora. Nice work!

jussil commented 9 years ago

As stated before, CoreOS uses a different cgroup path for this information. Right now container-exporter defaults to a "docker" parent, which is correct on Ubuntu; on CoreOS, however, it is "system.slice". I got it to work by just passing "-parent system.slice" when running the container (see the example below). Stats are showing up just fine now.

(This might not be the right place to mention this, but at least this is the first issue I found when I was investigating why container-exporter reports 0 for all metrics.)
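For reference, a minimal variant of the run command with that override (same mounts as earlier in this thread; adjust the port and name as needed):

docker run -d -p 9104:9104 -v /sys/fs/cgroup:/cgroup -v /var/run/docker.sock:/var/run/docker.sock prom/container-exporter -parent system.slice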

discordianfish commented 9 years ago

Just FYI, I won't have time to fix CentOS anytime soon, but I'm happy to take a PR if someone else wants to. In the long term, though, I'd rather focus on getting full Prometheus support into cAdvisor. If you want to chime in: https://github.com/google/cadvisor/issues/688
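For anyone who wants to try cAdvisor in the meantime, a typical invocation looks roughly like this (a sketch based on the cAdvisor README of that time; the Prometheus /metrics endpoint is only available once the PR above is merged):

docker run -d --name=cadvisor -p 8080:8080 -v /:/rootfs:ro -v /var/run:/var/run:rw -v /sys:/sys:ro -v /var/lib/docker/:/var/lib/docker:ro google/cadvisor:latest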

mrosic commented 9 years ago

@jussil: Thank you very much, that fixed it for me too. Just to clarify for anybody who stumbles upon this: you need to add "-parent system.slice" to the arguments you give to the container-exporter executable. For example like this:

docker run -d -p 127.0.0.1:9104:9104 -v /sys/fs/cgroup:/cgroup -v /var/run/docker.sock:/var/run/docker.sock --name prom_container_exporter prom/container-exporter -telemetry.endpoint="/container-exporter" -parent system.slice
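After that, the metrics at the configured endpoint should show non-zero values, e.g. (a hypothetical check against the command above):

curl -s http://127.0.0.1:9104/container-exporter | grep container_cpu_usage_seconds_total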