Closed by udomsak 9 years ago
I hit the same problem. Did you create the "cadvisor" database manually?
If this works: curl --noproxy localhost -X POST -d '[{"name":"foo","columns":["val"],"points":[[23]]}]' 'http://localhost:8086/db/mydb/series?u=root&p=root' then maybe the problem comes from a proxy.
I have the same issue. Even hosting an InfluxDB on the same host results in the same error. I am not able to run the /storage/influxdb tests as I can't build the project:
can't load package: package code.google.com/p/go.exp/inotify
@andrewwebber are you using godep for the build?
No, I presume I should be? I just do a go get.
Yes, we moved to use godep for building and testing so you'll need to:
godep go build github.com/google/cadvisor
I noticed inotify has only _linux Go files. Does this mean I can't build on my Mac? I assumed the InfluxDB storage tests were not Linux related.
Hmmm that may be true as inotify is Linux only. cAdvisor depends on that functionality so it probably is failing to build for the test.
We really need to beef up the documentation around building cAdvisor. I'll try to get a page together today.
The fact is, I believe it is important to test the version of the InfluxDB Go dependency against the latest version of InfluxDB. This is what I am trying to achieve. I will create a Go program with the latest version of the InfluxDB client library and try to execute a POST from within a Docker container on a CoreOS machine. This would validate whether the library works. Then I will do the same test with the version of the library specified in Godeps. This would help determine whether it is a library version issue or a networking issue.
... maybe the solution for my post?
I'm new to both cadvisor and influxdb; also having this problem. On a Mac, I run google/cadvisor:0.5.0 in a docker container - Docker version 1.3.1, build 4e9bbfa
$ docker logs
This works fine: curl --noproxy localhost -X POST -d '[{"name":"foo","columns":["val"],"points":[[123]]}]' 'http://localhost:8086/db/cadvisor/series?u=root&p=root' I can view metrics in the InfluxDB GUI via 'select * from foo', etc. @xiangflytang - What proxy are you referring to?
@jwalczyk do you see this consistently? As in, every 1s or 60s, or every now and then? Do you see the data for the time period on or before 1107 09:09:48.189645? It may just have dropped that request.
It happens consistently; there is never any data logged to InfluxDB.
My workaround for now is to mine cAdvisor the way Heapster does, but implement a new sink in Heapster to log to Logstash.
Is InfluxDB running on your host outside of docker? Since cadvisor is running in a separate network namespace, it will not be able to connect to the applications running on the host network namespace. You can solve this by passing '--net=host' option as part of docker run.
docker run --volume=/:/rootfs:ro --volume=/var/run:/var/run:rw --volume=/sys:/sys:ro --volume=/var/lib/docker/:/var/lib/docker:ro --publish=8080:8080 --detach=true --name=cadvisor --net=host google/cadvisor:latest -storage_driver=influxdb --logtostderr
Thank you for the comments! I got it working. Silly mistake: as @vishh suggested, I had to change localhost to my host IP during cadvisor container startup: --storage_driver_host="my_host_ip:8086". I then tested this by running nc -kl 8086 on my host and saw all the nice data dumped. Thanks!
Can confirm I am now seeing data; however, this is all running on one host (boot2docker).
@andrewwebber: Where are you running InfluxDB? Chatting over IRC (#google-containers) might be faster than going back and forth here on GitHub.
@vishh Thanks for your support here. I have a Kubernetes CoreOS cluster and got this working with the following fleet systemd unit files.
Unfortunately the documentation for Grafana did not work for me, as it insisted on looking for a metadata endpoint for InfluxDB (https://github.com/GoogleCloudPlatform/heapster/blob/master/influx-grafana/grafana/set_influx_db.sh#L11) and Elasticsearch (https://github.com/GoogleCloudPlatform/heapster/blob/master/influx-grafana/grafana/set_elasticsearch.sh#L14). After deleting these and manually loading the Kubernetes dashboard, everything worked (after editing a couple of graphs that were looking for a 'machines' time series where only 'stats' existed).
grafana (manually)
docker run -i -t --rm -p 80:80 -e INFLUXDB_HOST=192.168.89.161 -e INFLUXDB_PORT=8086 -e INFLUXDB_NAME=k8s -e INFLUXDB_USER=root -e INFLUXDB_PASS=root tutum/grafana
cadvisor (globally deployed)
[Unit]
Description=cAdvisor Service
After=docker.service
Requires=docker.service
[Service]
TimeoutStartSec=10m
Restart=always
ExecStartPre=-/usr/bin/docker kill cadvisor
ExecStartPre=-/usr/bin/docker rm -f cadvisor
ExecStartPre=/usr/bin/docker pull google/cadvisor
ExecStart=/usr/bin/docker run --volume=/:/rootfs:ro --volume=/var/run:/var/run:rw --volume=/sys:/sys:ro --volume=/var/lib/docker/:/var/lib/docker:ro --publish=4194:4194 --name=cadvisor --net=host google/cadvisor:latest --logtostderr --port=4194
ExecStop=/usr/bin/docker stop -t 2 cadvisor
[X-Fleet]
Global=true
MachineMetadata=role=kubernetes
Influxdb
[Unit]
Description=InfluxDB Service
After=docker.service
Requires=docker.service
[Service]
TimeoutStartSec=10m
Restart=always
ExecStartPre=-/usr/bin/docker kill influxdb
ExecStartPre=-/usr/bin/docker rm -f influxdb
ExecStartPre=/usr/bin/docker pull kubernetes/heapster_influxdb
ExecStart=/usr/bin/docker run --name influxdb -p 8083:8083 -p 8086:8086 -p 8090:8090 -p 8099:8099 kubernetes/heapster_influxdb
ExecStop=/usr/bin/docker stop -t 2 influxdb
Heapster agent (buddy)
[Unit]
Description=Heapster Agent Service
After=docker.service
Requires=docker.service
[Service]
TimeoutStartSec=10m
Restart=always
ExecStartPre=-/usr/bin/mkdir -p /home/core/heapster
ExecStartPre=-/usr/bin/docker kill heapster-agent
ExecStartPre=-/usr/bin/docker rm -f heapster-agent
ExecStartPre=/usr/bin/docker pull vish/heapster-buddy-coreos
ExecStart=/usr/bin/docker run --name heapster-agent --net host -v /home/core/heapster:/var/run/heapster vish/heapster-buddy-coreos
ExecStop=/usr/bin/docker stop -t 2 heapster-agent
[X-Fleet]
MachineOf=influxdb.service
Heapster
[Unit]
Description=Heapster Agent Service
After=docker.service
After=heapster-agent.service
Requires=docker.service
Requires=heapster-agent.service
[Service]
TimeoutStartSec=10m
Restart=always
ExecStartPre=-/usr/bin/docker kill heapster
ExecStartPre=-/usr/bin/docker rm -f heapster
ExecStartPre=/usr/bin/docker pull vish/heapster
ExecStart=/usr/bin/docker run --name heapster --net host -e INFLUXDB_HOST=127.0.0.1:8086 -v /home/core/heapster:/var/run/heapster vish/heapster
ExecStop=/usr/bin/docker stop -t 2 heapster
[X-Fleet]
MachineOf=heapster-agent.service
At the moment I'm lazy and don't care if my heapster agents move around the cluster.
Awesome. Thanks for the write up. Splitting up grafana into a separate Pod is something I have been meaning to do, but there is no native support for external IPs in Kubernetes yet sadly. Your idea of manually configuring grafana sounds good for the short term.
Recent versions of heapster export a new table, 'machine', which contains all the root cgroup stats. So the grafana dashboard should work for you as-is with the latest version.
If you are running heapster in Kubernetes, you don't have to run the heapster-buddy container, unless you run kubernetes only on a subset of machines in your CoreOS cluster.
@andrewwebber: The Heapster Agent and Heapster services are getting restarted abruptly when I ran the above units using Fleet. I am running all those units on one CoreOS instance which is configured as a cluster in Kubernetes. Could you explain whether we need different CoreOS instances, or whether all units can run on one CoreOS instance?
Thanks in advance....
@andrewwebber: BTW, I am running Kubernetes cluster with CoreOS on EC2 instance...
@MaheshRudrachar: Can you try using the 'kubernetes/heapster' image instead of 'vish/heapster'?
[Unit]
Description=Heapster Agent Service
After=docker.service
After=heapster-agent.service
Requires=docker.service
Requires=heapster-agent.service
[Service]
TimeoutStartSec=10m
Restart=always
ExecStartPre=-/usr/bin/docker kill heapster
ExecStartPre=-/usr/bin/docker rm -f heapster
ExecStartPre=/usr/bin/docker pull kubernetes/heapster
ExecStart=/usr/bin/docker run --name heapster --net host -e INFLUXDB_HOST=127.0.0.1:8086 -v /home/core/heapster:/var/run/heapster kubernetes/heapster
ExecStop=/usr/bin/docker stop -t 2 heapster
[X-Fleet]
MachineOf=heapster-agent.service
With respect to the discovery issue of the InfluxDB database for Grafana, I am thinking about the following approaches.
Option 1:
Option 2:
Option 3:
@MaheshRudrachar I have not run into any of the issues you mentioned specifically. However, I have hit indirect issues, probably because I am low on hardware in my lab environment. I have InfluxDB, the heapster agents and Grafana all running on the single etcd server serving my cluster, which also runs my private Docker registry container :-).
Ultimately it makes sense to split these machines up into dedicated roles, to better isolate where issues might be originating. For example, my etcd server was suddenly unable to start Docker. In reality I shouldn't care, because a production etcd cluster probably should not be running Docker containers at all.
I am also running the alpha branch with automatic updates (reboot when an update is found), which doesn't always help. So I guess my tip would be to move at least your units off the etcd servers.
Also, in my case I don't really care if my cAdvisors crash, or even if my InfluxDB and heapster containers get rebuilt and rerun. Sacred are the etcd servers: if they go down, your whole Kubernetes cluster goes down too, and you need to redeploy all of your pods, replication controllers and services.
@andrewwebber: I am exploring using a proxy as part of grafana to get to InfluxDB and ElasticSearch containers. This will help split up influxdb, elastic search and grafana. Since you mentioned that you run multiple containers on your etcd server, you could try placing resource limits on all the containers, or at least InfluxDB.
@andrewwebber & @vishh
Thanks for your inputs. Still not able to resolve this issue.
Here is my Kubernetes Setup Details:
I have set up 5 CoreOS instances on AWS and followed kelseyhightower's kubernetes-fleet-tutorial. Basically I have 1 dedicated etcd server, 3 minions with 1 minion acting as the API server, and 1 dedicated minion for setting up Heapster. All minions point to the dedicated etcd server.
Now, when I ran the units mentioned above:
Need your help in resolving this. Thanks
@MaheshRudrachar @vishh This is due to the fact that the heapster buddy assumes fleet is running on the host. I had this issue and therefore had to run my heapster agents on the etcd node.
https://github.com/GoogleCloudPlatform/heapster/blob/master/clusters/coreos/buddy.go#L35
I believe we need to add a flag to the buddy to parameterise the fleet server url
Adding a flag to the buddy sounds good. I opened https://github.com/GoogleCloudPlatform/heapster/issues/11. Let's continue the discussion there.
I made a small change to the grafana container via https://github.com/GoogleCloudPlatform/heapster/pull/10. This should make the kubernetes version of heapster work outside of GCE. Give it a try and let me know if you face any issues.
Thanks @vishh and @andrewwebber. I will give a try with latest version.
@vmarmol to avoid confusion about building your project, you can use godep -r to rewrite your imports. That way your godep project can still be "go get-able".
We've been considering that, and it does feel simpler (we'd also not need to use godep for build or test).
This issue seems over and taken over by other things :) closing. Feel free to open other issues if you run into anything.
I found an issue with cAdvisor 0.4.1 on CoreOS: when using a storage backend like InfluxDB, it can't send the data to InfluxDB.
InfluxDB itself is working fine.
My running environment:
The error log said: