Closed: samek closed this issue 7 years ago
Today we have a concept of "namespaces" for container names and those come with a list of aliases used to address the container. Docker containers are under the "docker" namespace and its aliases are the Docker ID and name. This feature may be served by the "marathon" namespace and have the app ID as one of the aliases. This we will then push to the InfluxDB backend.
We probably don't have the bandwidth to tackle this anytime soon, but we will gladly take PRs towards that goal if you'd like :)
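To make the namespace and alias idea concrete, here is a minimal Go sketch. The `containerRef` type, the `hasAlias` method, and all values are hypothetical illustrations, not cAdvisor's actual types; the "marathon" namespace is the proposal from the comment above, not an existing feature.

```go
package main

import "fmt"

// containerRef is a hypothetical stand-in for a container reference:
// a canonical name, the namespace it lives in, and its aliases.
type containerRef struct {
	Name      string
	Namespace string
	Aliases   []string
}

// hasAlias reports whether the container can be addressed by the given
// alias inside the given namespace.
func (c containerRef) hasAlias(namespace, alias string) bool {
	if c.Namespace != namespace {
		return false
	}
	for _, a := range c.Aliases {
		if a == alias {
			return true
		}
	}
	return false
}

func main() {
	// Docker namespace: aliases are the Docker ID and the container name.
	dockerRef := containerRef{
		Name:      "/docker/3deefa59", // hypothetical canonical name
		Namespace: "docker",
		Aliases:   []string{"3deefa59", "mesos-3deefa59"},
	}
	// Proposed "marathon" namespace: the app ID becomes one of the aliases.
	marathonRef := containerRef{
		Name:      "/docker/3deefa59",
		Namespace: "marathon",
		Aliases:   []string{"/nginx2"},
	}
	fmt.Println(dockerRef.hasAlias("docker", "mesos-3deefa59"))
	fmt.Println(marathonRef.hasAlias("marathon", "/nginx2"))
}
```

A storage backend that receives the aliases could then group samples by the Marathon alias instead of the generated mesos-... name.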
If we made a generic change to export aliases to the storage backend, wouldn't that resolve this problem too?
Yes, you need to do both (I think we export the aliases today, if not we should). Here the aliases they're referring to are not a Docker container name or ID, is that correct @samek?
I see, the request is for exposing an env variable. This seems better handled in Heapster aggregation, where we can add arbitrary tags, or by exposing custom metrics hooks in cAdvisor.
@samek We do want custom hooks to add extra columns, but we probably won't get to it in the near future.
@vmarmol Yes. I'm not 100% sure, but I believe the aliases are exported. The problem is that there's no link to the Marathon task ID or app, which is different. And if you want to autoscale based on the app run by Marathon, you would also need at least MARATHON_APP_ID to go with it.
Are you parsing the JSON from /var/lib/docker/containers/XXXXX/config.json at all?
Going to join the thread, having the same issue when trying to use cadvisor in combination with Marathon and Mesos.
+1
@samek can you pls share your Marathon app spec (JSON) file for cAdvisor?
@mhausenblas We're not using cadvisor anymore :(
For mesos monitoring and task monitoring from marathon we use https://github.com/bobrik/collectd-docker https://github.com/bobrik/docker-collectd-mesos
It solved all our problems.
If you need those I can post them for sure.
Awesome, thanks @samek — yes, the Marathon app spec would be appreciated!
@samek any chance?
@scalp42 I've sent it directly to @mhausenblas since it's not related to cadvisor at all. But sure, I'll just resend the mail to you.
@samek the marathon app spec could be put in a gist on github somewhere and it would be appreciated :)
@salimane It feels wrong to post the solution on the cadvisor page, since it doesn't use cadvisor. Anyway:
So, to use it: on each mesos-slave, run docker run -d -v /var/run/docker.sock:/var/run/docker.sock -e GRAPHITE_HOST=IP_OF_GRAPHITE_HOST -e COLLECTD_HOST=IP_OF_MESOS_SLAVE_WITH_UNDERSCORES bobrik/collectd-docker
For example, if my Graphite host is 10.0.0.251 and the mesos-slave IP is 10.0.0.193, you would run:
docker run -d -v /var/run/docker.sock:/var/run/docker.sock -e GRAPHITE_HOST=10.0.0.251 -e COLLECTD_HOST=10_0_0_193 bobrik/collectd-docker
(I suggest running the container with --restart=always.)
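If you script this per slave, the underscore form of the IP is easy to derive. Below is a minimal Go sketch, assuming the only transformation needed is replacing dots with underscores as in the example above; the `collectdHost` and `dockerRunArgs` helpers are hypothetical names, not part of collectd-docker.

```go
package main

import (
	"fmt"
	"strings"
)

// collectdHost rewrites an IPv4 address into the underscore form used
// for COLLECTD_HOST in the example above (10.0.0.193 -> 10_0_0_193).
func collectdHost(ip string) string {
	return strings.ReplaceAll(ip, ".", "_")
}

// dockerRunArgs assembles the argument list for the docker run command
// shown above, with --restart=always added as suggested.
func dockerRunArgs(graphiteHost, slaveIP string) []string {
	return []string{
		"run", "-d", "--restart=always",
		"-v", "/var/run/docker.sock:/var/run/docker.sock",
		"-e", "GRAPHITE_HOST=" + graphiteHost,
		"-e", "COLLECTD_HOST=" + collectdHost(slaveIP),
		"bobrik/collectd-docker",
	}
}

func main() {
	args := dockerRunArgs("10.0.0.251", "10.0.0.193")
	fmt.Println("docker " + strings.Join(args, " "))
}
```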
Then, when defining the app in Marathon, you have to add a couple of env vars, which are picked up by that container: COLLECTD_DOCKER_APP, COLLECTD_DOCKER_TASK_ENV, and COLLECTD_DOCKER_TASK_ENV_TRIM_PREFIX.
For example, one of our API projects looks like this:
{
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "10.0.0.48:5000/spored-api:v6",
      "network": "BRIDGE",
      "portMappings": [
        { "containerPort": 80, "hostPort": 0, "servicePort": 8885, "protocol": "tcp" }
      ]
    },
    "volumes": [
      {
        "containerPath": "/var/log/nginx",
        "hostPath": "/var/log/dockerlogs/nginx",
        "mode": "RW"
      }
    ]
  },
  "id": "spored-api",
  "cpus": 0.5,
  "mem": 500,
  "env": {
    "COLLECTD_DOCKER_APP": "spored-api",
    "COLLECTD_DOCKER_TASK_ENV": "MESOS_TASK_ID",
    "COLLECTD_DOCKER_TASK_ENV_TRIM_PREFIX": "spored-api"
  },
  "constraints": [
    ["env", "CLUSTER", "live"]
  ],
  "upgradeStrategy": {
    "minimumHealthCapacity": 0.5,
    "maximumOverCapacity": 0.8
  },
  "healthChecks": [
    {
      "protocol": "HTTP",
      "portIndex": 0,
      "path": "/",
      "gracePeriodSeconds": 60,
      "intervalSeconds": 20,
      "maxConsecutiveFailures": 6
    }
  ]
}
Now when you go to Grafana (load the template, which is available on GitHub), you should be able to pick stats by app.
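Since a misspelled variable name silently drops an app from the stats, it can help to check a spec before submitting it. Here is a minimal Go sketch, assuming the spec is plain JSON with a top-level "env" object as in the example above; the `appSpec` type and `missingCollectdEnv` function are hypothetical helpers, not part of Marathon or collectd-docker.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// appSpec models only the fields of a Marathon app spec this check needs.
type appSpec struct {
	ID  string            `json:"id"`
	Env map[string]string `json:"env"`
}

// requiredEnv lists the three variables named in the comment above.
var requiredEnv = []string{
	"COLLECTD_DOCKER_APP",
	"COLLECTD_DOCKER_TASK_ENV",
	"COLLECTD_DOCKER_TASK_ENV_TRIM_PREFIX",
}

// missingCollectdEnv returns the variables from requiredEnv that are
// absent from the spec's "env" section.
func missingCollectdEnv(raw []byte) ([]string, error) {
	var spec appSpec
	if err := json.Unmarshal(raw, &spec); err != nil {
		return nil, err
	}
	var missing []string
	for _, name := range requiredEnv {
		if _, ok := spec.Env[name]; !ok {
			missing = append(missing, name)
		}
	}
	return missing, nil
}

func main() {
	raw := []byte(`{"id": "spored-api", "env": {"COLLECTD_DOCKER_APP": "spored-api"}}`)
	missing, err := missingCollectdEnv(raw)
	if err != nil {
		fmt.Println("invalid spec:", err)
		return
	}
	fmt.Println("missing:", missing)
}
```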
@samek thanks :+1:
Any update? Is the feature supported now?
I don't think anyone has plans to address this. Anyone interested should feel free to propose and implement a solution.
We lazily fixed it in a fork by abusing the 'exposedenv' system, as I was not going to fix the difference between outputs getting either ContainerReference or ContainerInfo objects. This is most surely not the way to do it properly, hence no merge request, but it might help some folks.
diff --git a/container/docker/handler.go b/container/docker/handler.go
index dd0a2cd..11bcf91 100644
--- a/container/docker/handler.go
+++ b/container/docker/handler.go
@@ -257,6 +257,14 @@ func newDockerContainerHandler(
if len(splits) == 2 && splits[0] == exposedEnv {
handler.envs[strings.ToLower(exposedEnv)] = splits[1]
}
+ // Add exposed environments as labels to enable them in all outputs
+ // Ideally, the outputs would handle it themselves, however, the
+ // difference between a propagated ContainerReference or ContainerInfo
+ // is harder to fix, and this is easier for now. This means the outputs
+ // do not differentiate between labels and exposed environmental vars,
+ // and environmental vars with the same name can possibly overwrite
+ // labels: this can be seen as a feature.
+ handler.labels[strings.ToLower(exposedEnv)] = splits[1]
}
}
}
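For readers who want the idea without the fork, here is a standalone Go sketch of the same approach: copy whitelisted environment variables into the label map under lower-cased keys. The `mergeExposedEnv` function is a hypothetical helper, not cAdvisor code. This sketch keeps the label assignment inside the same length and name check as the env lookup, so entries without "=" are never indexed.

```go
package main

import (
	"fmt"
	"strings"
)

// mergeExposedEnv copies every env var whose name appears in exposed
// into labels, using the lower-cased variable name as the label key.
// An env var overwrites an existing label with the same key, which the
// patch above describes as acceptable.
func mergeExposedEnv(labels map[string]string, env, exposed []string) {
	for _, name := range exposed {
		for _, kv := range env {
			parts := strings.SplitN(kv, "=", 2)
			if len(parts) == 2 && parts[0] == name {
				labels[strings.ToLower(name)] = parts[1]
			}
		}
	}
}

func main() {
	labels := map[string]string{"owner": "web"}
	env := []string{"MARATHON_APP_ID=/nginx2", "PORT=31005", "MALFORMED"}
	mergeExposedEnv(labels, env, []string{"MARATHON_APP_ID"})
	fmt.Println(labels["marathon_app_id"])
}
```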
Hi, now that I have cAdvisor up and running through Marathon, I'm getting data into InfluxDB.
The problem I'm facing is that I can't tell which app is behind a given Docker container name.
Let me try to explain: when Marathon starts a Docker container, it gives it a generated name.
For example, the cAdvisor container is named mesos-3deefa59-6981-4069-8c74-911aead8b396
As you can see in the screenshot, I also have 2 nginx images started, and they have completely different names. In InfluxDB I cannot differentiate or group by app, since the containers are listed under their mesos names.
Now, Marathon passes some environment variables into the containers it starts, which could be used for grouping.
"MARATHON_APP_VERSION=2015-02-27T13:38:50.135Z", "HOST=10.0.0.193", "MESOS_TASK_ID=nginx2.f726a4c5-be85-11e4-82c2-56847afe9799", "PORT=31005", "PORTS=31005", "PORT_80=31005", "MARATHON_APP_ID=/nginx2", "PORT0=31005", "MESOS_SANDBOX=/mnt/mesos/sandbox", "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin", "NGINX_VERSION=1.7.10-1~wheezy"
Those fields are located in config.json in the Docker directory and should be accessible to cAdvisor.
The question is: would it be wise to also send MARATHON_APP_ID to InfluxDB in order to group stats by app?
Or how should this be approached correctly?
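To illustrate what is being asked for, here is a minimal Go sketch that pulls MARATHON_APP_ID out of a Docker-style "KEY=VALUE" environment list like the one above. The `parseEnv` and `marathonAppID` functions are hypothetical helpers, not cAdvisor code, and stripping the leading "/" is an assumption about how the app ID would be used as a tag.

```go
package main

import (
	"fmt"
	"strings"
)

// parseEnv turns Docker-style "KEY=VALUE" entries into a map.
// Entries without "=" are skipped.
func parseEnv(env []string) map[string]string {
	out := make(map[string]string, len(env))
	for _, kv := range env {
		parts := strings.SplitN(kv, "=", 2)
		if len(parts) != 2 {
			continue
		}
		out[parts[0]] = parts[1]
	}
	return out
}

// marathonAppID returns the value of MARATHON_APP_ID with the leading
// "/" removed, and false if the variable is not set.
func marathonAppID(env []string) (string, bool) {
	id, ok := parseEnv(env)["MARATHON_APP_ID"]
	if !ok {
		return "", false
	}
	return strings.TrimPrefix(id, "/"), true
}

func main() {
	env := []string{
		"MARATHON_APP_ID=/nginx2",
		"MESOS_TASK_ID=nginx2.f726a4c5-be85-11e4-82c2-56847afe9799",
		"PORT=31005",
	}
	if id, ok := marathonAppID(env); ok {
		fmt.Println("app:", id)
	}
}
```

The returned ID could then be attached to each sample as an extra tag or column, which would allow grouping by app in InfluxDB.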