Closed cookandy closed 8 years ago
To be honest, we mostly use the /metrics/snapshot endpoint for metrics, with collectd to collect them, like https://github.com/rayrod2030/collectd-mesos
App-container metrics (CPU, memory, etc.) we take from Docker, like https://github.com/bogus-py/docker-collectd-plugin
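For anyone who wants to look at /metrics/snapshot without collectd, here is a minimal sketch. It assumes the endpoint returns a flat JSON object of metric name to number; the helper names `fetch_snapshot` and `filter_metrics` and the agent address in the usage note are made up for illustration and are not part of Mesos or the collectd plugins.

```python
import json
from urllib.request import urlopen


def filter_metrics(snapshot, prefix):
    """Return the metrics whose key starts with `prefix`, sorted by key."""
    return {key: snapshot[key] for key in sorted(snapshot) if key.startswith(prefix)}


def fetch_snapshot(base_url, timeout=5.0):
    """Fetch and decode /metrics/snapshot from a Mesos master or agent.

    `base_url` is whatever address your master or agent listens on
    (an assumption of this sketch, not taken from the thread).
    """
    url = base_url.rstrip("/") + "/metrics/snapshot"
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Usage would look roughly like `filter_metrics(fetch_snapshot("http://localhost:5051"), "slave/")`, where the address and the prefix depend on your setup.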
/monitor/statistics seems to return nothing ([]) or very truncated information, like one app instead of ten running:
[
  {
    "executor_id": "app.8e2df3af-9c58-11e6-b4ed-0242289818c8",
    "executor_name": "Command Executor (Task: app.8e2df3af-9c58-11e6-b4ed-0242289818c8) (Command: NO EXECUTABLE)",
    "framework_id": "b2587f4a-53f7-40e0-a565-a89ec175a650-0000",
    "source": "app.8e2df3af-9c58-11e6-b4ed-0242289818c8",
    "statistics": {
      "cpus_limit": 0.2,
      "cpus_system_time_secs": 132569.84,
      "cpus_user_time_secs": 76519.61,
      "mem_limit_bytes": 570425344,
      "mem_rss_bytes": 323592192,
      "timestamp": 1477812340.71858
    }
  }
]
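A quick way to notice the truncation described above is to compare what the endpoint reports against the executors you expect. The sketch below works on the already decoded JSON list in the shape shown above; `summarize` and `missing_executors` are hypothetical helper names, and the list of expected IDs has to come from somewhere else (for example your scheduler).

```python
def summarize(entries):
    """Map executor_id -> fraction of the memory limit in use (None if unknown)."""
    summary = {}
    for entry in entries:
        stats = entry.get("statistics", {})
        limit = stats.get("mem_limit_bytes")
        rss = stats.get("mem_rss_bytes")
        # Guard against a missing or zero limit before dividing.
        usable = limit and rss is not None
        summary[entry["executor_id"]] = (rss / limit) if usable else None
    return summary


def missing_executors(entries, expected_ids):
    """Return the expected executor IDs that the endpoint did not report."""
    reported = {entry["executor_id"] for entry in entries}
    return sorted(set(expected_ids) - reported)
```

An empty result from `missing_executors` means every expected executor was reported; a non-empty one lists the executors that /monitor/statistics dropped.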
I am seeing things like this in the Mesos agent logs:
Failed to get resource statistics for executor 'sysdig-agent.1951556a-9ad9-11e6-a24a-0242aec327c9' of framework efaaca88-e937-4288-b929-2a0bd940e70a-0000: Failed to collect cgroup stats: Failed to determine cgroup for the 'cpu' subsystem: Failed to read /proc/2653/cgroup: Failed to open file: No such file or directory
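To check by hand whether a PID from such a log line is visible from inside the container, something like the sketch below can help. It reads the standard `/proc/<pid>/cgroup` format (`hierarchy-ID:controller-list:cgroup-path`); the function names and the `proc_root` parameter are assumptions of this sketch and say nothing about how Mesos does the lookup internally.

```python
import os


def parse_cgroup(text):
    """Parse the content of /proc/<pid>/cgroup into {controller: cgroup_path}."""
    result = {}
    for line in text.splitlines():
        parts = line.strip().split(":", 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        _hierarchy_id, controllers, path = parts
        # One line may list several controllers, e.g. "cpu,cpuacct".
        for controller in controllers.split(","):
            result[controller] = path
    return result


def cgroup_for(pid, subsystem, proc_root="/proc"):
    """Return the cgroup path of `subsystem` for `pid`, or None if not found."""
    path = os.path.join(proc_root, str(pid), "cgroup")
    try:
        with open(path) as handle:
            return parse_cgroup(handle.read()).get(subsystem)
    except FileNotFoundError:
        # Same situation as in the log: the PID is not visible under proc_root.
        return None
```

Calling `cgroup_for(2653, "cpu")` and `cgroup_for(2653, "cpu", proc_root="/host/proc")` inside the container shows which of the two mounts can actually see the PID.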
Is it because /proc is mapped to /host/proc?
/proc:/host/proc:ro
Why is it done this way? Can we tell Mesos agent to use /host/proc and /host/sys?
Is it because /proc is mapped to /host/proc?
I don't think so. It seems like a Mesos issue. Inside a container it is not allowed to bind /proc:/proc one to one, since PIDs inside the container are not the same as on the host. Which means Mesos should look for /host/proc first and fall back to /proc if it does not exist (or should autodetect whether it is running inside a container).
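The lookup order suggested here could be sketched like this. To be clear, this is only an illustration of the proposal, not something Mesos does; `resolve_proc_root` is a made-up name and the candidate paths are just the ones mentioned in this thread.

```python
import os


def resolve_proc_root(candidates=("/host/proc", "/proc")):
    """Return the first candidate that is an existing directory, else None.

    The default order prefers the host's procfs mount and falls back
    to the container's own /proc.
    """
    for candidate in candidates:
        if os.path.isdir(candidate):
            return candidate
    return None
```

On a plain host without a /host/proc mount this falls through to /proc, so the same code would work inside and outside a container.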
Are you seeing similar errors in your agent logs? I am not sure if this appears with older versions. I also can't seem to find an agent option to configure this.
Yes, I see them too when I request /monitor/statistics (it is visible in both old and new versions).
Looks like this can be resolved by running the container with the --pid=host option. Hope this helps someone.
I will add it as the default option then.
It should not have any impact on the rest of the system.
This should do it: https://github.com/eBayClassifiedsGroup/PanteraS/pull/230
Heh, you were faster :) and merged.
Any ideas why the /monitor/statistics endpoint returns empty []? I briefly looked at Mesos issues but didn't find anything. Wondering if you're experiencing the same...