Closed pascal71 closed 4 years ago
Would you submit a PR? What changes did you make to the manifests?
Can you check if mounting the /dev from host to the pod fixes the problem?
Will also do that :)
Will report back this evening, ok?
On 7/14/20 3:06 PM, Carlos Eduardo wrote:
Can you check if mounting the /dev from host to the pod fixes the problem?
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/carlosedp/cluster-monitoring/issues/73#issuecomment-658167804, or unsubscribe https://github.com/notifications/unsubscribe-auth/AHNH3IDGPKQB2UMMR2IAA4LR3RJ5FANCNFSM4OPXNKUQ.
Good evening Carlos,
I am afraid that to mount /dev in a container you will still need the 'privileged' security context. If you use that context, there is no need to mount /dev, as it already works then.
E.g.:
    containers:
    - command:
      - /bin/rpi_exporter
      - --web.listen-address=127.0.0.1:9243
      image: carlosedp/arm_exporter:latest
      name: arm-exporter
      securityContext:
        privileged: true
      resources:
        limits:
          cpu: 100m
          memory: 100Mi
        requests:
          cpu: 50m
          memory: 50Mi
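As a rough illustration of the change above, the following Python sketch sets `privileged: true` on one container of a DaemonSet-shaped dictionary. The function name `set_privileged` and the trimmed-down manifest are hypothetical, made up for this example; only the container name, the image, and the `securityContext` field come from the snippet above.

```python
import copy


def set_privileged(manifest, container_name):
    """Return a copy of a DaemonSet-like dict with the named container privileged.

    Hypothetical helper for illustration; not part of cluster-monitoring.
    """
    patched = copy.deepcopy(manifest)  # leave the caller's manifest untouched
    containers = patched["spec"]["template"]["spec"]["containers"]
    for container in containers:
        if container.get("name") == container_name:
            # Create securityContext if it is absent, then set the flag.
            container.setdefault("securityContext", {})["privileged"] = True
            return patched
    raise KeyError(f"no container named {container_name!r}")


# Trimmed-down stand-in for the real manifest (assumption: most fields omitted).
daemonset = {
    "kind": "DaemonSet",
    "spec": {"template": {"spec": {"containers": [
        {"name": "arm-exporter", "image": "carlosedp/arm_exporter:latest"},
    ]}}},
}

patched = set_privileged(daemonset, "arm-exporter")
print(patched["spec"]["template"]["spec"]["containers"][0]["securityContext"])
```

The helper works on a copy, so the original dictionary can still be compared against the patched one.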
Kind regards,
Pascal van Dam
Yes, I think it's necessary since the utility that reads the temperature requires the device. I'd welcome a PR! Thanks
Good afternoon,
Thank you for your reply.
I will post one this evening.
One other small question:
I also have an ARM64 / RPi 8 GB cluster running; that DOES report CPU temperature (in privileged mode) but not GPU temperature.
Any idea?
No idea, I don't have the newer ones. I believe the vcgencmd utility used by rpi_exporter doesn't support the new SoC in the RPi4. Maybe Lukas from the rpi_exporter project can help: https://github.com/lukasmalkmus/rpi_exporter
Good afternoon Carlos,
The arm7hf (32-bit) build does work for both CPU and GPU on the RPi4. On ARM64 it works only for CPU.
I will contact Lukas. :)
Many thanks for your support.
Kind regards,
Pascal van Dam
Closing this as it's not a monitoring stack issue.
I will submit a PR for this. The solution is to add:
container.mixin.securityContext.withPrivileged(true)
to
arm_exporter.jsonnet
Great, thanks for the find!
All arm-exporter pods in the daemonset are up and running; however, their logs show that the exporter is unable to access the GPU. The resulting log entries:
time="2020-07-03T11:21:44Z" level=error msg="gpu collector failed after 0.003312s: exit status 255" source="collector.go:142"
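When scanning the logs of several pods for entries like the one above, it can help to split each line into its key=value fields. The Python sketch below does this with the standard `shlex` module; `parse_logfmt` is a hypothetical helper name, and this is not a complete parser for the format, only enough for lines shaped like this one.

```python
import shlex


def parse_logfmt(line):
    """Split a key=value log line into a dict (hypothetical helper, a sketch)."""
    fields = {}
    # shlex.split keeps quoted values together and strips the quotes.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that carry no '='
            fields[key] = value
    return fields


line = (
    'time="2020-07-03T11:21:44Z" level=error '
    'msg="gpu collector failed after 0.003312s: exit status 255" '
    'source="collector.go:142"'
)
entry = parse_logfmt(line)
if entry.get("level") == "error":
    print(entry["msg"])
```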
Running strace on the rpi_exporter binary shows that it tries to access /dev/vchiq, which is not present in /dev in the arm-exporter container of this pod.
Changing the securityContext to privileged for this pod (e.g. by editing arm-exporter-daemonset.yaml) fixes the problem.
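The missing device node can be confirmed without strace by checking the path directly. The following is a minimal Python sketch, assuming a Python interpreter is available wherever it is run (the exporter image itself may not ship one) and that /dev/vchiq is a character device; `check_device` is a hypothetical name used only for this example.

```python
import os
import stat


def check_device(path="/dev/vchiq"):
    """Report whether a device node exists and is usable (illustrative sketch)."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "missing"
    if not stat.S_ISCHR(mode):
        return "not a character device"
    # Assumption: the caller needs both read and write access to the node.
    if not os.access(path, os.R_OK | os.W_OK):
        return "no read/write access"
    return "ok"


print(check_device())
```

In a container without the privileged security context, the report above suggests this would return "missing", since /dev/vchiq is not in the container's /dev.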
Kind regards,
Pascal van Dam