NVIDIA / gpu-monitoring-tools

Tools for monitoring NVIDIA GPUs on Linux
Apache License 2.0

dcgm-exporter crashes while getting device cpu affinity #184

Closed eugenberend closed 3 years ago

eugenberend commented 3 years ago

I want to export metrics from a dedicated, external VM, using dcgm-exporter in k8s. When I apply the latest Helm chart, errors are logged:

time="2021-05-08T13:03:29Z" level=info msg="Starting dcgm-exporter"
time="2021-05-08T13:03:29Z" level=info msg="Attemping to connect to remote hostengine at my-gpu-vm:5555"
time="2021-05-08T13:03:29Z" level=info msg="DCGM successfully initialized!"
time="2021-05-08T13:03:29Z" level=info msg="Collecting DCP Metrics"
time="2021-05-08T13:03:29Z" level=fatal msg="Error getting device cpu affinity: open /sys/bus/pci/devices/0000:8b:00.0/local_cpulist: no such file or directory"

so the pods are in CrashLoopBackOff state.

Here's how my custom values yaml file looks:

arguments:
  - "-f"
  - "/etc/dcgm-exporter/dcp-metrics-included.csv"
  - "-r"
  - "my-gpu-vm:5555"

My k8s cluster runs on nodes without GPUs. I think that instead of getting device information from the remote (non-k8s) VM, the dcgm-exporter pod tries to enumerate devices on the k8s node where the pod itself is running.

For now, I suggest that the k8s version of dcgm-exporter should be run only on k8s clusters with GPUs. Is my suggestion correct?

dualvtable commented 3 years ago

hi @eugenberend - this is correct. For instance, the NVIDIA GPU Operator uses labels to identify which nodes the dcgm-exporter daemonset should be enabled on. You can either use the GPU Operator directly or apply node selectors to determine which nodes dcgm-exporter should run on (preferably the nodes with GPUs).
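As a minimal sketch of the node-selector approach, a values override like the following could pin the daemonset to GPU nodes. The label key used here is an assumption (it is the one applied by NVIDIA's GPU Feature Discovery / GPU Operator); substitute whatever label your GPU nodes actually carry, or apply one manually with `kubectl label node`:

# Hypothetical values override for the dcgm-exporter Helm chart.
# Assumes GPU nodes are labeled nvidia.com/gpu.present="true"
# (set by GPU Feature Discovery); adjust the key/value to match
# the labels present on your cluster.
nodeSelector:
  nvidia.com/gpu.present: "true"

With this in place, the scheduler only places dcgm-exporter pods on matching nodes, so pods never land on GPU-less nodes where the /sys/bus/pci device files are missing.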

We will look into adding some better error handling in dcgm-exporter to deal with this scenario as well.