nickbabcock / OhmGraphite

Expose hardware sensor data to Graphite / InfluxDB / Prometheus / Postgres / Timescaledb

Integrated GPU sensors displace Discrete GPU sensors #451

Closed meandrii closed 1 month ago

meandrii commented 1 month ago

Having two GPUs (integrated + discrete) causes the software to pull the temperature from the integrated GPU onto the discrete GPU's graph.

I disabled the integrated AMD GPU in Device Manager, and it looks like OhmGraphite started to pull the correct data. Take a look at the graph: the orange arrow indicates the moment I disabled the integrated graphics.

I use the latest version of the software with the default config, except for the Prometheus integration. My hardware is: CPU: 7800X3D, GPU: RX 6950 XT.

Screenshot 2024-09-08 233417 Screenshot 2024-09-08 233424

Data from http://localhost:4445/metrics

Both GPUs enabled:

# HELP ohm_gpuati_celsius Metric reported by open hardware sensor
# TYPE ohm_gpuati_celsius gauge
ohm_gpuati_celsius{hardware="AMD Radeon RX 6950 XT",sensor="GPU VR SoC",hw_instance="5"} 40
ohm_gpuati_celsius{hardware="AMD Radeon(TM) Graphics",sensor="GPU VR SoC",hw_instance="1"} 40

Only discrete GPU enabled:

# HELP ohm_gpuati_celsius Metric reported by open hardware sensor
# TYPE ohm_gpuati_celsius gauge
ohm_gpuati_celsius{hardware="AMD Radeon RX 6950 XT",sensor="GPU Hot Spot",hw_instance="10"} 56
ohm_gpuati_celsius{hardware="AMD Radeon RX 6950 XT",sensor="GPU Core",hw_instance="10"} 52

nickbabcock commented 1 month ago

Thanks for the bug report. Can you try out the latest nightly build of LibreHardwareMonitor (https://github.com/LibreHardwareMonitor/LibreHardwareMonitor?tab=readme-ov-file#where-can-i-download-it)?

That'll let us know where the bug is coming from.

meandrii commented 1 month ago

The new build is on the right, and it looks good to me. Screenshot 2024-09-09 132833

nickbabcock commented 1 month ago

Nice, can you try out the latest build of OhmGraphite? (It's a zip file found here: https://github.com/nickbabcock/OhmGraphite/actions/runs/10783732414?pr=452 )

If it works, I'll cut a new version of OhmGraphite in the next couple days.

meandrii commented 1 month ago

I think there is a problem with power now; frequency and temperature for the integrated GPU are not correct, and I am not sure about the discrete GPU. The screenshots were taken at approximately the same moment. Screenshot 2024-09-09 224935 Screenshot 2024-09-09 224955 Screenshot 2024-09-09 225008 Screenshot 2024-09-09 225132

nickbabcock commented 1 month ago

Ah this might be a dashboard problem. I think I know what the problem is, but if you want to run a quick test:

Update the wattage dashboard so the metric includes the hardware instance:

ohm_gpuati_watts{instance="$instance",hardware="$gpu"}

Does that solve the problem? You'd have to do it for each graph.

This should make it so the GPU metrics aren't stepping on each other's toes.

If this works, I'll update the dashboard templates.
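
To illustrate why the `hardware` matcher matters, here is a minimal Python sketch that mimics a label equality selector over the two `ohm_gpuati_celsius` lines reported earlier in this issue. The helpers `parse_metrics` and `select` are hypothetical names written for this example and are not part of OhmGraphite or Prometheus. The parser is a simplification that assumes label values contain no escaped quotes. The `instance` label from the dashboard query is omitted because it is not present in the raw `/metrics` output.

```python
import re

# Sample lines copied from the /metrics output reported earlier in this issue.
SAMPLE = '''\
# HELP ohm_gpuati_celsius Metric reported by open hardware sensor
# TYPE ohm_gpuati_celsius gauge
ohm_gpuati_celsius{hardware="AMD Radeon RX 6950 XT",sensor="GPU VR SoC",hw_instance="5"} 40
ohm_gpuati_celsius{hardware="AMD Radeon(TM) Graphics",sensor="GPU VR SoC",hw_instance="1"} 40
'''

# Simplified patterns: assume no escaped quotes inside label values.
LINE_RE = re.compile(r'^(\w+)\{(.*)\}\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Parse simple exposition lines into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip anything this simplified parser does not understand
        name, raw_labels, value = match.groups()
        samples.append((name, dict(LABEL_RE.findall(raw_labels)), float(value)))
    return samples


def select(samples, name, **matchers):
    """Keep samples whose metric name and labels equal the given matchers."""
    return [
        s for s in samples
        if s[0] == name and all(s[1].get(k) == v for k, v in matchers.items())
    ]


samples = parse_metrics(SAMPLE)

# Without a hardware matcher, both GPUs match the same sensor name.
by_sensor = select(samples, "ohm_gpuati_celsius", sensor="GPU VR SoC")

# Adding the hardware matcher narrows the result to the discrete GPU only.
by_hardware = select(
    samples,
    "ohm_gpuati_celsius",
    sensor="GPU VR SoC",
    hardware="AMD Radeon RX 6950 XT",
)

print(len(by_sensor), len(by_hardware))
```

With the sample above, the first selection returns both GPUs' series and the second returns only the RX 6950 XT series, which is the separation the `hardware="$gpu"` matcher is meant to give each dashboard panel.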