Closed zhuyifei1999 closed 2 years ago
Amazing! I will try to take a look at it this weekend.
I successfully tested on a system I have access to with an AMD Radeon RX 6800 XT. I also fixed some issues I encountered while browsing the code.
For the fan speed, I saw that you added a comment. I did not find a way to get it through AMDGPU. lm_sensors seem to get the correct info though, so I will investigate where to retrieve it.
Edit: Never mind I found that for the fan https://dri.freedesktop.org/docs/drm/gpu/amdgpu.html
Awesome!
Yeah I didn't bother with checking for close(-1)
since it's practically a no-op, similar to free(NULL)
; though unlike free it has syscall overhead.
The reason I didn't do fan speed was that it seemed like it was slightly convoluted to retrieve. Rather than being a libdrm API call, I have to go through sysfs; doable but code can get messy and I can't test it.
Other than that, does the patch work on your APU? I am wondering, because in that case the fan is shared with the CPU and might not be registered with HWMON as such.
The APU does not expose fan information on the GPU's hwmon, AFAICT.
zhuyifei1999@zhuyifei1999-ThinkPad-P14s-Gen-2a /sys/bus/pci/devices/0000:08:00.0 $ lspci | grep VGA
08:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cezanne (rev d1)
zhuyifei1999@zhuyifei1999-ThinkPad-P14s-Gen-2a /sys/bus/pci/devices/0000:08:00.0 $ ls
ari_enabled devspec hwmon max_link_speed msi_irqs pp_dpm_sclk rescan smartshift_apu_power
backlight dma_mask_bits i2c-0 max_link_width numa_node pp_dpm_socclk reset smartshift_dgpu_power
boot_vga driver i2c-1 mem_info_gtt_total pcie_replay_count pp_force_state reset_method subsystem
broken_parity_status driver_override i2c-12 mem_info_gtt_used power pp_mclk_od resource subsystem_device
class drm i2c-2 mem_info_preempt_used power_dpm_force_performance_level pp_num_states resource0 subsystem_vendor
config enable i2c-3 mem_info_vis_vram_total power_dpm_state pp_od_clk_voltage resource0_wc thermal_throttling_logging
consistent_dma_mask_bits firmware_node iommu mem_info_vis_vram_used power_state pp_power_profile_mode resource2 uevent
consumer:pci:0000:08:00.1 fw_version iommu_group mem_info_vram_total pp_cur_state pp_sclk_od resource2_wc vbios_version
current_link_speed gpu_busy_percent irq mem_info_vram_used pp_dpm_dcefclk pp_table resource4 vendor
current_link_width gpu_metrics link mem_info_vram_vendor pp_dpm_fclk product_name resource5
d3cold_allowed graphics local_cpulist modalias pp_dpm_mclk product_number revision
device hdcp_srm local_cpus msi_bus pp_dpm_pcie remove serial_number
zhuyifei1999@zhuyifei1999-ThinkPad-P14s-Gen-2a /sys/bus/pci/devices/0000:08:00.0 $ ls hwmon/
hwmon1
zhuyifei1999@zhuyifei1999-ThinkPad-P14s-Gen-2a /sys/bus/pci/devices/0000:08:00.0 $ ls hwmon/hwmon1/
device freq1_input freq1_label in0_input in0_label in1_input in1_label name power power1_average power1_label subsystem temp1_input temp1_label uevent
I think that we are almost good. I will update the README and check for the amdgpu_drm.h to conditionally compile the support for AMDGPU.
Nice!
I'm including amdgpu_drm.h directly because it's a kernel UAPI header, and I think any sane compiler toolchain on Linux would include the kernel UAPI headers. Though, up to you, could copy the declarations too since kernel-userspace ABI has guaranteed stability.
The problem with NVIDIA was that the nvml header was not packaged on all the distributions, and downloading a recent header could break the build. Also I was not certain I could redistribute a version of the nvidia header. That is why I included the definitions directly.
Things are different with libdrm. The header are consistently available on any distribution and updated with the kernel. Much cleaner than NVIDIA.
Hence, relying on libdrm for AMDGPU support is fine in my opinion!
If you are fine with the latest changes I will merge it into master.
The problem with NVIDIA was that the nvml header was not packaged on all the distributions, and downloading a recent header could break the build. Also I was not certain I could redistribute a version of the nvidia header. That is why I included the definitions directly.
I see.
Neat Videocard TOP
Nice naming lol
If you are fine with the latest changes I will merge it into master.
No objections. Go ahead. Thanks!
A picture is worth a thousand words:
This PR isn't fully ready for merge yet. I think I'd like some feedback and do some polishing first (hence the RFC). My AMDGPU is also an iGPU / APU which limits how much I can test myself.
Most the access is using libdrm and kernel APIs. <drm/amdgpu_drm.h> is a kernel UAPI header so I directly included it. For libdrm I'm using dlopen and re-declaring the constants to avoid a compile-time dependency, just like what the original code does with NVML.
I also did some code refactoring to make my life easier, hope that is okay.
I really cannot deal with typedef structsCC #106, i915 kernel changes doesn't seem to have merged into mainline yet.