ROCm / ROCK-Kernel-Driver

AMDGPU Driver with KFD used by the ROCm project. Also contains the current Linux Kernel that matches this base driver

module build failure with 5.11.0 #116

Closed zhol01825 closed 1 month ago

zhol01825 commented 2 years ago

Hi all, I am trying to build the fxkamd/criu-wip branch to try out the CRIU AMD plugin, but the DKMS amdgpu module build fails with the output below. Does anyone have any idea?

Building module:
cleaning build area...(bad exit status: 2)
make -j256 KERNELRELEASE=5.11.0 -j256 TTM_NAME=amdttm SCHED_NAME=amd-sched -C /lib/modules/5.11.0/build M=/var/lib/dkms/amdgpu/4.3-59/build......................................................(bad exit status: 2)
ERROR (dkms apport): kernel package linux-headers-5.11.0 is not supported
Error! Bad return status for module build on kernel: 5.11.0 (x86_64)
Consult /var/lib/dkms/amdgpu/4.3-59/build/make.log for more information.

make.log (file attached):

DKMS make.log for amdgpu-4.3-59 for kernel 5.11.0 (x86_64)
Wed Oct 27 03:16:16 AM PDT 2021
/var/lib/dkms/amdgpu/4.3-59/build/Makefile:26: "Local GCC version 90303 does not match kernel compiler GCC version 90300"
/var/lib/dkms/amdgpu/4.3-59/build/Makefile:27: "This may cause unexpected and hard-to-isolate compiler-related issues"
  CC [M]  /var/lib/dkms/amdgpu/4.3-59/build/scheduler/sched_main.o
  CC [M]  /var/lib/dkms/amdgpu/4.3-59/build/scheduler/sched_fence.o
  CC [M]  /var/lib/dkms/amdgpu/4.3-59/build/scheduler/sched_entity.o
  CC [M]  /var/lib/dkms/amdgpu/4.3-59/build/ttm/ttm_tt.o
  CC [M]  /var/lib/dkms/amdgpu/4.3-59/build/ttm/ttm_bo.o
  CC [M]  /var/lib/dkms/amdgpu/4.3-59/build/ttm/ttm_bo_util.o
  CC [M]  /var/lib/dkms/amdgpu/4.3-59/build/ttm/ttm_bo_vm.o
  CC [M]  /var/lib/dkms/amdgpu/4.3-59/build/ttm/ttm_module.o
  CC [M]  /var/lib/dkms/amdgpu/4.3-59/build/ttm/ttm_execbuf_util.o
  CC [M]  /var/lib/dkms/amdgpu/4.3-59/build/ttm/ttm_range_manager.o
  CC [M]  /var/lib/dkms/amdgpu/4.3-59/build/amd/amdkcl/main.o
  CC [M]  /var/lib/dkms/amdgpu/4.3-59/build/ttm/ttm_resource.o
  CC [M]  /var/lib/dkms/amdgpu/4.3-59/build/ttm/ttm_pool.o
  CC [M]  /var/lib/dkms/amdgpu/4.3-59/build/amd/amdkcl/symbols.o
In file included from /var/lib/dkms/amdgpu/4.3-59/build/scheduler/backport/backport.h:5,
                 from <command-line>:
./include/generated/uapi/linux/version.h:6: warning: "DRM_VERSION_CODE" redefined
    6 | #define DRM_VERSION_CODE 330496

In file included from /var/lib/dkms/amdgpu/4.3-59/build/scheduler/backport/backport.h:5,
                 from <command-line>:
./include/generated/uapi/linux/version.h:6: warning: "DRM_VERSION_CODE" redefined
    6 | #define DRM_VERSION_CODE 330496

In file included from /var/lib/dkms/amdgpu/4.3-59/build/scheduler/backport/backport.h:5,
                 from <command-line>:
./include/generated/uapi/linux/version.h:6: warning: "DRM_VERSION_CODE" redefined
    6 | #define DRM_VERSION_CODE 330496

...

In file included from /var/lib/dkms/amdgpu/4.3-59/build/amd/backport/include/kcl/kcl_amdgpu_drm_fb_helper.h:35,
                 from /var/lib/dkms/amdgpu/4.3-59/build/amd/backport/backport.h:80,
                 from <command-line>:
/var/lib/dkms/amdgpu/4.3-59/build/amd/amdgpu/amdgpu.h:1231: note: this is the location of the previous definition
 1231 | #define REG_SET(FIELD, v) (((v) << FIELD##_SHIFT) & FIELD##_MASK)
In file included from /var/lib/dkms/amdgpu/4.3-59/build/amd/amdgpu/../display/dmub/src/dmub_dcn21.c:27:
/var/lib/dkms/amdgpu/4.3-59/build/amd/amdgpu/../display/dmub/src/dmub_reg.h:112: warning: "REG_GET" redefined
  112 | #define REG_GET(reg_name, field, val) \
In file included from /var/lib/dkms/amdgpu/4.3-59/build/amd/backport/include/kcl/kcl_amdgpu_drm_fb_helper.h:35,
                 from /var/lib/dkms/amdgpu/4.3-59/build/amd/backport/backport.h:80,
                 from <command-line>:
/var/lib/dkms/amdgpu/4.3-59/build/amd/amdgpu/amdgpu.h:1232: note: this is the location of the previous definition
 1232 | #define REG_GET(FIELD, v) (((v) << FIELD##_SHIFT) & FIELD##_MASK)

/var/lib/dkms/amdgpu/4.3-59/build/amd/amdgpu/../display/dc/core/dc_debug.c: In function ‘dc_status_to_str’:
/var/lib/dkms/amdgpu/4.3-59/build/amd/amdgpu/../display/dc/core/dc_debug.c:378:2: warning: enumeration value ‘DC_FAIL_DSC_VALIDATE’ not handled in switch [-Wswitch]
  378 |  switch (status) {
      |  ^~
/var/lib/dkms/amdgpu/4.3-59/build/amd/amdgpu/../display/dc/core/dc_debug.c:378:2: warning: enumeration value ‘DC_NO_DSC_RESOURCE’ not handled in switch [-Wswitch]
  LD [M]  /var/lib/dkms/amdgpu/4.3-59/build/amd/amdgpu/amdgpu.o
grep: arch/x86/boot/amd/dkms/config/config.h: No such file or directory
/var/lib/dkms/amdgpu/4.3-59/build/Makefile:16: *** dma_resv->seq is missing., exit....  Stop.
make[2]: *** [Makefile:1710: modules] Error 2
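For anyone reading a log like this, most of the output is compiler warnings, and the lines that actually stop the build are near the end. Here is a minimal Python sketch of one way to separate the two. The function name `triage_make_log` and both regular expressions are hypothetical, chosen only to fit the excerpt above; they are not a complete grammar of GCC or make output.

```python
import re

# Assumed patterns, derived only from the excerpt above.
WARNING_RE = re.compile(r"\bwarning:")
# make usually marks fatal lines with "***"; "Error N", "Stop." and a missing
# file are also treated as build-stopping here.
FATAL_RE = re.compile(r"\*\*\*|\bError \d+\b|\bStop\.|No such file or directory")


def triage_make_log(text):
    """Return (fatal_lines, warning_count) for the text of a make.log."""
    fatal, warnings = [], 0
    for line in text.splitlines():
        if WARNING_RE.search(line):
            warnings += 1
        elif FATAL_RE.search(line):
            fatal.append(line.strip())
    return fatal, warnings


# Sample lines adapted from the tail of the log in this thread.
sample = """\
/var/lib/dkms/amdgpu/4.3-59/build/amd/amdgpu/../display/dc/core/dc_debug.c:378:2: warning: enumeration value ‘DC_NO_DSC_RESOURCE’ not handled in switch [-Wswitch]
  LD [M]  /var/lib/dkms/amdgpu/4.3-59/build/amd/amdgpu/amdgpu.o
grep: arch/x86/boot/amd/dkms/config/config.h: No such file or directory
/var/lib/dkms/amdgpu/4.3-59/build/Makefile:16: dma_resv->seq is missing., exit....  Stop.
make[2]: [Makefile:1710: modules] Error 2
"""

fatal_lines, warning_count = triage_make_log(sample)
for line in fatal_lines:
    print(line)
```

Run against the full make.log, this would surface the `dma_resv->seq is missing` check in the DKMS Makefile as the point where the build stops, rather than the macro-redefinition warnings earlier in the log.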

Ubuntu 20.04.3 LTS (Focal Fossa)
Current kernel: 5.11.0-051100-generic
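As a side note, the `DRM_VERSION_CODE 330496` value in the warnings appears to match this kernel. Assuming it uses the same packing as the kernel's `KERNEL_VERSION(a, b, c)` macro (major shifted left by 16 bits, minor by 8), which is a guess based on the value coming from the generated `version.h`, a small Python sketch decodes it. The function names here are illustrative only.

```python
def kernel_version(major, minor, patch):
    """Pack a version the way the kernel's KERNEL_VERSION() macro does."""
    return (major << 16) + (minor << 8) + patch


def decode_version_code(code):
    """Unpack a KERNEL_VERSION-style integer into (major, minor, patch).

    Assumes each of minor and patch fits in 8 bits.
    """
    return (code >> 16, (code >> 8) & 0xFF, code & 0xFF)


print(decode_version_code(330496))  # (5, 11, 0)
print(kernel_version(5, 11, 0))     # 330496
```

Under that assumption, the build is being configured for 5.11.0, consistent with the `KERNELRELEASE=5.11.0` argument in the DKMS command line.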

make.log

zhol01825 commented 2 years ago

@rajbhar any chance you have some ideas?

rajbhar commented 2 years ago

This is a monolithic kernel branch. We are currently working on updating our APIs to recent kernel versions.

da-phil commented 2 years ago

@zhol01825 Have you tried kernel 5.11.0-42-generic on your system yet?

I've been successfully using this kernel version together with the amdgpu 21.40.1 driver (installed from .deb packages and then via sudo amdgpu-install --opencl=rocr --vulkan=pro).

malixian commented 2 years ago

@zhol01825 Have you resolved this problem? I also want to build the fxkamd/criu-wip branch but don't know the correct compile command.

rajbhar commented 2 years ago

All CRIU APIs are now available upstream. Feel free to use https://gitlab.freedesktop.org/agd5f/linux/-/commits/amd-staging-drm-next or any latest upstream kernel branch such as https://elixir.bootlin.com/linux/v5.18-rc4/source/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c#L2524
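A rough way to check whether a given kernel is at least as new as the tag linked above is to compare its release string against 5.18. This is a hedged Python sketch: the 5.18 threshold is an assumption taken only from the `v5.18-rc4` link, the helper names are made up for illustration, and a version comparison cannot tell whether a distribution kernel has backported or disabled the feature.

```python
import re

# Assumption: threshold taken from the v5.18-rc4 link in the comment above.
CRIU_UPSTREAM_MIN = (5, 18, 0)


def parse_release(release):
    """Turn a release string such as '5.11.0-051100-generic' into (5, 11, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part or 0) for part in match.groups())


def at_least_criu_upstream(release):
    """Heuristic only: True if the release is 5.18 or newer."""
    return parse_release(release) >= CRIU_UPSTREAM_MIN


print(at_least_criu_upstream("5.11.0-051100-generic"))  # False
print(at_least_criu_upstream("5.18.0-rc4"))             # True
```

On a running system, the string to pass in is what `platform.release()` (or `uname -r`) reports.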

malixian commented 2 years ago


So do I need to install ROCm 5.18 to use this?

malixian commented 2 years ago

Is there a compile option for building the kernel module? I want to use CRIU with rocm-4.3 (rock-4.3), if possible.

ppanchad-amd commented 1 month ago

@zhol01825 Apologies for the lack of response. Can you please test with the latest ROCm 6.2? If the issue is resolved, please close the ticket. Thanks!

rajbhar commented 1 month ago

CRIU support was upstreamed a while ago. Please refer to the upstream version of the driver available in Linux mainline or the amd-staging-drm-next branch. The easier way is to just use ROCm 6.2, as mentioned above.