amd / xdna-driver


Building release failing on newer kernels "dkms autoinstall failed for amdgpu(10)" #289

Open lalilaloe opened 3 weeks ago

lalilaloe commented 3 weeks ago

I keep running into these dkms errors when trying to build the XDNA driver release on the suggested kernels (6.10 and above).

Cleaning build area...(bad exit status: 2)
. /tmp/amd.PnxhEzM5/.env && make -j12 KERNELRELEASE=6.10.10-061010-generic TTM_NAME=amdttm SCHED_NAME=amd-sched -C /lib/modules/6.10.10-061010-generic/build M=/tmp/amd.PnxhEzM5...(bad exit status: 2)
ERROR (dkms apport): kernel package linux-headers-6.10.10-061010-generic is not supported
Error! Bad return status for module build on kernel: 6.10.10-061010-generic (x86_64)
Consult /var/lib/dkms/amdgpu/6.8.5-2041575.22.04/build/make.log for more information.

Trying to install the correct linux-headers for the kernel leads to different errors from the dkms amdgpu build, for example:

#tail /var/lib/dkms/amdgpu/6.8.5-2041575.22.04/build/make.log
make[3]: *** Waiting for unfinished jobs....
  LD [M]  /tmp/amd.Axsd1Ocq/ttm/amdttm.o
/tmp/amd.Axsd1Ocq/amd/amdkcl/kcl_device_cgroup.c:29:6: warning: no previous prototype for ‘amdkcl_dev_cgroup_init’ [-Wmissing-prototypes]
   29 | void amdkcl_dev_cgroup_init(void)
      |      ^~~~~~~~~~~~~~~~~~~~~~
make[2]: *** [scripts/Makefile.build:485: /tmp/amd.Axsd1Ocq/amd/amdgpu] Error 2
make[2]: *** [scripts/Makefile.build:485: /tmp/amd.Axsd1Ocq/amd/amdkcl] Error 2
make[1]: *** [/usr/src/linux-headers-6.11.0-061100-generic/Makefile:1932: /tmp/amd.Axsd1Ocq] Error 2
make: *** [Makefile:224: __sub-make] Error 2
make: Leaving directory '/usr/src/linux-headers-6.11.0-061100-generic'
#tail /var/lib/dkms/amdgpu/6.8.5-2041575.22.04/build/make.log
./include/trace/stages/stage6_event_callback.h:34:9: note: macro "__assign_str" defined here
   34 | #define __assign_str(dst)                                               \
      |         ^~~~~~~~~~~~
  CC [M]  /tmp/amd.THCBoFpG/amd/amdgpu/amdgpu_sync.o
make[3]: *** [scripts/Makefile.build:244: /tmp/amd.THCBoFpG/amd/amdgpu/amdgpu_trace_points.o] Error 1
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [scripts/Makefile.build:485: /tmp/amd.THCBoFpG/amd/amdgpu] Error 2
make[1]: *** [/usr/src/linux-headers-6.10.10-061010-generic/Makefile:1940: /tmp/amd.THCBoFpG] Error 2
make: *** [Makefile:240: __sub-make] Error 2
make: Leaving directory '/usr/src/linux-headers-6.10.10-061010-generic'
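
`tail` shows only the end of `make.log`, and with a parallel build (`make -j12`) the first real failure is often further up. A minimal sketch for pulling the first compiler errors out of the log instead, assuming GCC-style diagnostics that contain `error:`. `first_errors` is a hypothetical helper name; the log path is the one dkms reported above.

```shell
# first_errors LOGFILE [COUNT]
# Print the first COUNT (default 5) lines of LOGFILE that contain a
# compiler "error:" diagnostic, prefixed with their line numbers.
first_errors() {
  grep -n 'error:' "$1" | head -n "${2:-5}"
}

# Path taken from the dkms message above; skipped if the log is not readable.
log=/var/lib/dkms/amdgpu/6.8.5-2041575.22.04/build/make.log
if [ -r "$log" ]; then
  first_errors "$log"
fi
```

The reported line numbers can then be used to read the surrounding context in the log.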

I've tried the following kernels:

6.10.10-061010-generic: failed
6.10.5-061005-generic: failed
6.11.0-061100-generic: failed
6.8.0-47-generic: succeeded

I did, however, manage to build and install on the 6.8.0-47-generic kernel:

#dkms status
amdgpu/6.8.5-2041575.22.04, 5.15.0-124-generic, x86_64: installed
amdgpu/6.8.5-2041575.22.04, 6.8.0-060800-generic, x86_64: installed
amdgpu/6.8.5-2041575.22.04, 6.8.0-47-generic, x86_64: installed
xrt-amdxdna/2.18.0, 6.10.10-061010-generic, x86_64: installed
xrt-amdxdna/2.18.0, 6.10.5-061005-generic, x86_64: installed
xrt-amdxdna/2.18.0, 6.11.0-061100-generic, x86_64: installed
xrt-amdxdna/2.18.0, 6.8.0-47-generic, x86_64: installed

So it seems it got installed for all the available kernels when built on 6.8. After switching back to 6.11, it showed TEST PASSED!. Why is the build failing on the newer kernels?
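
Regrouping the `dkms status` output by kernel makes it easier to see which module is built where. A minimal sketch, assuming the `module/version, kernel, arch: status` line layout shown above (other dkms versions may print a different layout). `dkms_status_sample` and `modules_per_kernel` are hypothetical helper names, and the sample data is copied from the listing above.

```shell
# Sample input: the `dkms status` lines quoted in the listing above.
dkms_status_sample() {
cat <<'EOF'
amdgpu/6.8.5-2041575.22.04, 5.15.0-124-generic, x86_64: installed
amdgpu/6.8.5-2041575.22.04, 6.8.0-060800-generic, x86_64: installed
amdgpu/6.8.5-2041575.22.04, 6.8.0-47-generic, x86_64: installed
xrt-amdxdna/2.18.0, 6.10.10-061010-generic, x86_64: installed
xrt-amdxdna/2.18.0, 6.10.5-061005-generic, x86_64: installed
xrt-amdxdna/2.18.0, 6.11.0-061100-generic, x86_64: installed
xrt-amdxdna/2.18.0, 6.8.0-47-generic, x86_64: installed
EOF
}

# Read dkms status lines on stdin and print one line per kernel listing
# the modules reported as installed for it.
modules_per_kernel() {
  awk -F', ' '
    /: installed/ {
      split($1, mod, "/")              # mod[1] is the module name
      seen[$2] = seen[$2] " " mod[1]   # $2 is the kernel release
    }
    END { for (k in seen) print k ":" seen[k] }
  ' | sort
}

dkms_status_sample | modules_per_kernel
```

For the listing above this groups `xrt-amdxdna` under all four tested kernels and `amdgpu` only under the 5.15 and 6.8 kernels. On a live system the same function can be fed with `dkms status | modules_per_kernel`.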

maxzhen commented 3 weeks ago

I don't think your issue is related to the AMD XDNA driver. It seems you are having issues compiling the amdgpu driver?

lalilaloe commented 3 weeks ago

It occurs when running

# Start XDNA driver release build
./build.sh -release

maxzhen commented 3 weeks ago

Please share the full log from "./build.sh -release". I cannot imagine how building the XDNA driver can trigger a compilation failure in the amdgpu driver.
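
One way to capture such a log is to send both stdout and stderr through `tee`. A minimal sketch: `run_logged` and the log file name `build-release.log` are hypothetical, and only `./build.sh -release` comes from this thread.

```shell
# run_logged LOGFILE CMD [ARGS...]
# Run CMD, showing its output on the terminal while saving stdout and
# stderr together in LOGFILE.
run_logged() {
  logfile=$1
  shift
  "$@" 2>&1 | tee "$logfile"
}

# Run from the directory that contains build.sh; skipped otherwise.
if [ -x ./build.sh ]; then
  run_logged build-release.log ./build.sh -release
fi
```

The resulting file can be attached to the issue as is.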

lalilaloe commented 3 weeks ago

It happens during dkms autoinstall; I'll try to retrieve the logs. The build script now seems to work on the 6.11 kernel after the previous build.