NVIDIA / gds-nvidia-fs

NVIDIA GPUDirect Storage Driver

nvidia-fs module failing to compile for Linux kernel 6.1 and 6.2 #13

Open ned2 opened 1 year ago

ned2 commented 1 year ago

When trying to install either kernel 6.1 or 6.2, the nvidia-fs module fails to build. I discovered this while upgrading from Ubuntu 22.10 to 23.04.

DKMS make.log for nvidia-fs-2.15.1 for kernel 6.2.12-060212-generic (x86_64)
Sun 23 Apr 2023 13:52:52 AEST
Picking NVIDIA driver sources from NVIDIA_SRC_DIR=/usr/src/nvidia-530.30.02/nvidia. If that does not meet your expectation, you might have a stale driver still around and that might cause problems.
Getting symbol versions from /lib/modules/5.19.0-40-generic/updates/dkms/nvidia.ko ...
Created: /var/lib/dkms/nvidia-fs/2.15.1/build/nv.symvers
checking if uaccess.h access_ok has 3 parameters... no
checking if uaccess.h access_ok has 2 parameters... yes
Checking if blkdev.h has blk_rq_payload_bytes... yes
Checking if fs.h has call_read_iter and call_write_iter... yes
Checking if fs.h has filemap_range_has_page... no
Checking if kiocb structue has ki_complete field... yes
Checking if vm_fault_t exist in mm_types.h... yes
Checking if enum PCIE_SPEED_32_0GT exists in pci.h... yes
Checking if atomic64_t counter is of type long... no
Checking if RQF_COPY_USER is present or not... no
Checking if dma_drain_size and dma_drain_needed are present in struct request_queue... no
Checking if struct proc_ops is present or not ... yes
Checking if split is present in vm_operations_struct or not ... no
Checking if mremap in vm_operations_struct has one parameter... yes
Checking if mremap in vm_operations_struct has two parameters... no
Checking if symbol module_mutex is present... no
Checking if blk-integrity.h is present... yes
Checking if KI_COMPLETE has 3 parameters ... no
Checking if pin_user_pages_fast symbol is present in kernel or not ... yes
make[1]: warning: -j4 forced in submake: resetting jobserver mode.
make[1]: Entering directory '/usr/src/linux-headers-6.2.12-060212-generic'
warning: the compiler differs from the one used to build the kernel
  The kernel was built by: x86_64-linux-gnu-gcc-12 (Ubuntu 12.2.0-17ubuntu1) 12.2.0
  You are using:           gcc-12 (Ubuntu 12.2.0-17ubuntu1) 12.2.0
  CC [M]  /var/lib/dkms/nvidia-fs/2.15.1/build/nvfs-core.o
  CC [M]  /var/lib/dkms/nvidia-fs/2.15.1/build/nvfs-dma.o
  CC [M]  /var/lib/dkms/nvidia-fs/2.15.1/build/nvfs-mmap.o
  CC [M]  /var/lib/dkms/nvidia-fs/2.15.1/build/nvfs-pci.o
/var/lib/dkms/nvidia-fs/2.15.1/build/nvfs-mmap.c: In function ‘nvfs_mgroup_mmap_internal’:
/var/lib/dkms/nvidia-fs/2.15.1/build/nvfs-mmap.c:657:67: error: implicit declaration of function ‘prandom_u32’; did you mean ‘get_random_u32’? [-Werror=implicit-function-declaration]
  657 |                 base_index = NVFS_MIN_BASE_INDEX + (unsigned long)prandom_u32();
      |                                                                   ^~~~~~~~~~~
      |                                                                   get_random_u32
/var/lib/dkms/nvidia-fs/2.15.1/build/nvfs-core.c: In function ‘nvfs_init’:
/var/lib/dkms/nvidia-fs/2.15.1/build/nvfs-core.c:2413:29: error: assignment to ‘char * (*)(const struct device *, umode_t *)’ {aka ‘char * (*)(const struct device *, short unsigned int *)’} from incompatible pointer type ‘char * (*)(struct device *, umode_t *)’ {aka ‘char * (*)(struct device *, short unsigned int *)’} [-Werror=incompatible-pointer-types]
 2413 |         nvfs_class->devnode = nvfs_devnode;
      |                             ^
  CC [M]  /var/lib/dkms/nvidia-fs/2.15.1/build/nvfs-proc.o
  CC [M]  /var/lib/dkms/nvidia-fs/2.15.1/build/nvfs-mod.o
cc1: some warnings being treated as errors
make[2]: *** [scripts/Makefile.build:252: /var/lib/dkms/nvidia-fs/2.15.1/build/nvfs-mmap.o] Error 1
make[2]: *** Waiting for unfinished jobs....
cc1: some warnings being treated as errors
make[2]: *** [scripts/Makefile.build:252: /var/lib/dkms/nvidia-fs/2.15.1/build/nvfs-core.o] Error 1
make[1]: *** [Makefile:2027: /var/lib/dkms/nvidia-fs/2.15.1/build] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-6.2.12-060212-generic'
make: *** [Makefile:107: module] Error 2

(I couldn't manage to get the build to use x86_64-linux-gnu-gcc-12 by setting the CC env var, but I get the impression that that's not the underlying issue?) Edit: I managed to set the compiler to x86_64-linux-gnu-gcc-12 and am still seeing this issue.

ned2 commented 1 year ago

I managed to get the module to compile by making some changes that seem to have made the compiler happy. Caveat: I'm not a C or kernel developer of any stripe, so I just made some guesses about what to change.

  1. renamed prandom_u32() to get_random_u32() in nvfs-mmap.c
  2. changed the function signature *nvfs_devnode(struct device *dev, umode_t *mode) to *nvfs_devnode(const struct device *dev, umode_t *mode) in nvfs-core.c
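
For anyone who needs the same source to build on older kernels too, here is a rough sketch of how the two changes above could be wrapped in version guards instead of edited in place. It is not the upstream fix. The nvfs_random_u32 and NVFS_DEVNODE_DEV names and the body of nvfs_devnode are made up for illustration, and the version cut-offs (6.1 for prandom_u32, 6.2 for the const devnode argument) are my assumption based on where the build started failing. The userspace branch exists only so the snippet can be compiled outside a kernel tree.

```c
/*
 * Sketch only: version-guarded variants of the two edits, so one source
 * tree can build on kernels before and after the API changes.
 * nvfs_random_u32, NVFS_DEVNODE_DEV and the nvfs_devnode body are hypothetical.
 */
#ifdef __KERNEL__
#include <linux/version.h>
#include <linux/random.h>
#include <linux/device.h>
#else
/* Userspace stand-ins so the sketch compiles outside a kernel tree. */
#include <stdlib.h>
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))
#define LINUX_VERSION_CODE KERNEL_VERSION(6, 2, 12)
typedef unsigned short umode_t;
struct device { int placeholder; };
static unsigned int get_random_u32(void) { return (unsigned int)rand(); }
#endif

/* Assumption: prandom_u32() is unavailable from 6.1 onward. */
#if LINUX_VERSION_CODE >= KERNEL_VERSION(6, 1, 0)
#define nvfs_random_u32() get_random_u32()
#else
#define nvfs_random_u32() prandom_u32()
#endif

/* Assumption: class->devnode takes a const struct device * from 6.2 onward. */
#if LINUX_VERSION_CODE >= KERNEL_VERSION(6, 2, 0)
#define NVFS_DEVNODE_DEV const struct device
#else
#define NVFS_DEVNODE_DEV struct device
#endif

/* Placeholder body: sets a permissive mode and lets the core pick the name. */
static char *nvfs_devnode(NVFS_DEVNODE_DEV *dev, umode_t *mode)
{
	(void)dev;
	if (mode)
		*mode = 0666;
	return NULL;
}
```

With something like this in place, line 657 of nvfs-mmap.c would call nvfs_random_u32() instead of prandom_u32(), and the assignment nvfs_class->devnode = nvfs_devnode in nvfs-core.c would stay unchanged. Plain version checks can misfire on distro kernels with backported changes, so a header probe like the "Checking if ..." steps in the build log may be the more robust route.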