intel / intel-data-center-gpu-driver-for-openshift

Intel Data Center GPU Drivers for Red Hat OpenShift Container Platform
https://catalog.redhat.com/software/containers/intel/intel-data-center-gpu-driver-container/6495ee55c8b2461e35fb8264
Apache License 2.0
7 stars 5 forks source link

docker: Copy all in-tree and out-of-tree modules to final image #47

Closed hershpa closed 1 year ago

hershpa commented 1 year ago

This is a short-term solution. Long-term solution: https://github.com/intel/intel-data-center-gpu-driver-for-openshift/issues/49 Signed-off-by: Hersh Pathak hersh.pathak@intel.com

hershpa commented 1 year ago

Analysis: The original implementation with install -D does not preserve the directory structure and instead copies all files under 1 directory. This PR preserves directory structure at the cost of increased image size. We could add RM command to remove files that are unnecessary following this COPY command.

An attempt was made to try using cp -r instead of install -D but the results showed the same behavior. One more idea was considered: make INSTALL_MOD_PATH=/build/modules INSTALL_MOD_STRIP=1 modules_install This however puts the .ko files directly in $(INSTALL_MOD_PATH)/lib/modules/$(KERNELRELEASE) INSTALL_MOD_STRIP could be a future optimization for stripping debug symbols from module.

Another thing to note is i915.ko dependencies include several .ko.xz files which are in-tree, meaning they are part of default kernel. The reason we have to copy them to /opt is because we run modprobe -d /opt and the .ko.xz files are only in /lib by default. Thus, the added complexity.

Either approach is reasonable in terms of keeping what we have or building on this PR. Since we are close to release, it makes sense to keep changes minimal and push optional optimization to a future release.

hershpa commented 1 year ago

@chaitanya1731, @mregmi, @uMartinXu

uMartinXu commented 1 year ago

Analysis: The original implementation with install -D does not preserve the directory structure and instead copies all files under 1 directory. This PR preserves directory structure at the cost of increased image size. We could add RM command to remove files that are unnecessary following this COPY command.

An attempt was made to try using cp -r instead of install -D but the results showed the same behavior. One more idea was considered: make INSTALL_MOD_PATH=/build/modules INSTALL_MOD_STRIP=1 modules_install This however puts the .ko files directly in (INSTALLMODPATH)/lib/modules/(KERNELRELEASE) INSTALL_MOD_STRIP could be a future optimization for stripping debug symbols from module.

Another thing to note is i915.ko dependencies include several .ko.xz files which are in-tree, meaning they are part of default kernel. The reason we have to copy them to /opt is because we run modprobe -d /opt and the .ko.xz files are only in /lib by default. Thus, the added complexity.

Either approach is reasonable in terms of keeping what we have or building on this PR. Since we are close to release, it makes sense to keep changes minimal and push optional optimization to a future release. looks like it might exists a potential gap. if some OOT Driver module depends on multiple intree modules, the modules.dep command should be created in the intree driver modules directory on the host , so that can make sure when modprobe the OOT driver module, the proper dependent intree modules on the host can be insmod. Looks like using current solutions, people has to copy all the dependent modules into opt/lib directly and then run depmod, even the same intree module files are already on the host.

hershpa commented 1 year ago

Updated PR commit and description.

uMartinXu commented 1 year ago

This patch looks good to me.