The "ELF patching" or rewriting of the .dynamic and .interp sections in shared libraries and executable files is a common step in preparing software for deployment in target environments. The records in these sections control the assumptions the program makes during program loading about the environment it's executed in. The .interp section tells a dynamically linked program where to find ld.so, the "dynamic linker" responsible for program loading. The DT_NEEDED and DT_RUNPATH entries in the .dynamic section give linker the cues about which dynamic libraries provide the required dependencies, and where in the filesystem they could be found.
For example, when one builds a native program using CMake, the built program stores DT_RUNPATH records with the locations of all of its dependencies in the development environment. This way, the developer may easily run local tests. However, when the program is installed, CMake will strip[^1] these records and re-link the program, because the locations of the dependencies may be different on the user's machine.
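The effect of that install-time rewrite can be illustrated after the fact; the following is a hedged sketch (all paths are hypothetical, and CMake itself performs the equivalent step during installation) of rewriting a build-tree RUNPATH with patchelf:

```python
# Sketch: emulate the install-time RUNPATH rewrite with patchelf.
# All paths are hypothetical; CMake performs the equivalent step itself during installation.
import subprocess

binary = "build/bin/app"            # hypothetical build-tree binary
install_runpath = "/opt/myapp/lib"  # hypothetical location of the installed dependencies

# Replace the development-tree RUNPATH with one valid on the target machine...
subprocess.run(["patchelf", "--set-rpath", install_runpath, binary], check=True)

# ...or drop it entirely and rely on the system's default search paths:
# subprocess.run(["patchelf", "--remove-rpath", binary], check=True)
```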
The Conda and Spack package managers make extensive use[^2] of the "dynamic string token" $ORIGIN in DT_RUNPATH entries in order to create relocatable binaries: executables and shared libraries that can be moved to and loaded from an arbitrary root in the file system, as long as the relative paths between the distributed files are preserved.
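As an illustration, a minimal sketch of making such a bundle relocatable (the bin/ and lib/ layout is a hypothetical assumption; patchelf is assumed to be available):

```python
# Sketch: make a hypothetical bundle relocatable with $ORIGIN-relative RUNPATH entries.
# Assumed layout: <bundle>/bin/ holds executables, <bundle>/lib/ holds shared libraries.
import subprocess
from pathlib import Path

bundle = Path("bundle")  # hypothetical bundle root

for exe in (bundle / "bin").iterdir():
    # Executables look for libraries one level up, relative to their own location.
    subprocess.run(["patchelf", "--set-rpath", "$ORIGIN/../lib", exe], check=True)

for lib in (bundle / "lib").glob("*.so*"):
    # Libraries look for their siblings next to themselves.
    subprocess.run(["patchelf", "--set-rpath", "$ORIGIN", lib], check=True)
```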
Nix and GNU Guix make the loading of dynamically linked programs deterministic and provably correct by linking them directly to the concrete revisions of the dependencies they were built to run with. This is done by recording in the dynamic structure the absolute paths of the dependencies, which are always deployed in unique locations pre-computed in a deterministic fashion.
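A hedged sketch of that approach (the store paths below are made-up placeholders; real Nix and Guix builds compute them deterministically at build time):

```python
# Sketch: pin a binary to concrete dependency revisions via absolute paths, Nix/Guix style.
# The store paths are hypothetical placeholders, not real store hashes.
import subprocess

binary = "result/bin/app"  # hypothetical output binary
interpreter = "/nix/store/<hash>-glibc-2.38/lib/ld-linux-x86-64.so.2"
runpath = ":".join([
    "/nix/store/<hash>-glibc-2.38/lib",
    "/nix/store/<hash>-zlib-1.3/lib",
])

subprocess.run(["patchelf", "--set-interpreter", interpreter, binary], check=True)
subprocess.run(["patchelf", "--set-rpath", runpath, binary], check=True)
```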
ELF patching is used in more ad hoc ways too. For example, the official PyTorch wheels often package[^4] vendored copies of certain native libraries, including but not limited to libgomp.so or the CUDA and cuDNN shared libraries. In order to avoid conflicts with other versions of the same libraries, their path names are augmented with unique suffixes, and all of the DT_NEEDED attributes are updated to reflect this.
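A minimal sketch of that renaming scheme (the file names, suffix, and directory below are hypothetical):

```python
# Sketch: vendor a library under a unique name and update dependents' DT_NEEDED entries.
# File names, the suffix, and the package directory are hypothetical illustrations.
import shutil
import subprocess

old_name = "libgomp.so.1"
new_name = "libgomp-vendored-1a2b3c.so.1"  # hypothetical unique suffix

# Ship a renamed copy inside the package and fix its SONAME to match...
shutil.copy(f"/usr/lib/{old_name}", f"package_dir/{new_name}")
subprocess.run(["patchelf", "--set-soname", new_name, f"package_dir/{new_name}"], check=True)

# ...then point every dependent library at the renamed copy.
for dependent in ["package_dir/libtorch_cpu.so"]:  # hypothetical dependent
    subprocess.run(["patchelf", "--replace-needed", old_name, new_name, dependent], check=True)
```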
Distributing software linked against CUDA
The CUDA and cuDNN End-User License Agreements (EULAs) specifically allow[^5][^6] redistributing certain NVidia-owned files together with the Licensees' applications, as required for deployment on the target hosts. The licenses also explicitly grant application developers the right to redistribute files whose path names reflect versioning and host architecture information, which are required for the dynamic linker to choose the right binary.
The EULAs do not mention the possibility of updating the dynamic structures (the .dynamic and .interp sections) of the NVidia-owned binaries in order to prepare them for execution on the users' machines. Thus the default assumption is that such updates may constitute a "modification", which would prohibit redistribution of the finalized artifacts. For this reason, NVidia had to grant[^7] Anaconda and conda-forge an exclusive permission to patchelf the toolkit files before CUDAToolkit and cuDNN could be published in the conda repositories, and before PyTorch with CUDA support could be made available to users.
This is also the underlying reason that distributions such as Nixpkgs and Spack currently choose not to provide a public binary cache for software that links to CUDA, cuDNN, or components of the NVidia HPC SDK. This means that consumers interested in the unique correctness guarantees or the security and supply chain inspectability properties these systems provide have to invest in their own build infrastructure capable of handling builds that are at times prohibitively heavy, such as when building PyTorch and TensorFlow for use with CUDA devices.
Proposal
NVidia could enable all of these new applications of the CUDA and cuDNN libraries by integrating the patchelf exception, tested and recognized by NVidia and Anaconda over the years, into the respective EULA texts. This means updating the licenses to explicitly permit modifying the dynamic linker cues in the ELF dynamic structures, namely the .interp section and the DT_NEEDED, DT_RPATH, and DT_RUNPATH entries in the .dynamic section, for the purpose of communicating assumptions about the target environment.
FAQ
Please do suggest how to update this text!
Could such an exception be abused to bypass NVidia's software restrictions, e.g. to gain access to datacenter GPUs' features on consumer-grade devices?
Not in any meaningful way. This issue is specifically concerned with the cudatoolkit and cuDNN libraries, not with the libcuda.so userspace driver. Additionally, such an exception would not allow touching the .text sections, where the actual executable code resides.
My project or company is also affected!
Please consider issuing a public statement and linking it in the comments. Also consider reaching out to nvidia-compute-license-questions@nvidia.com, as suggested by the CUDA EULA, and linking this issue and your statement.
Edit history
2023-10-30: Clarified that the libcuda.so driver is out of the scope for this issue
Links
CC https://github.com/NVIDIA/build-system-archive-import-examples/issues/3 https://github.com/NVIDIA/build-system-archive-import-examples/issues/5 https://github.com/NixOS/nixpkgs/pull/76233
[^6]: "2. Distribution" in https://docs.nvidia.com/deeplearning/cudnn/sla/index.html#supplement