lmstudio-ai / venvstacks

Virtual environment stacks for Python
http://venvstacks.lmstudio.ai/
MIT License
174 stars 6 forks

Edit distribution RECORD files instead of deleting them #28

Open ncoghlan opened 1 month ago

ncoghlan commented 1 month ago

Deleting distribution RECORD files entirely was the simplest way to resolve the issues they can cause with archive reproducibility. RECORD files list a hash and size for each installed file, so when they reference script files whose shebang lines are rewritten at installation time, their contents implicitly depend on the absolute path to the build environment.
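To make the path dependency concrete, here is a minimal sketch of how a RECORD row is derived from a file's contents (path, urlsafe base64 SHA-256 digest without padding, size in bytes). The `record_row` helper and both build paths are hypothetical and only for illustration; the helper also skips the CSV quoting a real RECORD writer would need for paths containing commas.

```python
import base64
import hashlib


def record_row(path: str, content: bytes) -> str:
    """Format a simplified RECORD row: path, sha256 digest, size in bytes."""
    digest = hashlib.sha256(content).digest()
    # RECORD uses urlsafe base64 with the trailing "=" padding removed
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{path},sha256={encoded},{len(content)}"


body = b"import sys\nsys.exit(0)\n"

# Hypothetical build locations: the two scripts differ only in the shebang line
row_a = record_row("../../bin/tool", b"#!/build/a/venv/bin/python\n" + body)
row_b = record_row("../../bin/tool", b"#!/home/ci/job/venv/bin/python\n" + body)

print(row_a)
print(row_b)
```

The two rows differ in both hash and size, even though the script body is identical, which is why a RECORD file built in one location does not match the same build performed in another.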

An improved approach would be to delete only the RECORD lines corresponding to the deleted files, leaving the rest of each RECORD file intact. Deleting the entire RECORD file, as the current approach does, means that features like importlib.metadata.packages_distributions won't work in deployed environments.

To implement this approach, importlib.metadata can be used to get a complete list of every file belonging to every distribution in an environment, and the csv module can be used to edit the RECORD files. Alternatively, for build time usage, the installer project offers a higher level interface for handling RECORD file updates (see https://installer.pypa.io/en/latest/api/utils/#installer.utils.construct_record_file and https://installer.pypa.io/en/latest/api/records/).
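A rough sketch of the csv-based editing step, under some simplifying assumptions: the function name `prune_record_files` is hypothetical, RECORD files are found by globbing for `*.dist-info/RECORD` rather than going through importlib.metadata, and "deleted files" is taken to mean any RECORD entry whose target no longer exists on disk. It is not how venvstacks currently works, just one way the idea could look.

```python
import csv
from pathlib import Path


def prune_record_files(site_packages: Path) -> dict[str, int]:
    """Remove RECORD rows for files that no longer exist.

    Returns the number of rows dropped, keyed by dist-info directory name.
    """
    dropped = {}
    for record_path in sorted(site_packages.glob("*.dist-info/RECORD")):
        with record_path.open(newline="", encoding="utf-8") as f:
            rows = [row for row in csv.reader(f) if row]
        # RECORD paths are relative to the directory holding the dist-info dir
        kept = [row for row in rows if (site_packages / row[0]).exists()]
        if len(kept) != len(rows):
            with record_path.open("w", newline="", encoding="utf-8") as f:
                csv.writer(f, lineterminator="\n").writerows(kept)
            dropped[record_path.parent.name] = len(rows) - len(kept)
    return dropped
```

Calling `prune_record_files(Path(".../site-packages"))` after the offending files have been removed leaves the remaining rows untouched, so `importlib.metadata.distributions(path=[...])` can still report each distribution's files.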