pdm-project / pdm

A modern Python package and dependency manager supporting the latest PEP standards
https://pdm-project.org
MIT License

pdm add / install - Update pdm.lock file incorrectly #2696

Open lancetipton opened 8 months ago

lancetipton commented 8 months ago


Steps to reproduce

  1. Create a new folder, and run pdm init
  2. Update the pyproject.toml file to look like this

[project]
name = "pdm-bug"
version = "0.1.0"
description = "Default template for PDM package"
dependencies = []
requires-python = "==3.11.*"
readme = "README.md"
license = {text = "MIT"}

# This is the important part

[[tool.pdm.source]]
type = "find_links"
url = "https://download.pytorch.org/whl/cpu/torch_stable.html"
name = "torch"

[tool.pdm.dev-dependencies]
pipeline = [
    "torch<3.0.0,>=2.2.1",
]

# End of important part

[tool.pdm]
distribution = false


<br/>

3. Run command `pdm install`
> The odd thing here is that it installs packages from the group `pipeline` even though the group has not been added.
> I would have thought I needed to add the group. Maybe this happens because a `pdm.lock` file does not exist yet?
* The **CPU** version of torch is installed **without** the **nvidia / cuda** dependencies
* The torch package in the `pdm.lock` file looks like:

![image](https://github.com/pdm-project/pdm/assets/8345382/a1575280-0ed5-46e1-938d-54499b3ef79f)

<br/>

4. Run command `pdm add sentence-transformers -dG pipeline`
* The `pdm.lock` file is updated, but the `torch` package entry now includes the **nvidia / cuda** dependencies
* It doesn't have to be the `sentence-transformers` package; adding any package triggers the same change

![image](https://github.com/pdm-project/pdm/assets/8345382/7e3345bd-27d0-4376-a126-dee20b5b393f)

## Actual behavior

* The **nvidia / cuda** dependencies are added to the `torch` package even though it was originally installed without them

## Expected behavior

* The **nvidia / cuda** dependencies are **NOT** added to the `torch` package when it was originally installed without them

<br/>

## Environment Information

```bash
# Paste the output of `pdm info && pdm info --env` below:
PDM version:
  2.12.4
Python Interpreter:
  /Users/LanceTipton/repos/pdm-bug/.venv/bin/python (3.11)
Project Root:
  /Users/LanceTipton/repos/pdm-bug
Local Packages:

{
  "implementation_name": "cpython",
  "implementation_version": "3.11.3",
  "os_name": "posix",
  "platform_machine": "arm64",
  "platform_release": "22.3.0",
  "platform_system": "Darwin",
  "platform_version": "Darwin Kernel Version 22.3.0: Mon Jan 30 20:39:46 PST 2023; root:xnu-8792.81.3~2/RELEASE_ARM64_T6020",
  "python_full_version": "3.11.3",
  "platform_python_implementation": "CPython",
  "python_version": "3.11",
  "sys_platform": "darwin"
}
```

mpaluch92 commented 6 months ago

This is exactly what frustrates me as well 🫠 @lancetipton Could you find any workaround until this issue gets fixed?

lancetipton commented 6 months ago

@mpaluch92 - I did come up with a script as a workaround that handles it for my use case. It's not perfect, but it does keep the unwanted dependencies out of the pdm.lock file.

Downside is the unwanted dependencies are still installed any time you add a new package

Upside is they are not saved to the pdm.lock file, so they are not installed in other locations like my CI environment

Here's the script if you want it. You'll have to customize it a bit for your needs, but hopefully it's straightforward enough:


# tomlkit is optional: remember whether it is missing so main() can warn and skip
try:
  import tomlkit
  noTomlKit = False
except ImportError:
  noTomlKit = True

from pathlib import Path

dirPath = Path(__file__).parent
rootDir = str((dirPath / '../').resolve())
lockLoc = (rootDir + '/pdm.lock')

config = {  
  'lock': {
    'packageKey': 'package',
  },

  'remove': [
    "nvidia-cublas-cu12",
    "nvidia-cuda-cupti-cu12",
    "nvidia-cuda-nvrtc-cu12",
    "nvidia-cuda-runtime-cu12",
    "nvidia-cudnn-cu12",
    "nvidia-cufft-cu12",
    "nvidia-curand-cu12",
    "nvidia-cusolver-cu12",
    "nvidia-cusparse-cu12",
    "nvidia-nccl-cu12",
    "nvidia-nvjitlink-cu12",
    "nvidia-nvtx-cu12",
    "triton",
  ],
  'deps': [
    {
      'name': 'torch',
      'remove': [
        "nvidia-cublas-cu12",
        "nvidia-cuda-cupti-cu12",
        "nvidia-cuda-nvrtc-cu12",
        "nvidia-cuda-runtime-cu12",
        "nvidia-cudnn-cu12",
        "nvidia-cufft-cu12",
        "nvidia-curand-cu12",
        "nvidia-cusolver-cu12",
        "nvidia-cusparse-cu12",
        "nvidia-nccl-cu12",
        "nvidia-nvjitlink-cu12",
        "nvidia-nvtx-cu12",
        "triton",
      ]
    }
  ]
}

def loadLockFile():
  """Loads the PDM lockfile to be updated"""

  content = Path(lockLoc).read_text()
  return tomlkit.loads(content)

def saveLockFile(lockFile):
  """Save the update lockfile"""
  updated = tomlkit.dumps(lockFile)
  Path(lockLoc).write_text(updated)

def removePackages(lockFile={}, config={}):
  """Remove top level packages from the lockfile"""

  packages = lockFile.get('package')

  remove = config.get('remove')
  updated = []
  for package in packages:
    if package.get('name') not in remove:
      updated.append(package)

  lockFile['package'] = updated

  return lockFile

def removeDeps(lockFile={}, config={}):
  """Remove child dependencies from packages in the lockfile"""

  packages = lockFile.get('package')

  # Map each configured package name to the dependency prefixes to strip
  removeByName = {}
  for cdep in config.get('deps'):
    removeByName[cdep.get('name')] = tuple(cdep.get('remove'))

  # Visit every package exactly once, so several 'deps' entries
  # can not duplicate packages in the output
  for package in packages:
    remove = removeByName.get(package.get('name'))
    deps = package.get('dependencies')
    # Skip packages that are not configured or have no dependencies
    if not remove or not deps:
      continue

    # Keep only the dependencies that do not start with a removed name
    package['dependencies'] = [dep for dep in deps if not dep.startswith(remove)]

  return lockFile

def main(config={}):
  if noTomlKit:
    return print('[Warning] Can not fix lockfile, missing required tomlkit dependency')

  lockFile = loadLockFile()
  updatedLock = removePackages(lockFile=lockFile, config=config)
  finalLock = removeDeps(lockFile=updatedLock, config=config)
  saveLockFile(lockFile=finalLock)

main(config=config)

Then update your pyproject.toml file to include this in the [tool.pdm.scripts] section:

# ... rest of config file

[tool.pdm.scripts]
# ... other scripts
post_add = {cmd = "python path/to/fixLock.py"}
post_lock = {cmd = "python path/to/fixLock.py"}
post_install = {cmd = "python path/to/fixLock.py"}

# ... rest of config file

Note: I added the tomlkit check as a warning because I install different packages depending on the environment where it's running. In some cases (Docker), I need to run pdm install --frozen-lockfile, but I don't want to include unused packages in the image, in this case tomlkit. So the script just prints a warning and exits with a 0 status.
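
The same "warn and exit with 0" idea can also be written without a try/except around the import. The sketch below is one possible shape, not the script above: `has_module` and `run_if_available` are hypothetical helper names, and it only uses `importlib.util.find_spec` from the standard library.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(name) is not None


def run_if_available(module_name, action):
    """Run `action` only when `module_name` is installed.

    Returns 0 in both cases, so a hook that uses the result as its exit
    status does not fail when the optional dependency is missing.
    """
    if not has_module(module_name):
        print(f"[Warning] Skipping lockfile fix, missing optional dependency: {module_name}")
        return 0
    action()
    return 0
```

A script could then end with something like `sys.exit(run_if_available("tomlkit", main))`, assuming `main` takes no required arguments.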

mpaluch92 commented 6 months ago

@lancetipton Awesome! Thank you very much ❤️