allegroai / clearml-agent

ClearML Agent - ML-Ops made easy. ML-Ops scheduler & orchestration solution
https://clear.ml/docs/
Apache License 2.0

Feature request: support for PDM package manager #191

Open OldaKodym opened 6 months ago

OldaKodym commented 6 months ago

PDM is a modern package manager that works similarly to poetry, but in my experience it's way faster for larger distributions like PyTorch, and it also supports additional nifty stuff like centralized package caches.

Beyond the amazing stuff you guys are already putting out there, are you considering adding other package managers that are gaining a significant user base?

ainoam commented 6 months ago

Thanks for suggesting @OldaKodym.

We're always keeping an eye on the ML development ecosystem to help ClearML users streamline their work with their favorite tools.

We'll pin this to our roadmap. Having this here will help push it forward as more users support this request 😃

OldaKodym commented 6 months ago

@ainoam Cool, thanks for the response!

In the meantime, it would be great if there was general support for a pyproject.toml-based build process instead of always using pip install -r requirements.txt. That would let me at least use the pip build frontend with custom build backends like pdm-backend. However, right now if I try to specify a path to a project folder with Task.add_requirements(), it fails because any existing path gets interpreted as a requirements file.

The thing is, I can actually get it to work when I want the agent to build a local module from its pyproject.toml, by doing the following:

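from clearml import Task
# Keep the auto-detected entry for the local module out of the requirements list
Task.ignore_requirements("my_module")
# Private attribute, not public API: inject a direct file:// reference instead
Task._force_requirements[
    "my_module @ file:///${PROJECT_ROOT}/path/to/my_module"
] = None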

But it feels quite hacky. It would be nice to have an option, similar to Task.force_requirements_env_freeze(), that would let me specify a pyproject.toml file instead of a requirements file.
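
For reference, the requirement string in the workaround above has the PEP 508 direct-reference form (name @ URL). A minimal sketch of building such a string from a local path with the standard library only; the helper name local_requirement and the example path are hypothetical and not part of ClearML:

```python
from pathlib import Path


def local_requirement(name: str, project_dir: str) -> str:
    """Return a PEP 508 direct reference pointing at a local project directory.

    Hypothetical helper for illustration; it is not a ClearML API.
    """
    # Path.as_uri() only accepts absolute paths, so resolve the path first.
    uri = Path(project_dir).resolve().as_uri()
    return f"{name} @ {uri}"


if __name__ == "__main__":
    # Prints something like "my_module @ file:///abs/path/to/my_module"
    print(local_requirement("my_module", "path/to/my_module"))
```

Building the URI from a resolved path avoids relying on a placeholder such as ${PROJECT_ROOT} being expanded later.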

ainoam commented 6 months ago

@OldaKodym pyproject.toml is already supported via [poetry](https://github.com/allegroai/clearml-agent/blob/95dde6ca0cac717d2094114699c11bd1f0d38040/docs/clearml.conf#L78).

Just configure the agent to use poetry as the package manager, and it will ignore the "installed packages" and use the pyproject.toml inside the repo (and of course update them back for visibility).

OldaKodym commented 6 months ago

@ainoam The issue is that poetry is, surprisingly, not PEP 621 compliant (it uses [tool.poetry] instead of the standard [project] table), so the poetry build frontend cannot be used to install the PEP-compliant pyproject.toml files that all the other tools use. I can do that with any other build frontend, including pip, but not with poetry, and therefore not with clearml-agent.
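
To make the distinction concrete, here is a rough sketch that reports which metadata table a pyproject.toml uses. It assumes Python 3.11+ for tomllib (with the third-party tomli package as a fallback), and the function name metadata_style is hypothetical:

```python
try:
    import tomllib  # standard library in Python 3.11+
except ModuleNotFoundError:  # older interpreters: assumes tomli is installed
    import tomli as tomllib


def metadata_style(pyproject_text: str) -> str:
    """Classify pyproject.toml content by where its project metadata lives."""
    data = tomllib.loads(pyproject_text)
    if "project" in data:
        # Standard [project] table defined by PEP 621.
        return "pep621"
    if "poetry" in data.get("tool", {}):
        # Tool-specific [tool.poetry] table.
        return "poetry"
    return "unknown"


if __name__ == "__main__":
    print(metadata_style('[project]\nname = "demo"\nversion = "0.1.0"\n'))
    print(metadata_style('[tool.poetry]\nname = "demo"\n'))
```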

Basically, an option to run the pip build frontend against a pyproject.toml file would enable installing projects built with any PEP-compliant build backend, such as pdm, hatch and others. I believe this enables quite a few use cases, with the added benefit of compatibility with standard tools like pip(env) and setuptools.
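
As an illustration of what "running the pip build frontend" on a project means, a minimal sketch follows. Pointing pip install at a directory that contains a pyproject.toml makes pip build the project with whatever backend its [build-system] table declares. The function names are hypothetical, and this is not how clearml-agent currently works:

```python
import subprocess
import sys


def pip_install_command(project_dir: str) -> list[str]:
    """Build the pip command that installs a project from its source directory."""
    # Using "python -m pip" ensures pip runs in the current interpreter's environment.
    return [sys.executable, "-m", "pip", "install", project_dir]


def install_project(project_dir: str) -> None:
    """Run the install and raise CalledProcessError if pip fails."""
    subprocess.run(pip_install_command(project_dir), check=True)


if __name__ == "__main__":
    # Only print the command here; call install_project() to actually install.
    print(" ".join(pip_install_command("path/to/my_module")))
```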

ainoam commented 6 months ago

Thanks for the detailed information @OldaKodym. Is PDM a "drop in" replacement for poetry i.e. same frontend interface to install/run?

OldaKodym commented 6 months ago

@ainoam The basic frontend interface is similar. Running pdm install in the project root will (optionally) create a venv, read or create a lockfile, and install the packages; pdm run python can then be used to run scripts in the installed environment. There are of course some differences in usage, such as poetry show vs pdm list, but the tools are similar.
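
A small sketch of how the two CLIs line up for the commands mentioned above. The COMMANDS table and build_command helper are hypothetical, cover only these three actions, and are not a full equivalence map:

```python
# Command prefixes for the actions discussed in this thread.
COMMANDS = {
    "pdm": {
        "install": ["pdm", "install"],
        "run": ["pdm", "run"],
        "list": ["pdm", "list"],
    },
    "poetry": {
        "install": ["poetry", "install"],
        "run": ["poetry", "run"],
        "list": ["poetry", "show"],  # poetry uses "show" where pdm uses "list"
    },
}


def build_command(tool: str, action: str, *extra: str) -> list[str]:
    """Return the argv list for an action, with extra arguments appended."""
    return [*COMMANDS[tool][action], *extra]


if __name__ == "__main__":
    print(build_command("pdm", "run", "python", "train.py"))
    print(build_command("poetry", "list"))
```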