pypa / pip

The Python package installer
https://pip.pypa.io/
MIT License

Add `--only-deps` (and `--only-build-deps`) option(s) #11440

Open flying-sheep opened 1 year ago

flying-sheep commented 1 year ago

https://github.com/pypa/pip/issues/11440#issuecomment-1445119899 is the currently agreed upon user-facing design for this feature.


What's the problem this feature will solve?

In #8049, we identified a use case for installing just the dependencies from pyproject.toml.

As described in the solution section below, --only-deps=<spec> would determine all dependencies of <spec>, excluding that package itself, and install those without installing the package. It could be used to

  1. allow specifying environment variables that are active only while building a package of interest (without having them be active while potentially building its dependencies).
  2. separate dependency installation from building and installing a package, allowing a package to be rebuilt in a Docker build while the dependency installation step is loaded from cache.

This example shows both use cases:

# copy project metadata and install (only) dependencies
COPY pyproject.toml /myproj/
WORKDIR /myproj/
RUN pip install --extra-index-url="$PIP_INDEX" --only-deps=.[floob]

# copy project source files, build in a controlled environment and install our package
COPY src/mypkg/ /myproj/src/mypkg/
RUN env SETUPTOOLS_SCM_PRETEND_VERSION=2.0.2 python3 -m build --no-isolation --wheel
RUN pip install --no-cache-dir --no-deps dist/*.whl

Instead of the solution from #8049, @pradyunsg prefers a solution similar to the one below: https://github.com/pypa/pip/issues/8049#issuecomment-1079882786

Describe the solution you'd like

One of those two, or similar:

  1. (used in the example above)

--only-deps would work like -r in that it’s not a flag globally modifying pip’s behavior but a CLI option with one argument that can be specified multiple times. Unlike -r, it accepts a dependency spec rather than a path to a file containing dependency specs.

    Where pip install <spec> first installs all dependencies and then (builds and) installs the package referred to by the spec itself, pip install --only-deps=<spec> would install only the dependencies.

  2. --only-deps would work like --[no|only]-binary, in that it requires an argument specifying what package not to install. A placeholder like :requested: could be used, e.g.:

    pip install --only-deps=:requested: .[floob]

Alternative Solutions

Additional context

NA


pradyunsg commented 1 year ago

Thanks for filing this @flying-sheep!

I wonder if it would be better for --only-deps to mirror how --no-deps behaves.

flying-sheep commented 1 year ago

If I interpret the spartan docs for --no-deps correctly, it doesn’t take an argument. I added motivation under “Alternative Solutions” for why I think --only-deps should accept an argument. Do you disagree? If so: why? If not (and you haven’t changed your mind), it seems I didn’t understand what you meant: in what way should --only-deps work like --no-deps?

uranusjr commented 1 year ago

There’s actually yet another possibility: make --only-deps a global switch that takes one single value, like --no-binary and --only-binary.

I think I personally like --only-deps=<names> best, followed by --only-deps=<spec> (proposed in this PR), and a value-less --only-deps last (and we really should find a way to make --no-deps work the same as --no-binary).

flying-sheep commented 1 year ago

I see, those options are modifiers that pick out individual packages from the flattened dependency list and modify pip’s behavior toward them. So one would do:

cd mypkg  # project dir
pip install --only-deps=mypkg .[floob]

I think it makes sense regarding consistency with --[no|only]-binary, but it isn’t 100% practical: in the only use case that has come up so far (the one above), users will always have to specify both the dist name and the relative path to the project.

uranusjr commented 1 year ago

It may make sense to create wildcard names, e.g. :requested:, to simplify the UX somewhat. --no-binary etc. have :all:, which of course does not make sense for --only-deps, but it can be a good inspiration.

flisboac commented 1 year ago

This would be extremely useful to prepare lambda layers, or any kind of "pre-provided" environment, whilst keeping the exact requirements (including locks) properly versioned in a git repository. The target environment could then be replicated with ease, e.g. when developing locally, or when testing.

Here is an example pyproject.toml:

[build-system]
requires = [
    "setuptools >= 45",
    "wheel",
]
build-backend = "setuptools.build_meta"

[project]
name = "my-lambda"
requires-python = ">= 3.7"
version = "0.1.0"

# Again, this is just an example!
[project.optional-dependencies]
provided = [
    "typing-extensions >= 4",
    "requests ~= 2.23.0",
    "requests_aws4auth ~= 0.9",
    "boto3 ~= 1.13.14",
    "certifi >= 2020.4.5.1",
    "elasticsearch ~= 7.7.0",
    "elasticsearch_dsl ~= 7.2.0",
    "aws_requests_auth ~= 0.4.2",
]
pre-commit = [
    'nox >= 2022.1',
    'pytest >= 7.1.2',
    'black[d] >= 22',
    'mypy >= 0.950',
    'pre-commit >= 2.17.0',
    'flake8 >= 4; python_version >= "3.8"',
    'flake8 < 4; python_version < "3.8"',
    'pydocstyle[toml] >= 6.1.1',
    'isort >= 5.10.1',
]

Then, when creating a new "provided" environment (e.g. a lambda layer):

# Must be run in a similar environment as the target one.
# Advantage of this over `pip download` is the ability
# of mixing source and binary distributions, whenever
# necessary (e.g. downloading both numpy and pyspark).
# Could also add locks/pinning, via `--constraint`.

mkdir -p dist/python
pip3 install \
  .[provided] \
  --target dist/python \
  --only-deps=:requested:
( cd dist && zip -r ../dist.provided.zip ./python )

And in a development or CI-like environment:

# May be cached.
python3 -m venv venv
source ./venv/bin/activate

# Gets all development tools, and anything used to run
# automated tasks.
# Could also add locks/pinning, via `--constraint`.
./venv/bin/pip3 install -e .[provided,pre-commit]

flying-sheep commented 1 year ago

@uranusjr Sure, I’m not married to the semantics I suggested. I’m fine with your design. Now we just need someone to implement it lol.

sbidoul commented 1 year ago

Would it be reasonable to have --only-deps work only when exactly one top level requirement is provided, and fail otherwise?

pradyunsg commented 1 year ago

Would it be reasonable to have --only-deps work only when exactly one top level requirement is provided, and fail otherwise?

Yes.

rgommers commented 1 year ago

I'm not sure I understand the :requested: and trying to pick out individual packages. My 2c here is that there are multiple keys in pyproject.toml, and those keys are the only things to select on. The keys are: dependencies, requires (under [build-system]), and then optional-dependencies, which may have multiple entries. It'd be great to spell out how all those get installed. Something like:

pip install . --only-deps  dependencies
pip install . --only-deps  requires
pip install . --only-deps  doc  # for a "doc" key under optional-dependencies

And the two comments above say that if you need two of those, you need multiple invocations of pip rather than providing two key names at once?

pradyunsg commented 1 year ago

pip install . --only-deps doc

IMO, pip install --only-deps .[doc] has clearer syntax, especially when not using a PEP 621 style pyproject.toml-based dependency declaration.

Nonetheless, I agree that we don't need the additional complexity here.

rgommers commented 1 year ago

Okay, so:

pip install --only-deps  .  # dependencies
pip install --only-deps  .[doc]  # optional-dependencies `doc`

And just to confirm, pip install --only-deps .[requires] for the build requirements (meaning it reuses the optional-dependencies syntax, but requires is reserved and people shouldn't use a requires key under optional-dependencies)?

sbidoul commented 1 year ago

I'd suggest a small variation, to avoid overloading the [extras] syntax.

pip install --only-deps  .  # project.dependencies
pip install --only-deps  .[doc]  # project.dependencies and project.optional-dependencies.doc
pip install --only-build-deps .  # build-system.requires

netsandbox commented 1 year ago

Maybe there are situations where you only want to install the optional dependencies without the project dependencies. So I would extend @sbidoul suggestion with:

pip install --only-optional-deps  .[doc]  # only project.optional-dependencies.doc

sbidoul commented 1 year ago

Maybe there are situations where you only want to install the optional dependencies without the project dependencies.

I'm not comfortable with that. Indeed extras are additive to the base dependencies by definition, so such a mechanism sounds a bit awkward to me.

brettcannon commented 1 year ago

The desire for this feature just came up in a discussion at work around whether dependencies should be recorded in requirements.txt (since that's where pip has historically put them) or in pyproject.toml (as that's backed by a spec). And the problem with the latter is that it requires you to set up a build system so you can do pip install -e ., even if you didn't need one and just wanted to install stuff that you wrote down somewhere.

The motivating scenario is beginners who have started coding, have some script or package going (i.e., not doing a src/ layout), and now want to install something (assume we will create a virtual environment for them). We would like to teach people to record what they install and follow standards where we can, but to use pyproject.toml we need to also get a build system chosen and set up, which is another level of complicated. Otherwise we would need to forgo standards, go with requirements.txt, and decide what the appropriate workflow is in that scenario (e.g., do you record in requirements.txt, requirements-dev.txt, dev-requirements.txt, requirements/dev.txt, etc.?).

dstufft commented 1 year ago

Without a build system there's no way for pip to determine what the dependencies are, pip doesn't (afaik, and it shouldn't) read dependencies from pyproject.toml (other than build dependencies), it just looks for a build system and then asks the build system what the dependencies are. It is the responsibility of the build system to read pyproject.toml and determine what (if any) dependencies there are.
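For illustration, here is one hedged sketch of what "asking the build system" can look like through the standardized PEP 517 interface. This is not pip's implementation (pip invokes hooks in an isolated subprocess); the in-process call and the helper name runtime_deps are assumptions made for brevity, and a setuptools new enough to understand PEP 621 (>= 61) is assumed.

```python
# Sketch only: obtain runtime dependencies by calling the backend's
# prepare_metadata_for_build_wheel hook and reading Requires-Dist from
# the METADATA file it produces. Helper name is hypothetical.
import os
import tempfile
from email.parser import Parser

from setuptools import build_meta  # the default backend per PEP 517/518


def runtime_deps(project_dir):
    """Return the Requires-Dist entries for the project in project_dir."""
    old_cwd = os.getcwd()
    os.chdir(project_dir)  # setuptools' hooks operate on the current directory
    try:
        with tempfile.TemporaryDirectory() as md:
            # The hook writes a .dist-info directory into md and returns its name.
            distinfo = build_meta.prepare_metadata_for_build_wheel(md)
            with open(os.path.join(md, distinfo, "METADATA")) as f:
                return Parser().parse(f).get_all("Requires-Dist") or []
    finally:
        os.chdir(old_cwd)
```

Note that for setuptools this hook may do real work (run the backend's build logic and drop an .egg-info directory in the source tree), which is exactly the "may trigger a build" concern raised further down the thread.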

uranusjr commented 1 year ago

@brettcannon In the use case, what is the motivation behind not including . in the installation?

brettcannon commented 1 year ago

Without a build system there's no way for pip to determine what the dependencies are

If dependencies was listed in dynamic I would agree, but if it's not, then how is it different from a requirements file?

It is the responsibility of the build system to read pyproject.toml and determine what (if any) dependencies there are.

Are you saying we need to come up with a separate standard to list arbitrary dependencies like requirements files allow for (this is not the same as a lock file to me; think of it as the input file of your top-level dependencies for a lock file generation tool)?

what is the motivation behind not including . in the installation?

Beginners don't typically need it (i.e., they aren't doing a src/ layout), and I personally don't want to try and educate a beginner trying to install something on why they suddenly need to select a pure Python build backend to record what they installed and how to make an appropriate decision.

pfmoore commented 1 year ago

If dependencies was listed in dynamic I would agree, but if it's not, then how is it different from a requirements file?

Yes, it's technically true that if the pyproject.toml states that the dependencies are non-dynamic, then reading that file for the data is valid. But during the discussions on PEP 621, the idea of relying on metadata from pyproject.toml like this was pretty controversial. I don't have a specific reference, but I do recall being told that the intended use was very much for backends to read the data, not for it to be the canonical source (I proposed PEP 643 mainly because reading pyproject.toml from the sdist was "not the intended use").

I understand the use case, and in a practical sense, getting the data from pyproject.toml would work (until someone wants it to work with dynamic dependencies, and we have a debate over why that is out of scope). But I don't think it's the right way for pip to go.

Is there a reason this couldn't be an external tool?

# Warning: code has only been lightly tested!
import subprocess
import sys
import tomllib

FILE = "pyproject.toml"

with open(FILE, "rb") as f:
    data = tomllib.load(f)

if "project" not in data:
    raise ValueError("No PEP 621 metadata in pyproject.toml")
if "dependencies" in data["project"].get("dynamic", []):
    raise ValueError("Dependencies cannot be dynamic")

deps = data["project"].get("dependencies")

if deps:
    cmd = [sys.executable, "-m", "pip", "install", *deps]
    subprocess.run(cmd)

brettcannon commented 1 year ago

Is there a reason this couldn't be an external tool?

Everything can be an external tool 😉, but at least for VS Code we have a general policy of trying not to introduce our own custom tooling so people can operate outside of VS Code without issue. So if we created an install-from-pyproject and started to record what (primarily) beginners wanted to be installed in pyproject.toml, then that workflow suddenly becomes rather specific to us as we are now driving the workflow instead of the user.

If the answer from pip is, "use requirements files," then I totally understand and accept that as that's pip's mechanism for this sort of thing. But it also means that I will probably have to develop some new standard for feeding dependencies into a lock file tool since right now it seems like all that tool could take is what's provided on the command-line (although ironically pip-tools now works with pyproject.toml, so using the file for this sort of thing might be decided for us 😅).

uranusjr commented 1 year ago

My impression from the above is that the real hurdle is actually making the code a proper Python package (with the build system etc. defined), not installing only the dependencies of a Python package. The latter is in itself an entirely separate valid use case, but one that can be more properly covered by external tooling. So maybe what we really need is an accompanying file format that mirrors PEP 621 but does not express a Python package: dynamic would be forbidden (everything must be static), and everything else advisory (no required fields; even the file name could be different). That way we could have, say, pip install --dependencies /path/to/pyproject.toml to read dependencies and optional-dependencies.

dstufft commented 1 year ago

I was thinking about this more, and it occurred to me that a pyproject.toml without a build system doesn't actually exist.

PEP 518 says that it is expected if the build-system.requires key is missing, that tools will treat that as if ["setuptools", "wheel"] were defined.

PEP 517 says that if build-system.build-backend isn't defined, then tools will treat that as if the project is using the legacy setup.py path, either by directly invoking setup.py or using setuptools.build_meta:__legacy__.

Thus it is my assertion that any directory that has a pyproject.toml implicitly has a build backend of setuptools, and this matches what is implemented today in pip.

Likewise, since setuptools implements PEP 621, the following pyproject.toml is a valid pyproject:

[project]
name = "test"
version = "1.0"
dependencies = ["requests"]

So I guess, in a way, what @brettcannon wants exists already (other than the --only-deps flag), and it's implemented with the abstraction layers still being clean. I'm not sure if "implicitly use setuptools" counts as having to teach beginners about build backends or not?

Also note that it's not a valid PEP 621 file if it doesn't have a name and version specified (either dynamically or statically for version, and only statically for name). This means that it's not possible to create a valid pyproject.toml that uses project.dependencies without making it into a minimal valid package.

pip-tools now works with pyproject.toml, so using the file for this sort of thing might be decided for us

pip-tools isn't reading the pyproject.toml, it's calling the build backend (by default setuptools) and asking it to produce a list of dependencies, and then generating a lockfile from that.

sbidoul commented 1 year ago

don't want to try and educate a beginner trying to install something on why they suddenly need to select a pure Python build backend to record what they installed and how to make an appropriate decision.

You don't necessarily need to tell beginners about build backends, since there is a default one. A pyproject.toml without a build system, and with only name, version and dependencies, is valid and easy to teach.

pip install --only-deps needs to use the build backend to obtain dependencies in the general case. It could also read static dependencies from pyproject.toml, but that is an optimization / implementation detail.

I think the only drawback is these pesky .egg-info directories that show up, but I understand setuptools has long-term plans to get rid of them in some cases? https://github.com/pypa/setuptools/issues/3573#issuecomment-1539728150

So personally I think pyproject.toml is definitely the way to go to declare top level dependencies. BTW, in my practice, requirements*.txt are the lock files, not top level dependencies.

sbidoul commented 1 year ago

Ow, looks like I wrote this at the same time as @dstufft :)

brettcannon commented 1 year ago

Also note that it's not a valid PEP 621 file if it doesn't have a name and version specified (either dynamically or statically for version, and only statically for name). This means that it's not possible to create a valid pyproject.toml that uses project.dependencies without making it into a minimal valid package.

Correct, but setting version = "0" and name to the directory in this new file called pyproject.toml that suddenly appears probably isn't too hard of a stretch to understand on its own.

I'm not sure if "implicitly use setuptools" counts as having to teach beginners about build backends or not?

Somewhat. I can already see the bug report, "Why is VS Code installing my own project when I didn't ask it to?!?" because the debug output has "Successfully installed spam-0" from the setuptools output (and that's skipping over the whole "Building wheel" bits).

I guess my question comes down to what do you expect apps to use these days to record their dependencies (and thus to install them)?

dstufft commented 1 year ago

To be clear, I don't really have any problem with a --only-deps (for whatever value my opinion has on the matter). I just don't think we should try to make pyproject.toml into something that isn't describing a package. I honestly think it's fine if we just tell people that everything is a package; Rust does that just fine with cargo.

dstufft commented 1 year ago

Absent that though, requirements.txt is probably still it, and if we want something standardized, it has yet to be created.

brettcannon commented 1 year ago

Absent that though, requirements.txt is probably still it, and if we want something standardized, it has yet to be created.

That's what I thought this little digression was going to end up concluding with.

rgommers commented 1 year ago

So personally I think pyproject.toml is definitely the way to go to declare top level dependencies.

+1 for this, it is the one standardized place and it seems perfectly adequate. And I believe the conversation in this thread had already settled on both of those points and on the need for --only-deps (discussion was mostly around the optimal syntax for it), before the little detour in the last 1-2 days.

pip install --only-deps needs to use the build backend to obtain dependencies in the general case. It could also read static dependencies from pyproject.toml, but that is an optimization / implementation detail.

I would say it's a little more than an implementation detail. There is no interface for asking the build backend for this information, so reading static dependencies and optional-dependencies directly clearly seems like the way to go (why invent a complex new interoperability interface when the data is already right there in a static file?). Dynamic dependencies can just raise an error; they're not supportable by either pyproject.toml or requirements.txt.

brettcannon commented 1 year ago

reading static dependencies and optional-dependencies directly clearly seems like the way to go (why make it a complex/new interoperability interface when the data is already right there in a static file?). Dynamic dependencies can just raise an error, they're not supportable by either pyproject.toml or requirements.txt.

That's actually what I'm starting to think as well. And at worst VS Code can extract the dependencies from pyproject.toml and manually run pip with the dependencies listed on the command-line (which falls within our "making something easier, but not magical and custom to VS Code" rule).

dstufft commented 1 year ago

There isn't a new interface, the existing interface to read that metadata already exists... PEP 517's prepare_metadata_for_build_wheel.

I don't understand why using the already standardized mechanisms for this is unsuitable.

rgommers commented 1 year ago

@dstufft that is (1) an optional hook that (2) writes out .dist-info files, and (3) may trigger a build. From PEP 517: _If a build frontend needs this information and the method is not defined, it should call build_wheel and look at the resulting metadata directly._

(1) and (3) together make using that interface a non-starter, and (2) is an unexpected side-effect. In case it's not clear why it's a nonstarter, beyond "triggering a build from pip install --only-deps is clearly not what the user asks for here": a key reason to even want --only-deps is to set up a dev environment for editable/isolated builds. So needing to build before setting up that dev env is kinda circular.

sbidoul commented 1 year ago

Sure but from a pip implementation point of view, obtaining metadata for a source tree is done by asking the metadata preparation interface (which currently delegates entirely to the build backend). So anyone implementing pip install --only-deps should not break that abstraction.

Enriching the abstraction to shortcut the backend when the metadata we need is statically declared in pyproject.toml is a useful optimization, but an optimization nevertheless, IMO. [edit] And it is an optimization that would benefit many other use cases than --only-deps.

rgommers commented 1 year ago

I'm sorry, but that just does not make much sense to me even conceptually. Compare these two actions to take:

  1. "prepare all the metadata needed to build this package"
  2. "install dependencies of this package into the active environment"

Those are nothing alike in semantics; (2) has literally nothing to do with building the package itself. It's about installing dependencies only. You can keep those dependencies in requirements.txt and do pip install -r requirements.txt, or you can do pip install --only-deps. The latter means reading from pyproject.toml instead of requirements.txt; it's very similar, and pyproject.toml is only preferred because that is the canonical place we decided to list these dependencies.

If you wanted the backend to be involved, it'd need a hook like read_static_dependencies with the desired behavior (read static dependencies and return a dict of them; raise if the deps are not static). This is just more convoluted than reading directly from pyproject.toml, and otherwise pretty much equivalent.

So there are several different reasons here:

  1. Has the wrong semantics conceptually
  2. May trigger a lengthy build
  3. Not equivalent to installing from a requirements file

Sure but from a pip implementation point of view

I'd say that you're reasoning from current pip-internal implementation details for one concept, towards that dictating user-facing syntax and semantics for another concept. That's not how software should be designed.

I honestly also don't understand the hesitation with reading from pyproject.toml. What is the point of having a standard file for metadata if pip refuses to read metadata from it when it needs to, and instead keep people on the legacy non-standardized requirements.txt? It goes against everything we're trying to achieve with standardizing things.

sbidoul commented 1 year ago

You probably look at pip install --only-deps for a specific use case. If we make it break when dependencies are dynamic, I bet we'll immediately get complaints that the feature is partially implemented.

I'd say that you're reasoning from current pip-internal implementation details for one concept, towards that dictating user-facing syntax and semantics for another concept.

Nope :) I'm looking at the high level expectations of what pip install --only-deps should do, and consistency of the pip UX, i.e. get the dependencies of the project and install them. Standards say these may be dynamic so if we want the feature to work correctly in all cases, we have to involve the backend in the general case.

All I'm saying is that implementing pip install --only-deps and reading static dependencies from pyproject.toml are two independent work streams, the latter benefiting the former. It will satisfy more use cases and keep the pip code base cleaner at the same time.

I don't think there is any hesitation about reading from pyproject.toml. There is none from me, at least. There is a lack of resources for implementing everything pip should have in a correct manner, though.

rgommers commented 1 year ago

You probably look at pip install --only-deps for a specific use case. If we make it break when dependencies are dynamic, I bet we'll immediately get complaints that the feature is partially implemented.

Okay. It's highly uncommon for them to be actually dynamic (not just dynamic in the "read in from requirements.txt" sense), and seems like a thing that could sensibly be left out. There is no equivalent feature in pip at all for requirements.txt now after all.

If you do want that, then it definitely needs a new hook, because prepare_metadata_for_build_wheel is definitely the wrong one (e.g., it will not work when dependencies are static but version is dynamic - which is very common). And a new hook is a lot of work. So if you want to block it completely on standardizing a new hook first, okay so be it. My expectation is then that we'll end up with nothing at all here.

pfmoore commented 1 year ago

From my perspective, I don't see anything massively wrong with a feature that says "read a project source tree and install only the packages that the source tree indicates are needed as runtime dependencies for the project".

My problem is that the feature is currently underspecified. In particular:

  1. No-one has explicitly confirmed that failing if pyproject.toml says that the dependencies are dynamic is acceptable. I'm not talking about ignoring that case, I'm asking if it's OK to explicitly document that pip will fail in that case, and that's intended behaviour (because the feature must not trigger a build) rather than something we might add later.
  2. Similarly, if we're installing from a sdist, is it OK to read the metadata from PKG-INFO file and refer to the Dynamic field from that? What if the dependencies are dynamic in the pyproject.toml but static in PKG-INFO? What if the metadata is older than version 2.2 (hence everything is dynamic by implication) but pyproject.toml doesn't mark the dependencies as dynamic?
  3. Presumably if we're installing from a wheel, we look at the wheel metadata. After all, there's not going to be a pyproject.toml in the wheel.
  4. If we're installing from an index using a project name, is it OK for the behaviour to change over time if the project uploads a wheel?
  5. What about editable installs? In many implementations of editable installs, the backend injects a support library into the runtime dependencies. So pip install --only-deps .; pip install --no-deps -e . will fail. Is that what people expect/want?

Someone has to decide all of these things, and they all affect the UX.

What I don't want is for us to implement a feature like this only to have it not actually do what the intended audience want it to do. Which is why I'm arguing for people implementing this via external processing. If people use something like the script I described above and it's successful, that gives us a clear indication that a limited implementation is sufficient - without committing pip to anything in advance. Conversely, if no-one is willing to use a small wrapper script, that suggests it's not worth the effort of thrashing out these sorts of questions in order to implement something in pip.

Would it help if I turned the above script into a pipx-runnable project on PyPI, so people didn't have to type it out themselves?

dstufft commented 1 year ago

Okay. It's highly uncommon for them to be actually dynamic (not just dynamic in the "read in from requirements.txt" sense), and seems like a thing that could sensibly be left out. There is no equivalent feature in pip at all for requirements.txt now after all.

I think it's very common? Every project released prior to 2020 by definition has them as dynamic. I suspect a non-trivial number of projects after 2020 didn't immediately move all of their dependencies to be defined statically.

The proposed --only-deps flag thus far has been to work with arbitrary dependency specifiers, which means it supports things that have been uploaded to PyPI. It would be very weird and surprising to users IMO if this one particular flag randomly didn't work due to implementation details of the packaging of the project they're attempting to install the deps for.

rgommers commented 1 year ago

Thanks for identifying a part of the disconnect @pfmoore.

I read this as the feature proposed, from the first line of the issue description (bold font added by me):

In #8049, we identified an use case for installing just the dependencies from pyproject.toml.

In this discussion, it seems to me like all of the folks participating from outside the pip devs (and also folks in gh-8049) are talking about the same kind of need.

Whether it's in VS Code or a shell, it all seems similar to me in that it's really from pyproject.toml, not from sdist/wheel/PyPI/whatever-else. And that the --only-deps very clearly says "don't trigger a build nor install the package itself".

@flying-sheep, @brettcannon or others, please correct me if I'm misinterpreting your needs.

I think it's very common? Every project released prior to 2020 by definition has them as dynamic.

This is then by definition out of scope. These are not dynamic requirements to me, they're just repos or source trees without a pyproject.toml, and hence not of interest.

My problem is that the feature is currently underspecified. In particular:

To answer those questions then:

  1. Yes, it's OK to explicitly document that pip will fail in that case, and that this is the desired behavior. "Do not attempt to build the package, ever" is a hard requirement that I thought was clear, but implicit.
  2. No, installing from an sdist is out of scope
  3. No, also out of scope
  4. No, also out of scope
  5. Editable installs are also out of scope. The whole point is to install only dependencies and not the package itself. So don't touch the package in any way, hence whether a future install is editable or not is irrelevant here.

Conversely, if no-one is willing to use a small wrapper script, that suggests it's not worth the effort of thrashing out these sorts of questions in order to implement something in pip.

It's not the effort (which is indeed not large), it's that to actually use this one would have to publish it, maintain it long-term, and use that in the docs of the tool of interest (say VS Code for Brett, NumPy/SciPy for me). This feature is not for us personally, it's for our large-ish audiences. It feels a bit like something that is or could turn into a fork of the pip UX, which is why I didn't consider it.

Would it help if I turned the above script into a pipx-runnable project on PyPI, so people didn't have to type it out themselves?

Not sure, maybe? Would you like a thing like that to become expanded with optional dependencies handling, perhaps even build dependencies, and be used semi-permanently?

dstufft commented 1 year ago

This is then by definition out of scope. These are not dynamic requirements to me, they're just repos or source trees without a pyproject.toml, and hence not of interest.

Then it shouldn't be a flag that takes a dependency specifier.

It is a terrible UX IMO if pip install foo works fine, but pip install --only-deps=foo foo randomly does not.

rgommers commented 1 year ago

It is a terrible UX IMO if pip install foo works fine, but pip install --only-deps=foo foo randomly does not.

Then it's good that --only-deps=foo was not proposed :) See this comment from @sbidoul with the many thumbs-ups as what I thought our common understanding on this one was.

dstufft commented 1 year ago

Then it's good that --only-deps=foo was not proposed:) See https://github.com/pypa/pip/issues/11440#issuecomment-1445119899 from @sbidoul with the many thumbs-up's as what I thought our common understanding on this one was.

That doesn't change my comment at all?

Why should pip install . work but pip install --only-deps . not work due to implementation details of .? Why should pip install foo work but pip install --only-deps foo not work due to implementation details of foo?

The poor experience comes because the proposed --only-deps flag is modifying a list of package specifiers (either directly, as in the original proposal, or indirectly, as in @sbidoul's comment). Package specifiers point to packages from all kinds of sources: remote repositories, whatever. Users are 100% going to be confused and complain when implementation details of packages cause this command not to work, because the expectation is that if I'm referring to a package, then it will work.

This doesn't hold true for -r requirements.txt, because -r is not operating on a generic package specifier, it is operating on a requirements file, so there is no expectation that it will work with something that isn't a requirements file.

This could be modified by doing something like pip install --only-deps pyproject.toml and making it work like -r requirements.txt... though in that case I would still say it should go through the build backend [^1], because people are going to be confused if otherwise valid pyproject.toml files don't work with this flag. It however wouldn't need to work with packages that don't have a pyproject.toml at all in that case, because then the flag is operating on a pyproject.toml file, not on a generic package specifier.

An alternative to the above is saying that --only-deps won't itself take a value, but will make pip install error if it's given anything but a local file path to a source tree. But like the pip install --only-deps pyproject.toml option, I still think it should go through a build backend (and in this case, I think it should work with source trees that don't have a pyproject.toml at all, to the extent that pip install . does).

If the assertion is that it can never go through the build backend, then I think pyproject.toml is the wrong place for this. Build backends are explicitly the interface that tools like pip are intended to use to get information out of pyproject.toml, and I don't think it's OK for pip to work in only some of the cases that pyproject.toml supports.

What's not OK, IMO, is having a pip install ... command that sometimes works, and sometimes doesn't, based on the contents of an otherwise valid pyproject.toml, especially if that pyproject.toml is inside of some already created sdist somewhere.

[^1]: Though it could of course read pyproject.toml and if it detects static dependencies opt not to do that as a performance optimization.

rgommers commented 1 year ago

If the assertion is that it can never go through the build backend,

That is not my assertion. I said it can never trigger a build, that's not at all the same. A build backend hook with the desired behavior could be designed, it just doesn't exist today.

then I think pyproject.toml is the wrong place for this. Build backends are explicitly the interface that tools like pip are intended to use to get information out of pyproject.toml,

I think we're going to have to agree to disagree on this statement for now. I think pyproject.toml has grown into the standard place for all sorts of metadata, and certainly not to be used only by a build backend (and Paul's just-proposed thin pipx-installable wrapper is an example right in this issue about a front-end reading it).

I think this difference of interpretation about what pyproject.toml is for is causing other recent misalignments as well, e.g., in https://discuss.python.org/t/relaxing-or-clarifying-pep-621-requirements-regarding-dynamic-dependencies/24752/14.

pfmoore commented 1 year ago

OK, I think @dstufft is right about the pip UI - we don't want to say "install only the dependencies of project <X>", as that will cause confusion for users who think that this implies that when specifying <X>, they can use any of the ways they are normally able to specify what to install - sdist, wheel, specifier, URL, etc.

If this is a way of saying "get a list of dependencies to install from a file", then it parallels requirements files, and we should be looking at a UI something like pip install --reqs-from ./pyproject.toml. I don't like the option name here, and I'm open to discussion on most of the details, but the core point to me is we should be designing a UI that makes it clear that pip is reading a file that lists a set of dependency specifiers, and absolutely not "working out what the project's dependencies are" (except by implication that what's in pyproject.toml is that, as long as certain conditions hold).

To be honest, I'd even be willing to consider --requirements ./pyproject.toml, where the requirements file parser is extended to detect a TOML file, and change its behaviour accordingly.

One other thing we would have to consider, though, is how we avoid people getting the impression that this is a replacement for requirements files - or even worse, that it's the "PyPA recommended" approach for doing what people do with requirements files now. There's no support in pyproject.toml for the various pip options you can specify in requirements files, and any request to add that will be met with "nope, use a requirements file if you need that" (from me, at least!). I don't want to see a bunch of stuff getting proposed for a [tool.pip] section in pyproject.toml.

To be clear, I'm still not particularly keen on this whole proposal, and I'd honestly be much happier if we just dropped the idea. But if people want to pursue this, the above are my thoughts on how we should do it.

pfmoore commented 1 year ago

I think this difference of interpretation about what pyproject.toml is for is causing other recent misalignments as well

It was something that caused questions when PEP 621 was being debated. The PEP itself states "When metadata is specified using this PEP, it is considered canonical", which clearly indicates that reading data from pyproject.toml is allowed, but the "when" is (IMO) important - it equally clearly indicates that consumers cannot fail just because a project doesn't specify all its metadata in pyproject.toml.

Most of the confusion seems to arise when people want to ignore problematic cases like dynamic dependencies. Which, for ad hoc tools and project-specific scripts, is fine (that's the usage I understood the PEP to be allowing, when I approved it), but when it's in the context of a general tool like pip, or a discussion of behaviour that we're considering standardising, we can't just ignore the possibility.

Personally, I'd be perfectly happy to consider only static dependencies. But I suspect many people would scream if we made declaring dependencies dynamic illegal, or if we prohibited editable installs that needed support libraries.

dstufft commented 1 year ago

I would note, that I don't personally have a problem with pip install --only-deps <list of packages>, as long as it works for all of the various types of things that people expect to put into <list of packages>, and I would expect it to install only the deps of the <list of packages>, and not any of <list of packages>.

I just don't think we can arbitrarily say that we're going to implement pip install --only-deps <list of packages>, but only some packages work here.

brettcannon commented 1 year ago

I think the difference here between what e.g., @flying-sheep, @rgommers, and I are proposing (and you were spot-on, Ralf, in representing at least my viewpoint) is that we want to use pyproject.toml as a possible replacement for requirements files (when possible; I know the feature sets are not equivalent). That means we view pyproject.toml as the input, and the output is installing what's listed in there without doing any building, just like what -r requirements.txt will do for you today.

But what I'm hearing from @pfmoore and @dstufft is they are viewing this as a feature to install the dependencies of anything that pip can extract a list of dependencies from, i.e., sdists and wheels as well. To me those are two different use cases.

Speaking for myself, I want a way to write dependencies in pyproject.toml for something that will never be built into a wheel, and to be able to install the dependencies I wrote down in pyproject.toml. Whether that's because this is homework for a class and I will be submitting source in a specific way to my teacher, or I'm just learning and writing down dependencies is just for cleanliness/bookkeeping in case I have to reconstruct my environment but my code will never leave my machine, or I'm developing an app that will be packaged using some other tool (something Donald has blogged about as the use case of requirements.txt files). In all of these scenarios, requirements files have traditionally fulfilled this need (at least in the simple case). But what I think some of us are proposing here is leaning on pyproject.toml to start taking over that use case from requirements files where it makes sense.

dstufft commented 1 year ago

The reason that I am viewing this as a feature to install the dependencies of anything that pip can extract a list of dependencies from, is because the different proposals that I've seen so far all logically are acting on lists of arbitrary dependency specifiers being passed into the pip install command.

I go back to this proposed command: pip install --only-deps .

Without the proposed flag, pip install ... can be passed an arbitrary number of package specifiers, where a package specifier can refer to things like local source tree checkouts, git repos, sdists (local and remote), wheels (local and remote), etc.

The proposal then says (to my reading) that as soon as you start to pass --only-deps to the pip install ... command, suddenly pip install no longer accepts any old package specifier; now only certain package specifiers are acceptable [^1]. This is partially because the "shape" of the --only-deps flag being proposed does not match the "shape" of the -r requirements.txt flag; instead it matches --no-deps and --(only|no)-binary, neither of which fundamentally changes the fact that pip will accept any old package specifier in its list of things to install. But even on its own, this sort of "change the accepted types for one option because you passed another option" design tends to confuse people.

As I said above, from a pip UI perspective, if you're trying to turn pyproject.toml into a replacement for requirements.txt, and for whatever reason you are not willing to have it go through PEP 517's prepare_metadata_for_build_wheel (falling back to build_wheel), then I think the API has to be designed in a way that more closely mimics how -r requirements.txt works rather than how --no-deps works. This could be by extending -r to support being passed a path to a pyproject.toml file, or by adding a new flag (or a new command) that says something like --deps-from ./pyproject.toml. I don't particularly have a favorite API shape here, but for another reason...

From a packaging ecosystem perspective, I think the desire to turn pyproject.toml into a replacement for requirements.txt is wrong. It is not legal to have a pyproject.toml file that has a project.dependencies specified but does not have a project.name and a project.version (or a build backend that will fill it in). That means that people are forced to inject a random meaningless name/version in contexts where it doesn't make sense. But worse to me is that project.dependencies is fundamentally a place for abstract dependencies, requirements.txt is fundamentally a place for concrete dependencies. This means that it's not actually possible to express things from a requirements.txt inside of a pyproject.toml [^2].

For instance, how would you translate this requirements.txt:

--extra-index-url https://cs101.example.com/lesson01libs/

lessonlib==1.0
tqdm

Into a pyproject.toml?

What about this one?

-e https://github.com/pypa/packaging.git#egg=packaging

I understand the desire to use pyproject.toml; it's already there, so it's attractive to just piggyback off of it. However, it serves a fundamentally different purpose than requirements.txt, and any attempt to combine them is going to result in either the pyproject.toml case or the requirements.txt case (or both) having to make sacrifices in usability to make it work.
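The mismatch can be stated mechanically: in a requirements file, any line starting with `-` is a pip option rather than a dependency specifier, and option lines have no representation at all in `[project.dependencies]`. A crude illustrative check (nothing like pip's real requirements parser, which also handles continuation lines, environment variables, and more):

```python
def is_plain_specifier(req_line: str) -> bool:
    """Rough check: could this requirements.txt line live in [project.dependencies]?

    Only plain PEP 508 dependency specifiers can. Lines starting with "-" are
    pip options (-e, -r, --extra-index-url, ...) and have no equivalent in
    project metadata; comments and blank lines carry no specifier either.
    """
    line = req_line.strip()
    return bool(line) and not line.startswith(("-", "#"))
```

Both of the requirements files quoted above contain lines this check rejects, which is the information that would be lost in any translation to project metadata.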

That being said, if that is what people want, then that feature should at least be designed so that it makes sense in the larger pip UI.

[^1]: And worse to me is that I think it's extremely not obvious what makes a package specifier acceptable vs unacceptable without introspecting the actual thing that is being referred to and reading files within it to determine if it satisfies that use case or not.

[^2]: It's kind of weird to me that my blog post is being referenced in support of mixing the use cases for pyproject.toml and requirements.txt, when I thought my blog post made it very clear that I didn't believe you should mix the use cases of setup.py and requirements.txt; pyproject.toml is effectively a replacement for setup.py, so you can largely s/setup.py/pyproject.toml/ and the blog post still makes sense, I think.

pfmoore commented 1 year ago

we want to use pyproject.toml as a possible replacement for requirements files

I'm very surprised to see this. As @dstufft said, the difference between requirements files and project dependencies is pretty well-established by now, with a lot of people linking to his article discussing the differences. And pyproject.toml is very clearly aligned with project dependencies and project metadata. So "replacing requirements files with project metadata" seems like an anti-pattern to me. Add to that the fact that the specification of pyproject.toml (requiring a name and version) ties it pretty closely to actual projects which get built into sdists/wheels, and this all seems very much like a re-purposing of pyproject.toml in a way that will ultimately cause more confusion, rather than less.

Having said that, the current packaging ecosystem doesn't handle "projects" that are anything other than libraries distributed as wheels at all well. So things like

don't fit the existing model very well, but we end up trying to force them to work, simply because there's no other model that is sufficiently well-developed to support the user.

As a matter of "practicality beats purity" I can see why "put the data in pyproject.toml and use existing tools" makes sense. But as the maintainer of one of the key tools here, I don't like the implications of trying to cover supporting new workflows without thinking things through properly[^1]. And to be honest, the more people try to argue that pyproject.toml is an appropriate way of recording this information, the more they are persuading me of the exact opposite - that "practicality" is the wrong choice here.

Theory aside, what's the way forward here?

It sounds like the use case we've identified is "get a list of requirements to install from pyproject.toml". I already proposed a possible syntax, using a new option that explicitly takes the name of a PEP 621 format file to read the requirement data from. If that doesn't satisfy the requirement[^2] here, then can someone explain why not? If it does, maybe we can focus on that, rather than on "installing the dependencies of a project".

There's still a bunch of questions we need to answer even if we do agree that some sort of --reqs-from-pyproject ./pyproject.toml is the right UI. Things like "why is it OK that this option doesn't have the additional features that requirements files do?" and "are we sure that the limitations of dependency specifiers (e.g., no way of using relative directory names) aren't going to be an issue"? And I'd rather spend time on doing that (and as a result, refining our understanding of the identified use case) than on debating abstract questions like "what is a project" or "how do requirements and dependencies differ".

[^1]: Personally, I feel we already have enough problems from extending pip's scope (years ago, at this point) from "installing stuff" to include "build workflow", and we're still trying to deal with the implications of that even after all this time.

[^2]: I love discussions that need to use the word "requirement" with 2 different meanings 🙁