pypa / pyproject-hooks

A low-level library for calling build-backends in `pyproject.toml`-based project
https://pyproject-hooks.readthedocs.io/
MIT License
122 stars 49 forks source link

Enforce names of keyword arguments to prevent build backends from using arbitary parameter names #201

Closed notatallshaw closed 4 months ago

notatallshaw commented 4 months ago

This is a follow up to https://discuss.python.org/t/are-build-backends-that-use-different-keyword-arguments-to-the-standard-spec-compliant/53056

The idea is that pyproject-hooks should enforce the names of keyword arguments by calling them by keyword argument instead of positionally, e.g.

hook(metadata_directory=metadata_directory, config_settings=config_settings)

Instead of the current:

hook(metadata_directory, config_settings)

Because this would be a breaking change perhaps try the former and fallback to the later with a warning?

I would be happy to raise a PR for this.

pfmoore commented 4 months ago

The conclusion in that thread was that the backend signature is defined in such a way that it can be called with either named arguments (using the names given in the spec) or positional arguments. So I don't think this change is needed (and I don't think pyproject-hooks needs to validate the signature of the backend hook, its role is simply to call it on behalf of the build frontend).

notatallshaw commented 4 months ago

I didn't understand that to be the conclusion.

If pyproject-hooks doesn't enforce the keywords then it's very easy for build back ends to make this mistake.

Which means other front end systems basically need to "do what pyproject-hooks does" instead of follow the spec.

So either we accept that pyproject-hooks implementation is the defacto spec, as uv has had to do, update the spec, or update pyproject-hooks.

I did not interpret the conclusion of that thread that we accept pyproject-hooks implementation as the defacto spec.

pfmoore commented 4 months ago

The required hook signature is as described in the spec:

prepare_metadata_for_build_wheel(metadata_directory, config_settings=None)

which means that callers can supply positional or keyword arguments, and config_settings is optional.

Based on that specification, this project is calling the hook correctly. The question isn't whether the call is correct, but whether pyproject-hooks should validate the hooks as well as just wrapping them. I don't believe that it should, but ultimately that's a decision for @takluyver to make.

I did not interpret the conclusion of that thread that we accept pyproject-hooks implementation as the defacto spec.

Nor did I. The conclusion of the thread was that enscons is at fault here for not implementing the spec correctly. Therefore, the correct solution is to fix enscons. IMO changing this project to also trigger the error in enscons is an over-reaction.

pradyunsg commented 4 months ago

I do wish we had some form of compliance suite for build backends. I don't think it belongs in this project though.

notatallshaw commented 4 months ago

The question isn't whether the call is correct, but whether pyproject-hooks should validate the hooks as well as just wrapping them

I think "validate" is a misrepresentation of what I am asking for here, for it to be full validation it must check that the argument are both positional and keyword compliant, and I am not asking for that.

I am asking for pyproject-hooks to enforce the parameter names because I beleive this will better force backends to follow the spec for two reasons:

  1. It is easier to get parameter names wrong than the position of parameters (e.g. typos)
  2. The ecosystem has been forced to get the positions correct because pyproject-hooks already implements it that way

Maybe my argument is weak here and you think getting the position of the parameter wrong is equally likely, that would be fair, it isn't my personal experience with writing Python though.

Nor did I. The conclusion of the thread was that enscons is at fault here for not implementing the spec correctly

I am not disagreeing with that, but you can't have your cake and eat it, the vast majority of the ecosystem is designed to build against pyproject-hooks, so in effect anywhere the spec can not be followed but pyproject-hooks still works and is easy to make a mistake has significant chance of the back backends not following the spec. So you can't say pyproject-hooks isn't the defacto spec when it is will not enforce certain properties of the spec, because as is, you end you with libraries like uv having to follow what pyproject-hooks does, even though they were previously following the spec.

Now that said, I don't mind if pyproject-hooks is called out as the reference implementation, and that other tools are told to follow pyproject-hooks, and also maybe I'm missing somewhere else where actual validation could be implemented.

pfmoore commented 4 months ago

you end you with https://github.com/astral-sh/uv/pull/3517, even though they were previously following the spec.

I don't think that's the case though. uv doesn't have to follow what this project does, it has to cater for a bug (that's been reported) in enscons. We'd have the same problem here, we can't change this project to be strict until enscons is fixed. And if you want tools to complain about the problem, why not start with uv, which as a newer project doesn't have the same level of backward compatibility pressure? "My project won't build with uv" should be just as compelling a problem for enscons, while not being a sudden breakage for people who've been happily using pip (or other pyproject-hooks clients) for ages.

notatallshaw commented 4 months ago

And if you want tools to complain about the problem, why not start with uv, which as a newer project doesn't have the same level of backward compatibility pressure?

If I was in charge of uv I would throw warnings or refuse to build.

But I am not in charge of uv, and they have their own motivations and incentives which are real for them and force them to follow what pyproject-hooks does, as is evidenced by them following what pyproject-hooks does.

takluyver commented 4 months ago

I broadly agree with @pfmoore here - a spec compliant backend needs to match both the naming and the order of parameters (and have default values where the spec says it should). So I don't think that either passing positional or keyword args is clearly preferable.

In the absence of a clear 'right' choice, I think stability should swing it: keep doing what we already do because the alternative creates more space for confusion (e.g. encsons < X won't work with pyproject-hooks > Y).

I agree that it's probably valuable to have some way of checking the hook signatures more thoroughly than this. I'm ambivalent about whether that belongs in this project or not - I don't think it needs to be, but on the other hand it may be such a small amount of code that it's better maintained here than making another package. :shrug:

pfmoore commented 4 months ago

If we did want to validate signatures, I’d say that we should directly check the function signature (via inspect.signature) rather than by trying different ways of calling the hook. But beyond that, I’m mostly indifferent about whether we do it (call it -0 if you want to put it in vote terms).

notatallshaw commented 4 months ago

Thanks for your consideration, I appreciate the point about stability. I'll close this issue, and hopefully someone can.cone up with a better idea. Perhaps there needs to be a test suite or something that build back ends can validate themselves against.