python / typeshed

Collection of library stubs for Python, with static types
Other
4.36k stars 1.75k forks source link

Maybe change METADATA.toml "requires"? #8312

Open hauntsaninja opened 2 years ago

hauntsaninja commented 2 years ago

The context is I'm looking into #5768.

There's a lot of logic we have that either assumes or checks that entries in "requires" are typeshed stub packages. It seems potentially useful to split "requires" into "requires_typeshed" (a list of required packages that originate from typeshed) and "requires_external" (a list of external dependencies). Doing so could e.g. help avoid issues where we have typos in some requires_typeshed deps or potentially implicitly trust packages that start with "types-". Could also make it easier to build tooling (e.g. knowing what we could put on MYPYPATH to test against latest typeshed).

srittau commented 2 years ago

I'd prefer to keep just one requires field. We ship METADATA.toml with the types package in question and for external users the source of the dependency should not matter and seems rather artificial. For internal purposes it seems rather trivial to check whether a dependency is internal.

hauntsaninja commented 2 years ago

I checked on grep.app and I think we're the only people who use requires, so not sure we should worry much about external users. I guess how much tooling and security benefit this would bring is debatable, but at the minimum this would be nice typo protection.

Anyway, your opinion should be the default here, so I won't push for this change. Are you on board with the rest of https://github.com/typeshed-internal/stub_uploader/pull/59, assuming requires_typeshed becomes a derived property computed from filtering the requires field of METADATA.toml? I think adding something to the Metadata class is the sanest way to make sure all the old call sites that assume all deps come from typeshed only see typeshed deps. If not, I'll just close it out

srittau commented 2 years ago

I checked on grep.app and I think we're the only people who use requires, so not sure we should worry much about external users.

I agree that this is more a theoretical concern at the moment.

Anyway, your opinion should be the default here, so I won't push for this change.

Well, your suggestion got a lot of upvotes, so it's quite possible I'm in the minority here, and it's also not an issue I'll fight tooth and nail over. So, I'd be interested what other people think.

Are you on board with the rest of https://github.com/typeshed-internal/stub_uploader/pull/59

While I haven't reviewed the PR in detail yet, because of the issues under discussion, the general idea seems sound to me. And it definitely makes sense to distinguish between "internal" and "external" dependencies.

AlexWaygood commented 2 years ago

I'd be interested what other people think.

I like @hauntsaninja's idea best.

If we adopted @Akuli's idea in https://github.com/python/typeshed/issues/5768#issuecomment-1251375258, it would reduce the concern regarding typos, since we could fairly easily assert that all dependencies either have names starting with types-, or are explicitly listed as dependencies of the runtime package in question. However, I still prefer @hauntsaninja's proposal. There's a tonne of places scattered across our test suite where we assume for various reasons that all packages listed in requires are internal typeshed packages. I like the simplicity of having the typeshed dependencies listed separately, so we don't have to separate out the typeshed dependencies from the non-typeshed dependencies in every test script.

AlexWaygood commented 2 years ago

It also feels slightly more "principled", to me, to list the typeshed-internal dependencies separately, rather than assuming that all types-* packages originate with typeshed (unless types-* names are reserved for typeshed somehow?)

hauntsaninja commented 2 years ago

To be clear, we should never make assumptions based on the types- prefix. I assume srittau's suggestion was to partition requires based off of listing the stubs directory in typeshed or checking https://github.com/typeshed-internal/stub_uploader/blob/main/data/uploaded_packages.txt in stub_uploader.

Fwiw, the two differ in two cases: stubs permanently removed from typeshed and stubs newly added to typeshed. In stub_uploader we should probably only use uploaded_packages.txt to avoid sketchy things like race conditions where someone adds something to typeshed then squats the distribution before stub_uploader uploads it or bugs caused by things happening out of order when stubs are deleted (e.g. if we continue to have types-cryptography deps after deleting cryptography stubs).

AlexWaygood commented 1 year ago

@hauntsaninja, are you still interested in pursuing this, or shall we close this for now?

hauntsaninja commented 1 year ago

I think this is still worth doing, especially so now that we're starting to have typeshed packages that aren't just types-.

(Of course, stub_uploader will still need to validate, like it does today, and if we have any typeshed CI workflows with write permissions we'd also need to validate)