Open hauntsaninja opened 1 year ago
I feel like this is related. In https://github.com/python/typeshed/pull/9511 I tried to use lxml-stubs, but can't because it's not explicitly mentioned as a dependency of openpyxl. Only et-xmlfile is (which itself is based upon the xmlfile module from lxml, but that's beside the point).
Seeing https://github.com/python/mypy/pull/14737 makes me wanna request allowing those for stub_uploader as well, but that's only growing the list (unless you wanna directly depend on, and trust, mypy's non_bundled_packages).
On the note of more relaxed criteria and catching more non-wheel declared dependencies without running arbitrary code, maybe https://github.com/typeshed-internal/stub_uploader/pull/88 would help? It still won't catch all cases (including mine mentioned above), but I already know of at least two cases in typeshed where it would be beneficial.
Currently, EXTERNAL_REQ_ALLOWLIST is a free-for-all. I.e. once a package is on the allowlist, every other package can depend on it. It would be safer to specify exactly which stub packages can depend on which other packages. This would also allow us to be a bit looser in some cases. For example, the implications of types-obscure-pkg depending on other-obscure-pkg are lower than types-requests being able to depend on obscure-pkg.
We should also consider moving the allowlist out of the metadata.py file into a data file to separate code from data.
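A minimal sketch of what a per-package allowlist kept as data might look like. The JSON layout, the `load_allowlist` and `is_allowed` names, and the sample entries are all assumptions made for illustration; none of this is the current stub_uploader format.

```python
from __future__ import annotations

import json

# Hypothetical data file contents (assumption, not the real stub_uploader data):
# each stub distribution maps to the external packages it may depend on.
ALLOWLIST_JSON = """
{
    "types-obscure-pkg": ["other-obscure-pkg"],
    "types-docker": ["urllib3"]
}
"""


def load_allowlist(text: str) -> dict[str, frozenset[str]]:
    """Parse the data file into a mapping of stub distribution -> allowed externals."""
    raw = json.loads(text)
    return {stub: frozenset(deps) for stub, deps in raw.items()}


def is_allowed(allowlist: dict[str, frozenset[str]], stub: str, external: str) -> bool:
    """Accept an external dependency only for stubs that list it explicitly."""
    return external in allowlist.get(stub, frozenset())


allowlist = load_allowlist(ALLOWLIST_JSON)
print(is_allowed(allowlist, "types-obscure-pkg", "other-obscure-pkg"))  # True
print(is_allowed(allowlist, "types-requests", "other-obscure-pkg"))  # False
```

With this shape, adding an entry for one stub distribution grants nothing to any other distribution, which is the property the proposal is after.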
For example, the implications of types-obscure-pkg depending on other-obscure-pkg are lower than types-requests being able to depend on obscure-pkg.
Although there is the danger of a hacked maintainer account adding a dependency on types-obscure-pkg to types-requests. This case needs to be considered.
Okay, here is a suggestion. We allow:
A dependency on other typeshed packages if those only have (recursively) no outside dependencies.
Edit 2: Also, on upload of a package with outside dependencies, we reverse check whether there are other typeshed packages that depend on it but don't have the required outside dependencies in their allowlist.
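One way to read the suggestion, including the reverse check from Edit 2, is sketched below. This is an interpretation under stated assumptions, not stub_uploader code: the three dictionaries hold made-up sample data, and `reachable_stubs`, `transitive_externals`, `disallowed_externals`, and `broken_dependents` are hypothetical names. The sketch treats a stub-to-stub dependency as acceptable when every outside dependency it pulls in (recursively) is on the depending package's own allowlist; with an empty allowlist this reduces to "no outside dependencies at all".

```python
from __future__ import annotations

# Hypothetical sample data (assumption). Keys are typeshed stub distributions.
# STUB_DEPS: dependencies on other typeshed stub distributions.
STUB_DEPS: dict[str, list[str]] = {
    "types-requests": [],
    "types-docker": ["types-requests"],
    "types-caldav": ["types-requests"],
}
# EXTERNAL_DEPS: direct dependencies on packages outside typeshed.
EXTERNAL_DEPS: dict[str, list[str]] = {
    "types-requests": ["urllib3"],
}
# PER_PACKAGE_ALLOWLIST: externals each stub may pull in, directly or indirectly.
PER_PACKAGE_ALLOWLIST: dict[str, set[str]] = {
    "types-requests": {"urllib3"},
    "types-docker": {"urllib3"},
}


def reachable_stubs(stub: str) -> set[str]:
    """Return `stub` plus every typeshed stub it depends on, recursively."""
    seen: set[str] = set()
    stack = [stub]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(STUB_DEPS.get(current, []))
    return seen


def transitive_externals(stub: str) -> set[str]:
    """Collect every outside dependency reachable from `stub`."""
    return {ext for dep in reachable_stubs(stub) for ext in EXTERNAL_DEPS.get(dep, [])}


def disallowed_externals(stub: str) -> set[str]:
    """Outside dependencies `stub` would pull in that its allowlist lacks."""
    return transitive_externals(stub) - PER_PACKAGE_ALLOWLIST.get(stub, set())


def broken_dependents(uploaded: str) -> list[str]:
    """Reverse check: other stubs reaching `uploaded` without the needed allowlist entries."""
    return sorted(
        stub
        for stub in STUB_DEPS
        if stub != uploaded
        and uploaded in reachable_stubs(stub)
        and disallowed_externals(stub)
    )


print(disallowed_externals("types-docker"))  # set()
print(broken_dependents("types-requests"))  # ['types-caldav']
```

In the sample data, types-caldav is flagged because it reaches urllib3 through types-requests without an allowlist entry of its own.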
To illustrate the implications of this proposal: currently 6 typeshed packages depend on types-requests:
types-caldav
types-docker
types-hvac
types-requests-oauthlib
types-slumber
types-tensorflow
All of these currently pull in urllib3 due to types-requests depending on urllib3, but only types-docker explicitly lists urllib3 as an external dependency in its METADATA.toml file.
To clarify the proposal here: would we still allow these typeshed packages to depend on types-requests if (and only if) they explicitly also listed urllib3 in their requirements, and we had urllib3 in their per-package allowlists at stub-uploader?
At least the latter, yes. They don't necessarily need to list it in their METADATA requirements, as there's no necessity to make it into their dependency field in the package.
I can see the attraction of making allowlists per-package. The way that urllib3 has become an indirect dependency of so many stubs packages shows how easy it is for packages to become significant attack vectors (without us realising) once they've been added to the allowlist. For the specific case of urllib3, the ecosystem is probably going to have much bigger problems than the security of typeshed packages if that package is compromised in some way -- but it's a useful example. And as you say, if allowlists are per-package, then adding an entry to the allowlist doesn't have nearly the same implications in terms of security.
It will probably mean more busywork for us to maintain per-package allowlists in a separate repo, but I think it's probably worth it.
We could also add back a general allowlist for very popular packages in addition to the per-package allowlist if this proves to be too much busywork at a later date. If we put the allowlist into an external file, we should keep that possibility in mind when deciding on a format.
Currently, we're quite strict about what external dependencies stubs can use. We have a small allowlist here: https://github.com/typeshed-internal/stub_uploader/blob/b81ba3c5214667f1dfb1decb0c97b83214553833/stub_uploader/metadata.py#L166
Briefly, the reasons for this are: a) Python packages can execute arbitrary code at install time, b) type checkers sometimes install stub packages automatically, c) stub packages are quite popular, d) users likely expect stub packages to be inert.
So what would it take to remove the allowlist? We already have an important additional check in stub uploader: we ensure that every external dependency of a stub package is also a dependency of the upstream package. However, there's still a hole, in that stub dependencies of stub packages are not currently checked.
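The idea behind that existing check can be sketched roughly as follows. This is not the stub_uploader implementation: `canonical_name`, `requirement_name`, and `externals_not_in_upstream` are hypothetical helpers, the requirement strings are illustrative sample data, and the regex-based name extraction is a simplification (real code would more likely use a dedicated requirement parser such as the `packaging` library). Name normalization follows the rule given in PEP 503.

```python
from __future__ import annotations

import re


def canonical_name(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requirement: str) -> str:
    """Extract the project name from a requirement string (simplified parsing)."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    if match is None:
        raise ValueError(f"cannot parse requirement: {requirement!r}")
    return canonical_name(match.group(1))


def externals_not_in_upstream(
    stub_externals: list[str], upstream_requires: list[str]
) -> list[str]:
    """Return the stub's external dependencies that upstream does not also declare."""
    upstream = {requirement_name(req) for req in upstream_requires}
    stub = {requirement_name(req) for req in stub_externals}
    return sorted(stub - upstream)


# Illustrative sample data (assumption): dependencies declared by an upstream package.
upstream_requires = ["urllib3<3,>=1.21.1", "charset_normalizer<4,>=2"]

print(externals_not_in_upstream(["urllib3>=1.21"], upstream_requires))  # []
print(externals_not_in_upstream(["evil-pkg"], upstream_requires))  # ['evil-pkg']
```

A non-empty result would mean the stub package asks for something the upstream package never installs, which is the case the check is meant to reject. The sketch says nothing about stub-to-stub dependencies, which is the hole described above.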
To spell things out, this is the scenario that we're concerned about:
Once we plug this hole, we could maybe get rid of the allowlist or have much more relaxed criteria.
This has been discussed in a couple places, mainly https://github.com/typeshed-internal/stub_uploader/pull/61#discussion_r979316520. I'm writing this up here as a way to easily communicate the current status quo to typeshed contributors.
Plugging the hole is a pretty easy change to make to stub_uploader, see diff here:
However, typeshed currently has twelve stubs that fail this test:
So it may be that there's some reasonable improvement that could be made to the check that works for these twelve packages. Or maybe we can just require these specific stubs to have individualised exceptions committed to stub_uploader. Or maybe it's not super viable to plug this hole and we need to keep the allowlist forever.
(Also note that the implementation of the check in that diff^ isn't perfect, since it only works for projects that are on PyPI and have wheels, and only checks the dependencies of the latest version)