srittau commented 5 years ago

This is an alternative to #2440 (disallowing third-party stubs). The idea is that typeshed remains/becomes a central repository for third-party stubs that are not bundled with the parent package, similar to DefinitelyTyped. In the future I expect type checkers will not want to bundle all third-party stubs for a variety of reasons, so third-party stubs would be distributed as separate PEP 561 stub-only packages, one per upstream package.

(I tried to integrate points raised there into this issue, especially those by @JukkaL in this comment.)

Advantages

Due to typeshed's tests, packages in typeshed will continue to work with the latest versions of mypy and pytype.
Basic level of consistency and standards due to review by typeshed maintainers.
Consistent naming scheme for third-party stubs packages, allowing users to just "pip install <guessed name>" and it will work when there are stubs.
Tooling (like tests) is easier to manage as it can remain part of the typeshed package.
Easier to contribute to stubs, since contributors don't need to learn the intricacies of multiple stubs projects.
No need to start a separate project just to distribute stubs for a new package.

Issues

Workload issues for typeshed maintainers.
Typeshed maintainers will often not be familiar with the package for which pull requests are opened.
Publishing stubs takes longer due to the necessary reviews.

Further Considerations

What should the generated packages be called? @ethanhs's PEP 561 actually requires stub-only packages to be named <package>-stubs. typeshed could squat these names and release them (and remove the stubs) at the request of upstream maintainers. Alternatively, typeshed could add a common prefix or suffix (ts, typeshed) in addition to or instead of the -stubs suffix. This would violate PEP 561, so we'd need to get broader consensus to amend the PEP. My personal favorite would be <package>-ts.

To guarantee a fairly quick turnaround on stubs, to minimize the work of publishing them, and to prevent all third-party stub packages from being updated whenever a new typeshed version is released, stubs for a specific third-party package should be published automatically when they change.

Possible Implementation

Add a generic setup-tp.py to typeshed that takes its package name from the directory it's in and uses the current date and time as version number.
Amend the CI process for master only so that after a successful test run, for every third-party package that was changed since the last successful run, the following is done automatically:
1. Copy setup-tp.py into the third-party module directory as setup.py.
2. Build the package in that directory.
3. Upload to pypi.
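
The "changed since the last successful run" step could be sketched roughly as follows. This is only an illustration: it assumes the current third_party/<version dir>/<package> layout and that the CI job already has a list of changed paths (for example from git diff --name-only). The function name is hypothetical and not part of typeshed.

```python
from pathlib import PurePosixPath
from typing import Iterable, Set


def changed_third_party_packages(changed_paths: Iterable[str]) -> Set[str]:
    """Return the top-level third-party packages touched by the changed files.

    Assumes paths of the form third_party/<version dir>/<package>/... or
    third_party/<version dir>/<module>.pyi (hypothetical helper).
    """
    packages = set()
    for raw in changed_paths:
        parts = PurePosixPath(raw).parts
        if len(parts) < 3 or parts[0] != "third_party":
            continue  # stdlib, tests, documentation, ...
        top = parts[2]
        if len(parts) == 3:
            # A file directly inside the version directory: only single-file
            # stub modules such as itsdangerous.pyi count as packages.
            if not top.endswith(".pyi"):
                continue
            top = top[: -len(".pyi")]
        packages.add(top)
    return packages
```

Each returned name would then go through steps 1 to 3 above, so a change to one stub never triggers a rebuild of the others.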

ethanhs commented 5 years ago

I haven't had time to think over the full proposal yet but I did want to correct a misunderstanding and add some relevant information.

First a minor (but important!) nitpick: PEP 561 is about package names. The things you install off of PyPI are distributions, which have zero or more packages. The example I always give to explain this is that you pip install Django but import django. django is the package, Django is the distribution.

Therefore, it would be possible to install, say setuptools-stubs, along with any number of other stub packages from the typeshed distribution. So we don't really need to squat names.

Speaking of squatting names, there has already been some discussion of this topic in https://github.com/pypa/warehouse/issues/4164. Donald Stufft is planning on namespacing packages in some form, so it is likely flask will end up with flask-stubs as a reserved distribution name anyway.

JukkaL commented 5 years ago

I like the proposal! Here are a few additional thoughts.

If we split (most) third party stubs to separate stub packages, type checkers could propose a relevant stub package to install if they can't find stubs for some package. This would be easy to do if third party stubs are managed centrally, but hard if they are spread out across numerous repositories.

I think that it would make sense to request permission proactively for the most popular packages on PyPI to be included on typeshed (I mentioned this in #2440). This would make it easier to contribute new stubs, as asking for permission is something many people are uncomfortable doing. I'm happy to try doing this if others think that this is a good idea.

I think that review workload is the biggest potential issue. I'm ready to volunteer to do more code reviews if we can pull some of the proposed changes off, in part because I expect they could well make code reviews easier. I have some ideas about making the code review workload manageable below.

First, I think that it would be a big help if we had tests for third party stubs. At the very least, we should have a .py file that exercises the most important functionality in the stub. A test would pass if mypy (or some other tool) could type check the test file without errors. This can be refined later, but I think that this would be sufficient in the beginning.

Second, if we have support for tests, we can require that any new third party stubs have decent test coverage. Also, most fixes to a stub should include a test. Code reviews will be simpler if we can trust that tests catch any egregious errors at least. Also, the presence of tests makes it more likely that the PR author has done a reasonable job, thus hopefully requiring fewer review iterations. We can ask the creator of a new stub to actually verify that the test file can be run against the library (not just checked). It's probably impractical to execute tests in typeshed CI, though.

Third, since third party stubs wouldn't be installed by default, errors in those stubs would be less serious than now, as it would be easy to uninstall the stubs if there are problems, and it would be easy to revert back to an earlier stub version. Reporting bugs in stubs back to typeshed would perhaps also be simpler, as it would be easy to attribute blame to a certain typeshed stub package. Due to these reasons, code reviews could be less strict for third party packages (i.e. no need to double check against the implementation if something is unclear), excepting maybe the most popular packages. Here my rationale is that it will be much easier to encourage users to contribute small fixes to not-very-polished stubs than to create a new set of stubs from scratch. Automatically having the fixes available on PyPI immediately after the PR has been merged could be another big motivating factor.

Finally, maybe we should consider requiring the use of a code auto-formatter for stubs (at least new ones), if there is one that works for stubs. Or we could maybe introduce more aggressive linting for new stubs. These would improve the readability of contributed stubs.

ilevkivskyi commented 5 years ago

There are a few questions I wanted to clarify about the proposal:

What to do with stubs for large frameworks that require plugins to work correctly? (SQLAlchemy, Django, NumPy, etc.)
How are we going to manage the versioning of stub packages?
Will incomplete stubs be allowed? Will we require __getattr__ for such to avoid false positives, or just not allow them at all?

JukkaL commented 5 years ago

What to do with stubs for large frameworks that require plugins to work correctly? (SQLAlchemy, Django, NumPy, etc.)

I think that these could be decided on a case-by-case basis. If the stubs don't generate false positives without a plugin, maybe the stubs can be included in typeshed. We don't need to include all stubs in typeshed, but I think that it's a good place for the "long tail" of stubs where nobody may be motivated enough to set up a separate repository just for stubs.

How are we going to manage the versioning of stub packages?

I proposed that we could automatically generate a version number from the last modified date.

Will incomplete stubs be allowed? Will we require __getattr__ for such to avoid false positives, or just not allow them at all?

In my opinion __getattr__ should be acceptable for large or complex packages, but we should try to avoid it for small and medium sized stubs.

JelleZijlstra commented 5 years ago

Finally, maybe we should consider requiring the use of a code auto-formatter for stubs (at least new ones), if there is one that works for stubs. Or we could maybe introduce more aggressive linting for new stubs. These would improve the readability of contributed stubs.

https://github.com/ambv/black supports auto-formatting stubs. (I added the support.)

sproshev commented 5 years ago

I proposed that we could automatically generate a version number from the last modified date.

This means the version number would have no connection to the runtime package version and would be non-descriptive as a result. Imagine that some environment requires numpy=1.14.1 to be installed: how would a developer figure out which stub package version to use?

sproshev commented 5 years ago

We at PyCharm ran into this problem while developing a stub package advertiser. There is no way to determine which stub package version fits the installed runtime package without installing them one by one, from the newest to the oldest.

JukkaL commented 5 years ago

@sproshev Fair point. I'm not sure if there's anything we can do that would always work reliably.

We could perhaps include the package version in typeshed as metadata. The stub package version could be derived from that (and incremented by 1 for each update). For example, the stub package versions for numpy 1.14.1 could be something like 1.14.1, 1.14.1+1, 1.14.1+2, etc. However, in practice it would be quite likely that stubs wouldn't fully conform to any package version. Maybe the version would only be a best-effort hint, and would roughly correspond to the latest package version that is known to have some support by the stub (i.e. works in a reasonable fashion, but is not necessarily complete).
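
As a rough sketch of that numbering, assuming the list of already published stub versions is available from somewhere (the function name and its inputs are hypothetical):

```python
from typing import Iterable


def next_stub_version(package_version: str, published: Iterable[str]) -> str:
    """Return the next stub version for `package_version`.

    The first release reuses the package version unchanged; later releases
    append "+N" with an increasing counter, as in the example above.
    """
    prefix = package_version + "+"
    counters = []
    for version in published:
        if version == package_version:
            counters.append(0)
        elif version.startswith(prefix) and version[len(prefix):].isdigit():
            counters.append(int(version[len(prefix):]))
    if not counters:
        return package_version
    return f"{package_version}+{max(counters) + 1}"
```

One caveat: in PEP 440 a "+N" suffix is a local version label, which the PEP says should not be used when publishing to a public index, so a real scheme might need a post-release or an extra numeric segment instead.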

I don't think that we need anything perfect here. I'd be happy with an approach that works well for simple cases (most modules probably have pretty simple interfaces) and supports gradual refinement for more complex cases. The key would be making it easy for users to contribute. Crowdsourcing seems the only feasible approach for stubs.

Other ideas for guidelines:

We'd normally only actively maintain one set of stubs per package (the latest supported). Stubs for older versions would only be available on PyPI (and git history). Maybe we can make exceptions for certain very popular libraries if they break backward compatibility without changing the package name, but hopefully this is pretty rare.
For existing stubs we should perhaps initially pick some arbitrary version (such as 0.0.1) until somebody validates which version the stubs best conform to.

ethanhs commented 5 years ago

Okay, so after thinking this over, I decided I really like this idea. I definitely think it would be a good idea to reach out to the typescript folks and hear about their experiences, since they likely have already hit problems we will in executing this plan.

As for versioning, this is actually covered in PEP 561. Essentially, the stub package declares which version(s) of the runtime package it supports in its install_requires. Therefore we should be able to tell which packages fulfill the requirements based on the requires_dist metadata in a package.

At worst, you have to download (but not install!) several wheels. I think this gets a lot easier if we work with the warehouse folks to make sure the metadata we need is available through their JSON API (some if not all of it already is).
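
A minimal sketch of that lookup, assuming the specifier for the runtime package has already been extracted from each stub release's metadata. The helper names and the example data are made up, and the hand-rolled matcher only handles plain numeric versions; a real tool would more likely rely on the packaging library.

```python
import re
from typing import Dict, Optional, Tuple

Version = Tuple[int, ...]


def parse_version(text: str, width: int = 4) -> Version:
    """Parse a numeric dotted version such as "1.14.1", zero-padded to `width`."""
    parts = [int(part) for part in text.strip().split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def satisfies(installed: str, specifier: str) -> bool:
    """Check `installed` against a comma-separated specifier like ">=1.4,<2".

    Only the operators ==, >=, <=, > and < are handled in this sketch.
    """
    have = parse_version(installed)
    for clause in specifier.split(","):
        match = re.fullmatch(r"\s*(==|>=|<=|>|<)\s*([\d.]+)\s*", clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        op, want = match.group(1), parse_version(match.group(2))
        ok = {
            "==": have == want,
            ">=": have >= want,
            "<=": have <= want,
            ">": have > want,
            "<": have < want,
        }[op]
        if not ok:
            return False
    return True


def newest_matching_stub(installed: str, releases: Dict[str, str]) -> Optional[str]:
    """Pick the newest stub release whose specifier accepts `installed`.

    `releases` maps a stub version to the specifier it declares for the
    runtime package (hypothetical input, e.g. read from requires_dist).
    """
    candidates = [v for v, spec in releases.items() if satisfies(installed, spec)]
    return max(candidates, key=parse_version, default=None)
```

With metadata available up front, the choice becomes a pure lookup and nothing has to be installed to find a fitting stub version.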

With regards to partial packages, I would say people can start with their own incomplete packages, and once they pass a minimum bar, they can be brought into typeshed.

As for plugins, I don't really see an issue with including them, because they are not likely to cause issues with other type checkers, and they don't take that much disk space. Alternatively, we could do some packaging magic and make a separate plugin package that gets installed if you ask for that extra (e.g. pip install pkg-stubs[plugin]). That wouldn't be too painful as we can automate a lot of it.

Lastly, and probably most importantly, reviewing the stubs. I am also happy to step up and do more review as well. I also think pulling something like the mypy test suite out of mypy and making it more generic is a good idea, as is requiring new packages/improvements to be tested.

Related to reviewing, I have been inspired by Marietta to hold "office hours". The idea is to help people with static typing and PEP 561 packaging for probably about an hour every week. I am hoping to start holding hours sometime this month. If there are others who want to do this too perhaps we could coordinate.

srittau commented 5 years ago

For reference, better tests are discussed in #754.

gvanrossum commented 5 years ago

+1

srittau commented 5 years ago

I have created a new branch third-party-dist for code related to this issue and submitted a first pull request (#2545) against that branch that adds a script to build wheels for individual third-party packages.

srittau commented 5 years ago

I suggest we start adding METADATA files to third party stubs that for now only contain a Requires-Dist field per @ethanhs. This could then be merged into the wheel's METADATA file on build. This would mean we also need to convert simple modules into packages (itsdangerous.pyi to itsdangerous/__init__.pyi), but the build script needs to do that anyway.

Another question is how strict the version specifier is supposed to be. For example, if a package uses semantic versioning and the stub was written for version 1.4.1, what should we use? Possibilities are:

>= 1, < 2
>= 1.4, < 2
>= 1.4, < 1.5
== 1.4.1

I would suggest using the current minor level as the lower bound, since using the stub with older versions will not catch some problems, such as using API additions from newer versions. On the other hand, using the next major version as the upper bound seems fine. This could give false positives when using API not yet supported in the stub, but you can use newer versions without using newer API, and it can be an encouragement to contribute to typeshed. So I'd pick the second option above.
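
To make the second option concrete, here is a small sketch that derives such a specifier from the version the stub was written for. It assumes the upstream package follows semantic versioning; the function name is hypothetical.

```python
def requires_dist(name: str, stub_written_for: str) -> str:
    """Build a Requires-Dist value: current minor as lower bound, next major as upper.

    Assumes a semantic version such as "1.4.1"; a missing minor counts as 0.
    """
    parts = stub_written_for.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return f"{name}>={major}.{minor},<{major + 1}"
```

For the example above, a stub written for foo 1.4.1 would declare foo>=1.4,<2.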

srittau commented 5 years ago

An additional idea about the generated version number. Currently the build script uses 0.YYYYMMDD.HHMM. We could add a custom field Stubs-API-version or similar to METADATA, defaulting to 0.0. This will then get prepended to the date/time in the stub package version number. In our example above, this version would be 1.4 and a generated version could be 1.4.20181026.0815. This would allow users to add this to their requirements.txt:

foo == 1.4.1
foo-ts >= 1.4, < 1.5
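
A sketch of how such a version string could be assembled; the function name is hypothetical and the build time is passed in explicitly so the result is reproducible:

```python
from datetime import datetime


def stub_package_version(now: datetime, api_version: str = "0.0") -> str:
    """Prepend the Stubs-API-version value to a YYYYMMDD.HHMM timestamp."""
    return f"{api_version}.{now:%Y%m%d}.{now:%H%M}"
```

Tools that normalize versions according to PEP 440 read each segment as an integer, so a trailing 0815 is treated as 815; ordering within a day is not affected by that.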

bluetech commented 5 years ago

Some questions (mostly for @srittau's build-dist.py script):

Does it allow dependencies between stub packages? There are cases where package A's stubs wants to refer to package B's stubs.
IIUC the proposed name for stub packages is $package-ts. Leaving bikeshed aside, how would "squatting" be prevented? As far as I know, PyPI doesn't support package namespaces or reserving a range of names *-ts.

I also have a comment regarding the structure of the repository once every third party stub is a package. The current structure is third_party/{2,3,3.5,2and3}/$package and the script parses the path and writes Requires-Python and trove classifiers accordingly. In my opinion it would be better to do away with the 2,3,3.5,2and3 part and move that into the regular package metadata instead. This way the location for a package is consistently third_party/$package (and this can also be set in the Home-page field).

srittau commented 5 years ago

One thing I'd like to do is "METADATA merging". I think it makes sense to have a METADATA.tmpl file that provides sensible defaults for stub packages. But this could then be amended with a per-package METADATA file that adds or overwrites the default fields. This METADATA file could then express dependencies.
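
A minimal sketch of such merging, assuming both files use the usual "Field: value" header format of package metadata. The function names and the field values in the example are made up.

```python
from email.parser import Parser
from typing import Dict, List


def parse_metadata(text: str) -> Dict[str, List[str]]:
    """Parse "Field: value" lines; repeated fields are collected into a list."""
    fields: Dict[str, List[str]] = {}
    for key, value in Parser().parsestr(text).items():
        fields.setdefault(key, []).append(value)
    return fields


def merge_metadata(template: str, package: str) -> Dict[str, List[str]]:
    """Start from the template defaults and let per-package fields replace them.

    Field names are matched exactly here, so both files are assumed to use
    the same capitalization.
    """
    merged = parse_metadata(template)
    merged.update(parse_metadata(package))
    return merged
```

For example, a template with Requires-Python: >=3 merged with a per-package file containing Requires-Python: >=2.7 and two Requires-Dist lines yields the overridden Python requirement plus both dependencies.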

I also agree that moving all packages to the third_party directory and removing the Python version directories would be good. But we can only do that after the type checkers have stopped distributing third-party stubs, otherwise we would break them.

Regarding the suffix and squatting: This is something that we have to check with the warehouse maintainers. They might have opinions and ideas about that.

bluetech commented 5 years ago

METADATA merging

This sounds good to me. Although I think outright duplicating the file in every package should also be considered. That would be more direct and easy to understand, and since the packages are managed centrally in a single repository, every mass change can simply be applied at once across all of them. And it also allows for gradual (intentionally non atomic) changes if that's ever needed.

But we can only do that after the type checkers have stopped distributing third-party stubs, otherwise we would break them.

Are there currently any type checkers that install typeshed dynamically rather than vendor or pin it? If they all vendor it, they only need to adapt when they update typeshed, no?

srittau commented 5 years ago

Not a mypy expert, but from my understanding at least mypy is vendoring typeshed using git submodule. Changing it now would break mypy's build process and would need quite a bit of work to fix, for what is a temporary situation.

srittau commented 5 years ago

I have now reached out to the pypa team in pypa/warehouse#4967.

ethanhs commented 5 years ago

@srittau Mypy uses typeshed as a submodule, but submodules are pinned to certain commits (which we manually update). The tests within typeshed would probably be quite broken since those use the master branch of typeshed instead of the submodule, but I expect changing the format of third_party would require corresponding lock-step changes in both pytype and mypy anyway.

As for package versioning, I think there are two fundamental features we want from a version:

1) major version: This allows us to differentiate and support different major versions of big packages. (DefinitelyTyped has different folders for packages with two major versions)

2) some incremental patch version: I think type stubs are likely to be some best effort approximation of the latest minor version (since I don't think anyone wants a different copy of the stubs for each minor version). Therefore, we can increment a patch number, so that if there is an error in the stubs that is serious or breaks something, people can roll back to an earlier release.

Therefore I propose just MAJOR.patch, where MAJOR is not changed.

srittau commented 5 years ago

I think I expressed myself wrong about mypy: It's less about the build process, but more about the runtime behavior. From what I understand, mypy (and I guess pytype too) looks into the versioned subdirectories inside of typeshed to find the appropriate stubs for a Python version. If we'd reshuffle the third-party directory at this point, the type checkers would not find the third-party stubs anymore. Therefore, I'd propose the following timeline:

Automate this process and automatically upload third-party packages to pypi.
Wait for type checkers to drop third-party packages from their distribution.
Restructure the third_party directory.

ethanhs commented 5 years ago

I understand what you mean, I was trying to say that it won't break anything (using mypy) right now, since we are pinned to an older commit. We will just have to handle the restructuring before updating the pin next time (usually happens every few weeks or so). But if you want to delay the restructuring that is fine too.

srittau commented 5 years ago

List of tasks to make this a reality:

[ ] Install a test instance of warehouse.
[ ] Write the CI script that builds and submits changed packages. I looked into this and this should be doable within travis-ci.
[ ] Add namespace support to warehouse. This is the biggie. (pypa/warehouse#2589)
[ ] Merge Python 2 and 3 versions of itsdangerous. (#2564)
[ ] Merge Python 2 and 3 versions of six.

We also need to decide on a namespace for typeshed stubs. I'd suggest just using types., similar to what DefinitelyTyped is using, for the following reasons:

It is an easy name to remember for users as opposed to typeshed. or ts.
It clarifies that we aim to be a central repository for types.
Using the same name as typescript makes it harder to use a wrong name for users of both.

ethanhs commented 4 years ago

I want to move forward with this, and I don't want to wait for namespacing to be implemented in warehouse (and don't have the time to implement at the moment). I think it is okay if we do types-* as that is what Donald Stufft suggested.

I have a proposal based on a discussion I had with several people on the mypy team (@JukkaL please feel free to clarify or correct anything here if I got it wrong):

Package the stdlib as its own package. See PR I just opened #3656. I already registered types-stdlib on pypi.
Split the third party stubs into their own directories. The directory structure would look like:

stubs/
    requests/
        __init__.pyi
        2/
        3/
        METADATA.toml

The METADATA file would specify:

stub package version. This would be something like PACKAGE_VERSION.TYPESHED_INCREMENT, where PACKAGE_VERSION is the minimum runtime package version supported by the stubs, and TYPESHED_INCREMENT is a bump every time the stubs are modified. Stubs can have types that are in newer versions, but cannot have breaking changes without bumping the PACKAGE_VERSION.
requirements. This allows stubs to rely on each other.

As for how to actually format the packaging, I think we should generate up to 3 packages:

types-requests, which is either requests/2and3 (if it exists) or if there are separate 2/3 folders, it will select at install time which to install based on the Python installing it.
types-requests-py2 which is requests/2 (if it exists)
types-requests-py3 which is requests/3 (if it exists)

This packaging scheme was chosen so that people get what they want in the simple case (stubs compatible with the Python they are installing with), but if they want to manually install Python 2 stubs on Python 3 they can.
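
The naming and selection rules above could be sketched like this. The function names are hypothetical, and the set of variants is assumed to be read from the directory names under stubs/<package>/.

```python
from typing import List, Optional, Set


def distribution_names(package: str, variants: Set[str]) -> List[str]:
    """List the distributions to generate for one stub package.

    `variants` is the subset of {"2", "3", "2and3"} that exists for it.
    """
    names = [f"types-{package}"]  # always generated
    if "2" in variants:
        names.append(f"types-{package}-py2")
    if "3" in variants:
        names.append(f"types-{package}-py3")
    return names


def variant_for_installer(variants: Set[str], python_major: int) -> Optional[str]:
    """Choose which directory types-<package> provides for the installing Python."""
    if "2and3" in variants:
        return "2and3"
    key = str(python_major)
    return key if key in variants else None
```

In practice python_major would come from sys.version_info.major of the interpreter running the install.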

As for a transition plan, in our discussions mypy would just require (in setup.py) the stub packages that are in typeshed already so people's code doesn't break. We would probably require and pin stdlib for now.

I think much of this infrastructure can be based off @srittau's previous work.

asvetlov commented 4 years ago

Eventually all third-party stubs become types-<name>-py3, right? py3 suffix seems redundant. Would you consider -py2 for Python 2, -py23 for Python 2 and 3, and just empty for Python 3 only stubs?

srittau commented 4 years ago

Thank you for moving forward with this! I can offer what limited time I have to help if you need anything. A few comments:

I agree with asvetlov that Python 3-only packages should have no suffix as eventually this will be redundant. Also, if there ever is a Python 4, existing Python 3 packages will most likely continue to work, so this naming would be confusing.

Regarding TYPESHED_INCREMENT: I see problems with manually incrementing this number, depending on how we publish releases:

In case of automatic publishing after each change, each PR would need to include this increment, which would mean a lot of back and forth with unaware contributors and possible conflicts if multiple PRs want to change the same third-party stubs. Alternatively, a GitHub Action could automatically increase this number after each merge, but this would add unnecessary commits to the history.
In case we want to publish the stubs manually, we need to go through all changes to find third-party stubs that have changed and increase their version number.

I think the best approach is automatic publishing after changes and using the current date/time as TYPESHED_INCREMENT as part of the build process.

srittau commented 4 years ago

I accidentally pushed 0b7f7b4215abc4d902ef67ed53e8299a5bd4cb15 directly into the third-party-dist branch instead of opening a PR. Could I get a quick ad-hoc review here? Sorry and thanks!

ethanhs commented 4 years ago

Yeah I think @asvetlov is right, we shouldn't have a suffix for python 3 packages. In that case perhaps we just have the suffixes mirror the current folder layout (-2 and -2and3), but omit it in the case of Python 3?

RE versioning perhaps just using dates for the TYPESHED_INCREMENT would be better, I didn't realize that was compatible with PEP 440.

srittau commented 4 years ago

Another thing that didn't come up yet was the difference between package and distribution names (I hope I got the terminology right). For example the pillow distribution contains the PIL package (because it started as a drop-in replacement for PIL). pycrypto contains the Crypto package, other distributions might contain multiple packages, and multiple distributions could contain packages with the same name, but different APIs. I think it would make sense if our package structure could reflect that, for example by using the distribution name as top-level directory, which contains third-party metadata and the stub packages as sub-directories.

JukkaL commented 4 years ago

I'd love to get rid of the version-dependent subdirectories under stubs/package in the common cases. In the long term, I'd expect that the vast majority of stubs will only have either the 3 or 2and3 variant. Separate Python 2 and 3 stubs are mostly historical, I think, and there likely won't be many additional Python 2 only stubs.

Here's my proposal that would get us there:

By default, the stubs directly under stubs/package would be Python 3 only (as that will become the norm in the long term).
If stubs are also valid for Python 2, the stubs still live directly under stubs/package, but metadata would indicate that minimum Python version is 2.7 (the implicit minimum Python version would be 3.0 or 3.4, perhaps).
If there are separate stubs for Python 2 and 3, the stubs for Python 2 would live under stubs/package/2, while the Python 3 stubs would be under stubs/package. (I don't expect there to be many additional packages like this in the future.)
If there are only Python 2 stubs, they would live under stubs/package/2, and there would be no stubs directly under stubs/package.

This would have two main benefits (in the long term):

In the vast majority of cases, the "main" stubs could be found in stubs/package, without having to look for which subdirectory is the right one. This would always contain the Python 3 stubs.
There would be fewer special directory names (only 2).

I think that I'd also prefer that package names had only two options: types-requests (for Python 3 and possibly also Python 2) and types-requests-py2 (for Python 2 only). The motivation is similar:

Fewer options (2 instead of 3).
In the long term when everybody has migrated to Python 3, types-foo without a suffix would always be the right thing, even if it still supports Python 2 (consistency!).
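
The four layout cases above can be expressed as a small lookup. This is only a sketch: the class, its fields, and the function name are hypothetical stand-ins for whatever the metadata file would actually record.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubPackage:
    name: str
    has_main_stubs: bool             # stubs directly under stubs/<name>
    has_py2_dir: bool                # separate stubs under stubs/<name>/2
    main_supports_py2: bool = False  # metadata says minimum Python is 2.7


def stub_dir(pkg: StubPackage, python_major: int) -> Optional[str]:
    """Return the directory holding the stubs for the given Python, if any."""
    if python_major == 2:
        if pkg.has_py2_dir:
            return f"stubs/{pkg.name}/2"
        if pkg.has_main_stubs and pkg.main_supports_py2:
            return f"stubs/{pkg.name}"
        return None
    # Python 3 stubs always live directly under stubs/<name>.
    return f"stubs/{pkg.name}" if pkg.has_main_stubs else None
```

The Python 3 branch never needs to inspect a subdirectory, which is the simplification the proposal is after.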

srittau commented 4 years ago

My suggestion for a file system structure would actually look like this, integrating Jukka's suggestions (using pillow here to distinguish between distributions and packages):

stubs/
    Pillow/
        METADATA.toml
        PIL/
            __init__.pyi
            ...
        2/
            PIL/
                __init__.pyi
                ...

This would allow us to have both stubs for Pillow and PIL with differing metadata, even though both use the same package namespace. Not the best example, because PIL is basically obsolete, but sufficient to use as an example.

gvanrossum commented 4 years ago

So if the distribution and package name are the same, would the name be repeated, e.g. stubs/requests/requests/__init__.pyi? Or in that case stubs/requests/__init__.pyi?

srittau commented 4 years ago

For consistency I think it should be, even if it is a bit awkward.

JukkaL commented 4 years ago

I think that repeating the package name would be unfortunate. I'd go as far as saying that we should only support one set of stubs per package name, for simplicity (potentially with multiple versions, but still targeting the same package). In the case of PIL, we'd pick the most popular package and only include stubs for it.

If we'll ever have a case where there are two separate packages with the same name, both of which are sufficiently popular, we can come up with a special case mechanism to deal with that. I'd rather not add an extra level of nesting everywhere for a very rare special case (that may ultimately not even matter at all).

The deep level of nesting is one of my pet peeves about the current typeshed directory structure.

As in my previous comment, my primary motivation is to keep the common case simple. Having exceptions for rare cases seems better in my opinion than making the general case more complex to facilitate rare cases. Both imply some complexity, but in the prior case most of the complexity can be ignored most of the time. Also, we can postpone adding the complexity until there is a compelling need, instead of doing it upfront.

ethanhs commented 4 years ago

I agree with what Jukka says above. So I think the current proposal is:

stubs/
    distribution/
        packages
        METADATA.toml (indicates 2, 2and3, 3)

~Where a 3/ or 2and3/ folder becomes types-distribution and a 2/ folder becomes types-distribution-py2.~ We can publish types-distribution or types-distribution-py2 based on the metadata.

Also, I would like to move forward with https://github.com/python/typeshed/pull/3656, as I don't think it will be affected by what we decide on here. I think this would be a good first step in making typeshed modular, and carries little risk.

srittau commented 4 years ago

I still see this structure as problematic. Recent example I came across: It wouldn't be possible to create an accurate package for pytest, which includes pytest and _pytest modules.

ethanhs commented 4 years ago

Yeah, on further thought, I think we will probably have to use distribution names.

srittau commented 4 years ago

I wonder whether we can ditch the "stubs" (or "third_party") part, though, to minimize nesting. Currently, the only directories we have are "tests", "third_party", and "stdlib". The only "special" directories we'd have are "stdlib" and "tests", while all other directories would refer to package names.

Edit: Probably not a good idea, considering how many directories there potentially are.

ethanhs commented 4 years ago

Yeah, while I don't like the nesting, it would be more bothersome to have all the packages at the toplevel IMO.

If we are going to combine all of the 2/3 packages into 2and3, I believe that means we will only ever have one of: Python 2 type stubs, Python 2 and 3 stubs, OR Python 3 stubs. If that is the case, I think we can get rid of versioning of packages, and just include in the metadata file which one the stubs are for. That way we can get rid of one level of nesting.

JukkaL commented 4 years ago

I and @ilevkivskyi will have some bandwidth to work on this in April/May. We discussed various options offline, and I can write a complete draft proposal next week (based on the discussion above and offline discussions).

It wouldn't be possible to create an accurate package for pytest, which includes pytest and _pytest modules.

To deal with pytest and _pytest, we could provide metadata saying that pytest includes _pytest, and that _pytest is an "internal" package and shouldn't get a dedicated PyPI stub package. So we'd have stubs/pytest/__init__.pyi and stubs/_pytest/__init__.pyi, but they'd be logically treated by the packaging tool as a single entity. Having multiple packages per distribution seems pretty rare, and adding an extra nesting level just for this use case seems not worth it to me.
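
As a sketch of how a packaging tool might apply such metadata (the function name and the shape of the "includes" mapping are assumptions, not an agreed format):

```python
from typing import Dict, List


def published_groups(includes: Dict[str, List[str]], all_packages: List[str]) -> Dict[str, List[str]]:
    """Map each published stub package to the stub directories it ships.

    `includes` says which internal packages belong to which public package;
    internal packages do not get a stub package of their own.
    """
    internal = {inner for inners in includes.values() for inner in inners}
    return {
        pkg: [pkg] + includes.get(pkg, [])
        for pkg in all_packages
        if pkg not in internal
    }
```

With {"pytest": ["_pytest"]} as metadata, stubs/pytest and stubs/_pytest would be bundled into a single stub package while the directory layout stays flat.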

If that is the case, I think we can get rid of [Python 2/3] versioning of packages, and just include in the metadata file which one the stubs are for.

I think that the supported Python version should be described in the metadata only. For stdlib we'll probably need a way to have separate stubs for Python 2 and 3, since there are many stubs that haven't been merged, but I hope that this is unnecessary for third-party packages.

Yeah, while I don't like the nesting, it would be more bothersome to have all the packages at the toplevel IMO.

I don't think that it would be desirable to have all the packages at the repo top level, since if we are successful in growing typeshed, there could be thousands of packages there in the future, making it awkward to navigate the repository, and to find non-stub things such as tests and documentation.

ethanhs commented 4 years ago

So we'd have stubs/pytest/__init__.pyi and stubs/_pytest/__init__.pyi, but they'd be logically treated by the packaging tool as a single entity.

This seems like an unfortunate hack, and would probably confuse people who didn't know about it. If we kill the Python versioned folders, but add the distribution names, we would have the same level of nesting we do now, which seems acceptable to me.

JukkaL commented 4 years ago

Here's a new proposal that I came up with together with @ilevkivskyi. I tried to summarize various topics discussed earlier, but I may have missed something.

Why are we proposing this?

Monolithic typeshed makes typeshed updates difficult, as all stubs must be updated in sync.
Errors in stubs in a monolithic typeshed are serious, as all users will be affected. Errors are more benign in a modular typeshed, as users can just avoid a bad package version.
Monolithic typeshed doesn’t easily scale to several hundreds or thousands of packages due to above issues.
Monolithic typeshed is inconsistent with PEP 561.

Expectations

In a 2+ year time frame, we expect these to be true:

A growing majority of stubs are for third-party packages, as opposed to the stdlib.
Most new stubs only support Python 3, as Python 2 has reached end of life.

In a 5-year time frame, these seem possible:

There are stubs for over a thousand third-party packages in typeshed.
Python 2 is no longer supported.

From these assumptions, these things follow:

Optimize for the case of adding a third-party package that supports Python 3 only, while supporting other use cases as well.
Dropping support for Python 2 shouldn't require any major reorganization of the repository.

Structure of third-party stubs

Store stubs under stubs/<distribution>. Each distribution-specific directory can contain one or more .pyi files and/or stub packages. Each directory must also contain a METADATA.toml file.

Example:

stubs/requests/METADATA.toml
stubs/requests/requests/__init__.pyi
...
stubs/PyYAML/METADATA.toml
stubs/PyYAML/yaml/__init__.pyi
...
stubs/typing-extensions/METADATA.toml
stubs/typing-extensions/typing_extensions.pyi
...
stubs/pytest/METADATA.toml
stubs/pytest/_pytest/...
stubs/pytest/pytest/...
...

We'll have a separate PyPI stub package per distribution, named <distribution>-stubs.
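The layout above can be sketched as a small directory walk. This is only an illustration of the proposed structure, not part of any existing typeshed tooling; the function name `find_distributions` is hypothetical.

```python
from pathlib import Path


def find_distributions(stubs_root: Path) -> dict[str, list[str]]:
    """Map each distribution directory under stubs/ to its top-level stub modules."""
    result: dict[str, list[str]] = {}
    for dist_dir in sorted(p for p in stubs_root.iterdir() if p.is_dir()):
        # The proposal requires a METADATA.toml in every distribution directory.
        if not (dist_dir / "METADATA.toml").is_file():
            raise ValueError(f"{dist_dir.name}: missing METADATA.toml")
        modules = []
        for entry in sorted(dist_dir.iterdir()):
            if entry.is_dir() and (entry / "__init__.pyi").is_file():
                modules.append(entry.name)  # stub package, e.g. requests/
            elif entry.suffix == ".pyi":
                modules.append(entry.stem)  # single-module stub, e.g. typing_extensions.pyi
        result[dist_dir.name] = modules
    return result
```

With this layout, a distribution such as pytest simply lists two top-level packages (`_pytest` and `pytest`) without any extra metadata.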

Third-party stub metadata

Only one piece of metadata is required, version. This is an approximation of which version of the distribution the stubs correspond to. If unknown, this can be left 0.0.1 (but still needs to be specified). There is a gray area where there are no hard rules, such as stubs that mostly correspond to acme 2.0 but have some 2.1 and 2.2 features (but not all of them).

Optional items include these (in METADATA.toml):

python2 = true/false

Does this support Python 2? Default: false

python3 = true/false

Does this support Python 3? Default: true

requires = [list of stub packages]

Stub package dependencies. Only stub packages included in typeshed can be depended on (for security). These can also contain version information, similar to requirements.txt (e.g. requires = ["acme-stubs>=2.3"]). Default: empty

We can later define additional metadata as needed. The above metadata is about the bare minimum.

Example:

python2 = true
requires = ["acme-stubs>=2.3"]

Stdlib stubs

The standard library stubs are shipped as a single package, and they follow a different structure due to legacy reasons. They can also be bundled inside a type checker distribution.

The only metadata for stdlib is version information: minimum Python version for each package. This file is named VERSIONS and follows a structure like this:

foo 3.7
pkg 2.7

Example:

stdlib/VERSIONS
stdlib/foo.pyi
stdlib/pkg/__init__.pyi
...
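Reading the VERSIONS file could look roughly like the following. This is a sketch that assumes exactly the two-column "name X.Y" format shown above, with no comment syntax; `parse_versions` and `is_available` are hypothetical names.

```python
def parse_versions(text: str) -> dict[str, tuple[int, int]]:
    """Parse 'module X.Y' lines into {module: (major, minor)}."""
    versions: dict[str, tuple[int, int]] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        module, version = line.split()
        major, minor = version.split(".")
        versions[module] = (int(major), int(minor))
    return versions


def is_available(versions: dict[str, tuple[int, int]], module: str,
                 python_version: tuple[int, int]) -> bool:
    """True if the module's minimum version is at most the target version."""
    return module in versions and python_version >= versions[module]
```

Comparing `(major, minor)` tuples means an entry such as `pkg 2.7` covers both Python 2.7 and every Python 3 release, which matches the "minimum Python version" reading of the file.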

Some stubs have separate versions for Python 2. These will be put in the directory stdlib/2.7 (we use a dot as it's not valid in a package name):

stdlib/2.7/foo.pyi
stdlib/2.7/pkg/__init__.pyi
...

If a stub is only available for Python 2, it will only be stored under 2.7. So the top-level stdlib/ directory contains stubs for Python 3 only and stubs that support both Python 2 and 3. The motivation is that the Python 3 stub for arbitrary stdlib package foo is now consistently at stdlib/foo, unlike the current situation where it can be in multiple locations. I expect that the significance of Python 2 will wane in the next year or two.
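One way a type checker might resolve a stdlib stub under this layout is sketched below. The search order (stdlib/2.7 first, then stdlib/ when targeting Python 2) is an assumption drawn from the description above, and `find_stdlib_stub` is a hypothetical helper.

```python
from pathlib import Path
from typing import Optional


def find_stdlib_stub(stdlib: Path, module: str, python_major: int) -> Optional[Path]:
    """Locate the stub for a top-level stdlib module under the proposed layout."""
    # Python 2 prefers the dedicated 2.7 directory, then the shared top level.
    roots = [stdlib / "2.7", stdlib] if python_major == 2 else [stdlib]
    for root in roots:
        for candidate in (root / f"{module}.pyi", root / module / "__init__.pyi"):
            if candidate.is_file():
                return candidate
    return None
```

A real implementation would also need to consult VERSIONS before falling back to the top-level directory for Python 2, since a stub found there may be Python 3 only.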

Publishing changes to stubs

After each change to stubs, we'll automatically publish a new version of the relevant stub distribution. Also, if stubs for a new distribution are added, we'll automatically create a new PyPI distribution.

The version of the stub file is derived from the version metadata field. Initial release is named <distribution>-x.y-0, and we increment the final number on each update. For example, we can start with acme-2.0-0, followed by acme-2.0-1 and acme-2.0-2, followed by acme-2.1-0, and so on.

The next version number is determined dynamically by querying PyPI to avoid having to store additional per-distribution state.
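The numbering rule can be sketched as a pure function over the list of releases already published. Querying PyPI itself is left out here; `next_release` is a hypothetical name and `published` stands in for whatever the PyPI query returns.

```python
import re


def next_release(base: str, published: list[str]) -> str:
    """Return the next 'x.y-N' release given the releases already published."""
    pattern = re.compile(rf"^{re.escape(base)}-(\d+)$")
    # Collect the counters of all existing releases for this upstream version.
    counters = [int(m.group(1)) for v in published if (m := pattern.match(v))]
    return f"{base}-{max(counters) + 1 if counters else 0}"
```

Counters are compared numerically, so `2.0-10` correctly sorts after `2.0-9`.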

Security considerations

We'll set up a separate repository where all the scripts related to publishing packages on PyPI are stored along with the required upload key (the latter is of course hidden somehow and not stored in a git repository). Only a restricted number of typeshed maintainers will have access to this. In the future we may want to increase the number of typeshed maintainers, and in that case not every maintainer needs access to PyPI and build scripts.

The upload scripts will ensure that the packages only contain stub files (.pyi) and no executable code.
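A check of this kind could be as simple as the sketch below, run over a distribution's source directory before packaging. The allow-list of non-stub files is an assumption (the proposal only mentions .pyi files), and `disallowed_files` is a hypothetical name.

```python
from pathlib import PurePosixPath

# Assumption: besides .pyi stubs, only these non-executable files are allowed.
ALLOWED_NAMES = {"METADATA.toml", "py.typed"}


def disallowed_files(paths: list[str]) -> list[str]:
    """Return the paths that are neither .pyi stubs nor explicitly allowed."""
    bad = []
    for path in paths:
        p = PurePosixPath(path)
        if p.suffix != ".pyi" and p.name not in ALLOWED_NAMES:
            bad.append(path)
    return bad
```

An upload script would refuse to publish whenever this list is non-empty.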

Other ideas:

The upload will commence only after a delay of 5-10 minutes to allow the maintainers to react in case something unexpected happens.
Should we throttle the uploads so that only, say, at most 50 updates can be uploaded per 12 hours?

Implementation plan

Get agreement on the plan within typeshed maintainers.
Notify tool maintainers (such as PyType, Pyre, PyCharm) and ask for feedback. Address major concerns (whenever feasible). This period should last at least two weeks.
Migrate existing stubs to the new structure.
Write the stub upload script but don't upload packages yet.
Implement support for the new approach in mypy. Test the package support with mypy and fix any issues.
Upload the initial set of packages.
Manually run the upload script daily for a while, to validate that it works correctly.
Automate package uploads.
Public release of mypy that supports the new approach. (And other tools, hopefully.)

Possible extensions

I'm just giving a few examples of things that we could build on top of the proposal pretty easily. These would be discussed separately.

Testing stubs

Tests for stubs could be added as test*.py files, such as stubs/requests/tests.py. We'd type check each test file and make sure that exactly the lines tagged with a # E comment generate errors.

Example:

import acme

# This is expected to not generate an error.
acme.func(1, id="hoo")

# The following line should generate an error.
acme.func("x")  # E
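The comparison between tagged lines and reported errors might be sketched like this. It assumes the type checker's output has already been reduced to a set of line numbers; `expected_error_lines` and `check_test_file` are hypothetical names, and running the checker itself is not shown.

```python
import re

# A trailing "# E" comment marks a line that must produce an error.
E_COMMENT = re.compile(r"#\s*E\s*$")


def expected_error_lines(source: str) -> set[int]:
    """1-based line numbers tagged with a trailing '# E' comment."""
    return {
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if E_COMMENT.search(line)
    }


def check_test_file(source: str, reported: set[int]) -> list[str]:
    """Compare expected error lines with the lines a type checker reported."""
    expected = expected_error_lines(source)
    problems = [f"line {n}: expected an error, got none" for n in sorted(expected - reported)]
    problems += [f"line {n}: unexpected error" for n in sorted(reported - expected)]
    return problems
```

An empty result means the errors matched exactly; anything else fails the test for that stub package.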

Supporting multiple versions

We can support defining subdirectories such as stubs/acme/2.3 to also maintain stubs for an earlier version of a package. stubs/acme would contain the most recent version of the stubs, and any subdirectories would be for older releases that we still want to continue improving. The subdirectory would mirror the structure of the parent directory, but it would have a different version field in METADATA.toml (which must match the directory name).

We can also trivially support older versions of stubs, as long as older versions don't receive any updates, since PyPI packages are never deleted. The above approach allows releasing updates to older stub versions. This can be important if a package has incompatible API changes.

Additional metadata

Here are some ideas about metadata that could be useful:

Strictness options (to require annotations for all functions, for example)
Supported type checkers (if only works on some)
GitHub usernames of authors/owners
PyPI metadata (such as original author of the stub)

Open questions

How should we name the packages? Options include types-foo, foo-stubs and foo-ts at least. I don't have a strong opinion on this.
How to normalize distribution names, since pip supports many spellings?
Is <distribution>-x.y-N a good versioning scheme?
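On the normalization question, one existing convention is the name normalization defined in PEP 503, which pip and PyPI use when comparing project names. A sketch of applying it to distribution names follows; whether typeshed should adopt it for directory or package names is still an open question here.

```python
import re


def normalize(name: str) -> str:
    """Normalize a distribution name as described in PEP 503."""
    # Runs of '-', '_' and '.' collapse to a single '-', and case is ignored.
    return re.sub(r"[-_.]+", "-", name).lower()
```

Under this rule, `PyYAML` and `pyyaml` name the same distribution, as do `typing_extensions` and `typing-extensions`.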

srittau commented 4 years ago

Thank you for writing this up, this looks very good. A few notes:

If we just increment version numbers, there is a slight chance of race conditions and outdated PyPI caches. An alternative approach could be to have just one build per day, and use a date-based version number.
On a related note, something which you already alluded to are the rules for the version field: I think we should clearly recommend that only non-patch/bugfix parts of a version number should be included. E.g. for packages using semantic versioning, this would be X.Y.
Can we encode the upstream version number into our version number somehow? For example, x.y.stubsN or x.y.stubsYYYYMMDD (except that stubs is not allowed by PEP 440). This would allow users to have requires=["foo >= 1.2, < 2", "types-foo >= 1.2, < 2"] in their requirements. And in the case there is a known bug in older stubs: requires=[..., "types-foo >= 1.2.stubs20200403, < 2"].
The consensus for the package names was leaning towards types-*, this name coming from the packaging team. Using a prefix instead of a suffix means that we can eventually reserve the types namespace, when PyPA implements namespaces, which is important for security.

gramster commented 4 years ago

Overall I like this proposal; it's a great step forward to addressing some of the limitations of typeshed in its current form. I think it's important to note that type stubs are useful beyond just the realm of static type checkers. In particular, editors are increasingly leveraging type stubs to greatly improve the editing experience in dynamic languages. I.e., apart from type checking, stubs can be used to provide autocomplete and other features. Visual Studio Code, Jedi, and PyCharm can all leverage type stubs to provide a better user experience. And there are a number of challenges in these environments where there is both source and stubs available. Problems editors that use stubs have to solve include:

how to map method/function stubs back to source files, so that things like 'go to definition' work
showing docstrings on hover (or at other times) - this is related to the above. If method signatures come from stubs then these have to be reconciled back to source to get docstrings, or some alternative is needed.
reconciling type stubs with existing but possibly incomplete type hints in the source code, or incomplete type stubs with methods in the code missing from the stubs
automatically identifying the best type stub version to use for a particular package version. Users shouldn't have to explicitly point to or install type stubs for the packages they use; tooling should be able to manage this automatically and transparently.

Type inference in dynamic languages is difficult and the results will always be mixed, so leveraging type stubs is a way to work around this, particularly in egregious cases. Some of these issues pertain to the contents of type stub files, and some affect packaging and version resolution.

So as this moves forward, when discussing with tool maintainers, editors that consume type stubs, and autocompletion engines like Jedi, should be included in the discussion.

JukkaL commented 4 years ago

@srittau Hmm, I think that race conditions are possible even when using date-based versioning, so we may need something else. Here are some ideas:

Wait 5 minutes before uploading. If there are any more recent changes to the same stub package during the wait, skip upload (the more recent package will take precedence). This way we won't have two uploads very close to each other, reducing the likelihood of race conditions.
If we incorrectly pick an already existing version number, perhaps we can detect this as the upload would fail? We can then increase the version number and try again.
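The second idea (detect a collision and retry with the next number) could be sketched as follows. The `upload` callable and the `VersionExists` exception are hypothetical stand-ins for whatever the real upload tool provides; the five-minute delay from the first idea is not modelled.

```python
from typing import Callable


class VersionExists(Exception):
    """Hypothetical error raised when the chosen version is already published."""


def upload_with_retry(base: str, start: int, upload: Callable[[str], None],
                      max_attempts: int = 5) -> str:
    """Try base-N, base-(N+1), ... until an upload succeeds."""
    for counter in range(start, start + max_attempts):
        version = f"{base}-{counter}"
        try:
            upload(version)
        except VersionExists:
            continue  # this number is already taken; try the next one
        return version
    raise RuntimeError(f"no free version found after {max_attempts} attempts")
```

Bounding the number of attempts keeps a misbehaving upload from looping forever.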

I think we should clearly recommend that only non-patch/bugfix parts of a version number should be included. E.g. for packages using semantic versioning, this would be X.Y

Agreed, a good point.

Can we encode the upstream version number into our version number somehow?

I think that this should be possible, and quite useful. PEP 440 only seems to allow something like X.Y.Z, where X.Y are the first two components of the upstream version number, and Z is the auto-incrementing stub version, starting with 0. I'm okay with this, as long as we document it clearly (the final number has no relationship with upstream version numbers).
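The X.Y.Z scheme can be illustrated with a small helper. This is a sketch assuming upstream versions start with two numeric components; `stub_version` is a hypothetical name.

```python
def stub_version(upstream: str, stub_counter: int) -> str:
    """Combine the first two upstream components with the stub counter."""
    parts = upstream.split(".")
    if len(parts) < 2 or not all(p.isdigit() for p in parts[:2]):
        raise ValueError(f"cannot derive X.Y from {upstream!r}")
    # The upstream patch component is dropped; Z counts stub releases only.
    return f"{parts[0]}.{parts[1]}.{stub_counter}"
```

For example, the first stub release for upstream 2.3.1 would be 2.3.0, so a user pinning both packages to ">= 2.3, < 3" gets matching ranges.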

The consensus for the package names was leaning towards types-*, this name coming from the packaging team.

Sounds good to me, I'm happy with this.

@gramster Good points. I'm planning to send this to typing-sig@, which should let us reach maintainers of several tools. We can also explicitly contact other projects that we know about. Can you suggest other tools that might be affected by this?

gramster commented 4 years ago

Mostly the tools I already mentioned. Visual Studio Code's language server and Jedi definitely. PyCharm possibly although I can't speak for what they do.

LouisStAmour commented 4 years ago

So I was going to make a new issue but I’ll join in on this one. I see a lot of reinventing the wheel here. Why not copy what works for https://github.com/DefinitelyTyped/DefinitelyTyped/blob/master/README.md as much as we can?

Basically, the rule of thumb is that however you install a package, you can install an identical version of the package with some prefix to get that version’s types, ideally as compatible with your runtime as the upstream version. We should ideally automate checking for types in upstream published packages and then remove publishing ours when upstream wants to take over, e.g. https://github.com/DefinitelyTyped/DefinitelyTyped/blob/master/README.md#removing-a-package (like Python package type stubs, you can publish typescript definitions in Node packages also).

Note that compatibility between TypeScript Compiler versions is an issue when publishing exclusively using upstream version numbers. https://github.com/DefinitelyTyped/DefinitelyTyped/blob/master/README.md#i-want-to-use-features-from-typescript-29-or-above highlights how eventually we’ll need a standard to not just support multiple versions of Python but also support multiple versions of Type Checkers. Part of the tests can be ensuring a supported matrix of versions of type checkers work correctly with each type definition before release.

Draft language and standard library standards are also published with a common prefix: https://github.com/DefinitelyTyped/DefinitelyTyped/blob/master/README.md#i-want-to-add-a-dom-api-not-present-in-typescript-by-default Once fully approved and released, the standard moves to a separate project where it is bundled and shipped in the type checker, and you can pick which libraries you’re targeting in a checker-specific configuration syntax.

The only thing I would adjust is the use of semver by DefinitelyTyped folks. I would use the upstream version number including patch, and I would increment a number attached afterward. Such as 1.2.0.0 or in more of a Linux fashion, 1.2.0-0. https://github.com/DefinitelyTyped/DefinitelyTyped/blob/master/README.md#how-do-definitely-typed-package-versions-relate-to-versions-of-the-corresponding-library

They have a lot of infrastructure but they could always use more automation given their scale. Here’s one way they separate infrastructure issues/incidents from other kinds of issues https://github.com/DefinitelyTyped/DefinitelyTyped/issues/44317

They have an automated system to maintain a GitHub CODEOWNERS file broken down by project: https://github.com/DefinitelyTyped/DefinitelyTyped/pull/44417

A bot handles tagging and facilitates the review process https://github.com/DefinitelyTyped/DefinitelyTyped/pull/44444

Third-party contributors have written their own bots: https://github.com/DefinitelyTyped/DefinitelyTyped/pull/44096

And so on...

LouisStAmour commented 4 years ago

One more comment on the above, if multiple branches are maintained upstream, they introduce subfolders for older branches and again publish off master. So for example, D3 has https://github.com/DefinitelyTyped/DefinitelyTyped/tree/master/types/d3 with 5.7.x at the root level and subfolders for maintaining v3 and v4, currently 3.5.x and 4.13.x respectively.

Some things don’t semver, at which point they invent their own minor release number to upstream’s major release number. https://github.com/DefinitelyTyped/DefinitelyTyped/pull/40301

The point is that authors ideally will get the latest types available that are compatible with the major-minor version they’re using. And as I said earlier, I’d keep the upstream patch version too, on the off chance someone gets semver wrong upstream.

LouisStAmour commented 4 years ago

One last post for the night, the TS folks support external dependencies on other npm packages: https://github.com/DefinitelyTyped/DefinitelyTyped/issues/36575 and https://github.com/microsoft/types-publisher/blob/ac63cb9006c509ef56db4363651dde48dc956944/src/lib/definition-parser.ts#L260

Normally referencing other types looks like https://github.com/DefinitelyTyped/DefinitelyTyped/blob/5344bfc80508c53a23dae37b860fb0c905ff7b24/types/rx-jquery/index.d.ts#L7 and the result is published as https://www.npmjs.com/package/@types/rx-jquery with dependencies on other published types. Relying on specific versions might be out of scope as obviously you need to specify the same version for the published dependency type that your app is using. (Loading types for jQuery v3 won’t help you much if you’re still using v1...)

There’s a lot to dig into here, but the more I look, the more I’m convinced that TypeScript’s model makes sense. Besides the language itself, there are two main differences from the TypeScript situation. First, TS has a single compiler whose version matches both the language and the compiler release, whereas we have separate language and type checker release numbers. Second, the concept of “plugins” is entirely foreign to the TS compiler, so either we make plugin support optional (if available, a type stub can use it; if not, it has to provide an alternative or some kind of warning?) or we skip publishing any plugin-based stubs, assuming users will get them bundled with the plugin. Either way, we will want to support multiple versions of syntax/checker, so publishing one package that supports multiple checkers, with or without plugins somehow included, would be ideal. That way, end users install the upstream module, and only if it doesn’t have types would they then install one of our packages, and ideally that’s it. Our package could theoretically include the plugin, if the security model of the checker allows it. That said, I’ve no experience with checker plugins, so I might be completely wrong here; I’m just talking end-user UX for now.

LouisStAmour commented 4 years ago

I've also detailed some of my thoughts on what Python could learn from TypeScript at https://gitter.im/python/typing?at=5eafad1a22f9c45c2a6a6e97

python / typeshed

Explore building third party stubs as packages #2491
