pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

Typing Stubs and PEP 561 compatibility #28142

Closed simonjayhawkins closed 1 year ago

simonjayhawkins commented 5 years ago

xref https://github.com/pandas-dev/pandas/pull/28135#issuecomment-524659775

do we want to make pandas PEP 561 compatible?

https://mypy.readthedocs.io/en/latest/installed_packages.html#making-pep-561-compatible-packages
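For reference, making a package PEP 561-compatible mostly amounts to shipping an empty `py.typed` marker file inside the package and declaring it as package data. A minimal, hypothetical `setup.py` sketch (the package name is a placeholder):

```python
# setup.py (sketch) -- "mypkg" is a placeholder, not a real package
from setuptools import setup

setup(
    name="mypkg",
    version="0.1",
    packages=["mypkg"],
    # ship the empty py.typed marker (and any .pyi stubs) with the package
    package_data={"mypkg": ["py.typed", "*.pyi"]},
    # PEP 561 requires type info to be accessible, so don't zip the install
    zip_safe=False,
)
```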

jbrockmendel commented 5 years ago

If we go down that road, we could consider using stub files in Cython to make that code valid Python (which would allow us to lint)

zero323 commented 4 years ago

@WillAyd @jbrockmendel @simonjayhawkins How would you describe the current state of things? I've seen that many core components already have pretty good API coverage. Is it realistic to expect this to happen in the foreseeable future (let's say within the next release or two)?

To explain the context of this question: for the last three years I've been working on stub files for Apache Spark. Over this time the contact surface between Pandas and PySpark has grown significantly, mostly due to the introduction and active development of so-called Pandas UDFs. Since Pandas doesn't advertise its annotations, this effectively creates a growing gap, in practice not covered by type checkers. Additionally, the latest upstream developments utilize type hints in Pandas-dependent components, leading to conflicts between static type checking and upstream runtime requirements.

Furthermore, the lack of actionable annotations leads to rather ugly escalations in the case of polymorphic functions which accept Pandas objects as well as other types.

Now... For some time, to partially address the problem, I've been using Protocols and dummy compatibility imports. The idea is basically to:

This approach is not without its own problems, but it does the trick. If Pandas goes PEP 561, these will become obsolete, but if such a move is not going to happen any time soon, I will consider formalizing this approach and agitating for the required adjustments in core PySpark. However, given the amount of red tape, I'd really like to avoid it :)
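A minimal sketch of the Protocol-based workaround (all names hypothetical): downstream code annotates against a structural type rather than `pd.DataFrame`, so it type-checks even while pandas exposes no annotations.

```python
from typing import Any, Protocol, runtime_checkable

@runtime_checkable
class FrameLike(Protocol):
    """Structural stand-in for the small pandas surface a library touches."""
    def to_dict(self) -> Any: ...

def column_names(frame: FrameLike) -> list:
    # accepts a real pd.DataFrame or anything else with the same surface
    return list(frame.to_dict())

class FakeFrame:
    def to_dict(self):
        return {"a": [1], "b": [2]}

assert isinstance(FakeFrame(), FrameLike)  # runtime structural check
print(column_names(FakeFrame()))  # ['a', 'b']
```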

WillAyd commented 4 years ago

@zero323 are you asking if we plan on distributing stub files? If so I would say no for now; we have been annotating inline. Not fully complete but you'll see issues like #26766 used to track that progress

zero323 commented 4 years ago

Thank you for the answer @WillAyd.

I didn't mean stubs explicitly, but annotations that are visible to external checkers (which tend to assume no annotations if no stub packages or py.typed files are present).

For me it is basically a decision between waiting a bit longer (if these are expected to be ready soon; from that perspective, even dynamic annotations are good enough) or adding elaborate workarounds.

mfcabrera commented 4 years ago

As an end user I am also having the issue of annotating code with, say, pd.DataFrame and mypy interpreting it as Any.

I am a bit confused here. In order to distribute the annotations, isn't it enough to add a py.typed file and modify setup.py? That way mypy will be able to use the inline annotated code, right? Or is there something I am missing?

Update: Sorry I did not see https://github.com/pandas-dev/pandas/pull/28831

However, I am still a bit confused about what is needed to make this happen (if it is happening).

zero323 commented 4 years ago

I am a bit confused here. In order to distribute the annotations, isn't it enough to add a py.typed file and modify setup.py?

There is a bit more to it. In general, it is good to have annotations that pass type checks with standard-ish settings; otherwise you're likely to break things downstream.

OliverSieweke commented 4 years ago

Related issue: #12412

Dr-Irv commented 3 years ago

Need to keep an eye on Microsoft's efforts here: https://github.com/microsoft/python-type-stubs/tree/main/pandas

gramster commented 3 years ago

Regarding our (Microsoft) stubs:

* I believe that if the stub files exist in the package distribution alongside the Python files, then we can use them, and they will take priority over the .py files (@msfterictraut could confirm). So that would be one way we could handle an incremental move to inline types. But it would require package authors to buy into that approach.

erictraut commented 3 years ago

@gramster, you're correct that stub files that are packaged alongside the Python source files will take precedence when used by a type checker like pyright or mypy. This is the behavior dictated by PEP 561.

Until recently, library authors had no way to verify the completeness of their type information within a "py.typed" package. We recently added a mode to pyright that analyzes a package and reports missing or incorrect type information. For more details, see this documentation.

simonjayhawkins commented 3 years ago

also need to keep an eye on https://github.com/cython/cython/pull/3818

jorisvandenbossche commented 3 years ago

@gramster or @erictraut could you give some background on the "content" of the Microsoft stubs? (since we already have type annotations in pandas as well) Are they much more elaborate than the type annotations we have in pandas? Or did you copy the type annotations for functions in pandas that are already typed, and add stubs for things that were not yet typed in pandas? (to get a bit of an idea how divergent they are, or how easily/difficult they could be combined somehow)

gramster commented 3 years ago

We added a lot more annotations, and because these are stub files, we included annotations for a lot of APIs implemented in Cython.

TomAugspurger commented 3 years ago

Thanks to all the folks who joined the dev call on Wednesday. Here's a recap of (my understanding of) the current status, and a tentative proposal for how to proceed. More notes at https://docs.google.com/document/d/1tGbTiYORHiSPgVMXawiweGJlBw5dOkVJLY-licoBmBU/edit?usp=sharing.

https://github.com/microsoft/python-type-stubs/tree/main/pandas contains type stub (.pyi) files for pandas. Microsoft would like to upstream them to pandas:

There weren't any objections from the pandas side to taking on maintenance of the stubs. Most of the discussion was around logistics of actually upstreaming them.

pandas is already using inline types wherever possible. I think there's broad agreement that we want to continue that wherever possible (i.e. everywhere but extension modules). However, the mere presence of a .pyi file means that (at least by default) the inline types will be ignored by type checkers for that module. So we don't want to just dump the contents of that repo into the pandas package. That would essentially nullify our inline types which we've been building up over the past year or two.

My suggestion is to manually go through, file-by-file, and migrate the type stub files to inline types. This will require some effort by maintainers who are experienced with typing. Especially early on, there might be some helpers that need to be migrated first before later files can integrate them. My hope is that once we've ironed out the process, we can engage pandas' pool of volunteer contributors to do the long tail of files.

Finally, we'll need to coordinate with the microsoft/python-type-stubs team once we've upstreamed individual files. Perhaps a checklist on a GitHub issue, or a google spreadsheet. Once a module has been upstreamed to pandas, we'd like to avoid changing it (just) in microsoft/python-type-stubs.

One thing I'm not sure of: what should happen once pandas has completely upstreamed all the files? We'd like the inline types to take precedence for code-completion tools in editors, so they should probably be removed from microsoft/python-type-stubs. But the pyright maintainers mentioned some ancillary benefits to .pyi files: they can also be statically parsed for docstrings (which isn't possible for things like read_csv that are dynamically generated). And they're often faster to parse since they're just the types / docstrings, not the code. Perhaps we can regularly regenerate the stubs in microsoft/python-type-stubs from the types packaged with pandas?

topper-123 commented 3 years ago

One small point is that read_csv is no longer dynamically created. read_table is also a normal function.

I don't know if that changes the position of the pyright maintainers, or if this is a more general problem...

gramster commented 3 years ago

I think for us, we will either come up with an alternate solution to the docstring problem or regenerate full stubs from the pandas source and ship those with docstrings added. We’d obviously prefer that docstrings not be dynamically created but we have workarounds if they are.

Dr-Irv commented 3 years ago

Let's make sure that we track the two use cases of typing:

  1. Typing for pandas developers - making sure that our typing of internals (and maybe externals) of pandas code works with mypy as is tested in the CI today
  2. Typing for pandas users - providing type hints for pandas users

For (2) we need to figure out a way to test that we have properly typed the public API. The work that @gramster has done provides the typing for much of the public API, and that's what we want to integrate in. I still don't know how we create a set of tests for CI that help us verify that the typing of the public API is correct.
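One low-tech sketch of such a test, checking only that a public method is fully annotated (not that the annotations are correct); the class here is a hypothetical stand-in, not pandas code:

```python
import typing

class Frame:  # hypothetical stand-in for a public pandas class
    def head(self, n: int = 5) -> "Frame":
        return self

def test_head_is_fully_annotated():
    hints = typing.get_type_hints(Frame.head)
    # every parameter and the return value should carry an annotation
    assert set(hints) == {"n", "return"}

test_head_is_fully_annotated()
```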

One thing we could consider doing now is to add a py.typed file at the top of the pandas distribution, which would mean that type checkers like mypy would look inside the pandas code to check user code. (I verified this with a simple test)

@gramster if we were to do that, how would pyright handle type checking for users with the presence of the type hints in the pandas distribution (with the py.typed file present), plus the type hints that you are shipping with pyright? Would the latter just be ignored because of the presence of the py.typed file in the pandas distribution? Or if a hint wasn't found in the pandas distribution, then the stubs shipped with pyright would be checked?

TomAugspurger commented 3 years ago

For (2) we need to figure out a way to test that we have properly typed the public API. The work that @gramster has done provides the typing for much of the public API, and that's what we want to integrate in. I still don't know how we create a set of tests for CI that help us verify that the typing of the public API is correct.

I'm not sure I understand the need for this distinction. From the point-of-view of mypy, why would anything in the public API be different? I think that either way we get a set of internally consistent types.

We do still have the problem of declaring an incorrect type (which IIUC are the types of bugs you reported to that repo). For that, I think we just have to rely on

  1. As we approach universal coverage (perhaps including in unit tests), those incorrect types may be exposed as internally inconsistent.
  2. User reports.
erictraut commented 3 years ago

I'm one of the primary authors and maintainers of pyright, and I work with Graham. Sorry I couldn't attend the recent meeting. I'm really excited about the discussion and the progress!

@TomAugspurger, to answer your question about what happens once pandas has completely inlined all of the type information from the stubs. I'm not worried about parsing speeds. Pyright can parse a file very quickly and create an in-memory AST. It performs all type evaluations lazily, using only the parts of the AST that are required to resolve a given type. So I'm not worried about the performance implications of moving to inlined types. (It is much more costly to infer return types when functions are not fully annotated, but if you fully annotate the public interface surface, that's not an issue.)

Dynamically-generated functions are a challenge, but we can work around those if necessary (e.g. conditionally import a stub or statically declare a symbol under a TYPE_CHECKING test).
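The TYPE_CHECKING workaround mentioned here can be sketched as follows (the factory and names are hypothetical, not pandas code):

```python
from typing import TYPE_CHECKING

def _make_reader(fmt):
    # stand-in for a dynamically generated function
    def reader(path):
        return f"read {path} as {fmt}"
    return reader

if TYPE_CHECKING:
    # static declaration: only type checkers evaluate this branch
    def read_data(path: str) -> str: ...
else:
    # runtime: the function is generated dynamically
    read_data = _make_reader("data")

print(read_data("x.csv"))  # read x.csv as data
```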

@Dr-Irv, If you add a py.typed file, Pyright will still prefer stubs if they are present. The source would be ignored in this case. However, consider that py.typed generally implies that you have accurate and complete types within the package. You may want to wait until you've finished integrating the stub information before you do that.

I recently added a mode to the CLI version of Pyright that reports the "type completeness" of a package's public interface surface. It identifies all public symbols, evaluates their types, and reports which ones are missing type information (e.g. a parameter or a return type without an annotation). Several library authors have started to use this tool. It could be incorporated into a CI pipeline. For more details, see this documentation. I welcome any feedback and am willing to make changes and improvements based on your suggestions.

Dr-Irv commented 3 years ago

For (2) we need to figure out a way to test that we have properly typed the public API. The work that @gramster has done provides the typing for much of the public API, and that's what we want to integrate in. I still don't know how we create a set of tests for CI that help us verify that the typing of the public API is correct.

I'm not sure I understand the need for this distinction. From the point-of-view of mypy, why would anything in the public API be different? I think that either way we get a set of internally consistent types.

Let's take a method (property) like DataFrame.axes. Right now, we have a type declaration for that in the source. But if there is no internal pandas code that calls DataFrame.axes, then mypy wouldn't notice if we had the declaration wrong. Now, if we ran mypy on our testing directory, then we might pick it up, but I think we avoid doing that now in the CI (please correct me if I'm wrong).

erictraut commented 3 years ago

I'll also point out that mypy will not analyze any code that is not annotated. That means it's blind to a lot of errors in unannotated code or code that calls unannotated code. Pyright has a bunch of sophisticated type inference logic and will analyze all code the best it can with the types it can infer, so you may get better results using it over mypy (or in addition to mypy) for your internal verification.

Dr-Irv commented 3 years ago

@erictraut wrote:

I recently added a mode to the CLI version of Pyright that reports the "type completeness" of a package's public interface surface. It identifies all public symbols, evaluates their types, and reports which ones are missing type information (e.g. a parameter or a return type without an annotation). Several library authors have started to use this tool. It could be incorporated into a CI pipeline. For more details, see this documentation. I welcome any feedback and am willing to make changes and improvements based on your suggestions.

Just to see what happened, I took the released version of pandas 1.2, and just put a py.typed file in the pandas distribution and tried this out. Lots of things are reported, but there is a bug:

  error: An internal error occurred while verifying types: "RangeError: Maximum call stack size exceeded

followed by a call stack. I tried starting node with --stack-size=16000 and then obtained no output and this message:

Emptying type cache to avoid heap overflow. Heap size used: 1553MB

Having said that, the report that was printed before the internal error reveals that there are a number of classes/methods that pyright inferred as being part of the public API that are not, for example:

error: Type partially unknown for class "pandas.core.algorithms.SelectNFrame"

It tried to do verification on subpackages of pandas.core and pandas.compat, among others. So we do need a way to distinguish the public API from the internal one. I know this is difficult for pyright to figure out, but suggestions from you on how to modify the pandas code to make it easier would be worth discussing; alternatively, a way to indicate which packages to check with the --verifytypes option would be useful.

erictraut commented 3 years ago

We've established deterministic rules for distinguishing public from private symbols. These have already been discussed in the typing-sig, and we have general agreement on them, although they haven't yet been included in a ratified PEP. They are all included in the document I linked to above. As I said, these rules should be unambiguous and deterministic. If you have any questions or suggestions, let me know.

I can take a look at the internal error. It's quite possible that it's struggling to deal with really deep call chains in areas of the code that have no return type annotations, which requires it to do a bunch of work to try to infer return types based on the code.

erictraut commented 3 years ago

I just tried to repro the problem you're seeing with an internal error. I installed the release version of pandas 1.2 and added a py.typed file, like you did. I'm not able to repro the internal error you're seeing. It completed fine for me. Here are the results:

Public symbols: 21152
  Symbols with unknown type: 19761
  Functions with missing docstring: 14553
  Functions with missing default param: 3
  Classes with missing docstring: 1208
Type completeness score: 6.6%

Completed in 33.121sec

Any thoughts on what I might be doing different from you?

Dr-Irv commented 3 years ago

Any thoughts on what I might be doing different from you?

Maybe our versions of node, python and pyright are different? I'm also doing this on Windows

Here's my version list:

Also to run it, I had to do a cd c:\Anaconda3\envs\base38\Lib\site-packages so it would find pandas.

erictraut commented 3 years ago

Your configuration looks similar to mine. The only difference is that I'm running on MacOS, using a virtual environment, and using Python 3.9. I doubt if any of those would make a difference though.

It looks like a big portion of the untyped symbols are in the "pandas.tests" module and its submodules. Is it intended that the pandas tests are distributed with the released package? Do you intend for consumers of pandas to directly import pandas.tests.*?

If I manually delete the tests directory, the type completeness report looks much better (and runs much faster).

Public symbols: 4996
  Symbols with unknown type: 3841
  Functions with missing docstring: 721
  Functions with missing default param: 3
  Classes with missing docstring: 103
Type completeness score: 23.1%

Completed in 3.919sec
jakebailey commented 3 years ago

@Dr-Irv I think we've seen issues in Pylance (your pytest one) where we couldn't reproduce something that you could; maybe there's some problem with using conda (where we were just using pip). I thought that --verifytypes would work regardless of the directory you're in, as it accepts a module path and is supposed to go through search paths to resolve the import (as it would do in user code).

Dr-Irv commented 3 years ago

It looks like a big portion of the untyped symbols are in the "pandas.tests" module and its submodules. Is it intended that the pandas tests are distributed with the released package? Do you intend for consumers of pandas to directly import pandas.tests.*?

No, and we have an issue about that: https://github.com/pandas-dev/pandas/issues/30741

So, I just did mv tests _tests and then it ran fine, but my statistics are different:

Public symbols: 317
  Symbols with unknown type: 277
  Functions with missing docstring: 26
  Functions with missing default param: 0
  Classes with missing docstring: 4
Type completeness score: 12.6%

Any idea as to why I would only pick up 317 public symbols and you picked up 4996?

erictraut commented 3 years ago

As @jakebailey said, you shouldn't need to switch to the pandas directory to run the analysis. If you find that's required, that points to something wrong with your configuration. You should set up your Python environment (e.g. using a virtual environment) such that if you run "python3", you'll get the correct environment where pandas is installed. Pyright will use the current Python environment when it's invoked.

Here are the command-line switches I'm using: pyright --verifytypes pandas.

jakebailey commented 3 years ago

In VSC, the Python extension chooses the interpreter, so we don't have to pick a binary (versus the CLI), but I think both hint at some weird path oddities. Maybe some more verbose logs could list out the search paths being used. (I'd test conda, but arguably I shouldn't be "working" on a weekend... 🙂)

Dr-Irv commented 3 years ago

As @jakebailey said, you shouldn't need to switch to the pandas directory to run the analysis. If you find that's required, that points to something wrong with your configuration. You should set up your Python environment (e.g. using a virtual environment) such that if you run "python3", you'll get the correct environment where pandas is installed. Pyright will use the current Python environment when it's invoked.

OK, I figured it out. When I first ran it, I ran from the folder that contains the pandas source I use for development. When I ran from a different directory, everything was fine and picked up 4996 public symbols.

Dr-Irv commented 3 years ago

@erictraut I had an idea to use info from the way we generate docs to determine the public API that is missing typing info by matching the pyright report to a list we have for doc generation, but found that pyright --verifytypes is reporting errors on things that I don't think it should. Here's one example:

  error: Type partially unknown for variable "pandas.core.series.Series.axes"
    Type partially unknown for class "property"
      Type partially unknown for function "fget"
        Return type partially unknown
      Type partially unknown for function "__get__"
        Return type partially unknown

And here's the declaration for Series.axes :

    @property
    def axes(self) -> List[Index]:

(Note - if you think we should move the issues with pyright --verifytypes to an issue in pyright, let me know.)

The other issue in doing this matching is that our public API is Series.axes, because Series is imported from pandas.core.series in pandas/__init__.py, so pyright reports the error as being in pandas.core.series.Series.axes rather than pandas.Series.axes, and I don't see an easy way to manage that mapping.

erictraut commented 3 years ago

This does look like a potential bug in pyright's type verification logic. I'd appreciate it if you'd open an issue in the pyright repo. I'll take a look at it.

All types are reported using their fully-qualified names and not aliases re-exported from higher-level submodules. I could look at emitting all aliases in addition to the fully-qualified name. That could be tricky, but I'm willing to investigate if it would be useful to you.

If you're using the output of verifytypes in an automated manner, you might be interested in the fact that pyright also supports a JSON output (pyright --verifytypes pandas --outputjson). This gives you structured output that is easier to use in automated tooling.

Dr-Irv commented 3 years ago

@erictraut I've started fixing a few simple things, using --verifytypes with pyright to find them. But I'm stuck on how to handle arguments in pandas methods that come from numpy. numpy currently doesn't ship type stubs; in VS Code there are type stubs provided, but they are not in the https://github.com/microsoft/python-type-stubs repo. So how can I get the command-line version of pyright to find some version of the numpy type stubs?

erictraut commented 3 years ago

Yeah, numpy types will be a problem. In the long term, we'll want to get the numpy maintainers to do the same thing you're doing for pandas — to include inlined types and packaged stubs in the numpy distribution. According to their roadmap, numpy 1.20 will contain type annotations.

I see three options:

  1. Download and install numpy 1.20 from sources rather than using the currently-published 1.19.
  2. Manually install one of the open-sourced numpy stub packages. One that I'm familiar with is data-science-types, but this unfortunately conflates numpy and pandas stubs into the same stub package.
  3. Conditionally define "dummy" types for the few core numpy types used within the pandas code for now. For example, create a file called _dummy_numpy_types.pyi and conditionally import this within an if TYPE_CHECKING: statement so they are used only by type checkers.
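Option 3 might look roughly like this (the module and names are hypothetical); at runtime the dummy falls back to Any, so annotations stay importable:

```python
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
    # only type checkers evaluate this; the stub need not exist at runtime
    from _dummy_numpy_types import ndarray  # hypothetical .pyi module
else:
    ndarray = Any  # runtime fallback keeps the name defined

def first_row(arr: "ndarray") -> Any:
    # string annotation, so nothing is evaluated at definition time
    return arr[0]

print(first_row([[1, 2], [3, 4]]))  # [1, 2]
```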
erictraut commented 3 years ago

One way for you to make fast progress in some areas is to integrate the stub files for the _libs directory. Simply copy the .pyi files from the python-type-stubs repo.

TomAugspurger commented 3 years ago

cc @BvB93 since you've been working on NumPy type annotations (just for your awareness of https://github.com/pandas-dev/pandas/issues/28142#issuecomment-766128669; I think there's nothing to do right now).

@Dr-Irv What version of NumPy do you have in your environment? Can you try with NumPy master? Also some docs at https://numpy.org/devdocs/reference/typing.html that might be helpful.

Dr-Irv commented 3 years ago

@Dr-Irv What version of NumPy do you have in your environment? Can you try with NumPy master? Also some docs at https://numpy.org/devdocs/reference/typing.html that might be helpful.

I had numpy 1.19.5. I have now created an environment using numpy master and am checking pandas types using that.

Dr-Irv commented 3 years ago
  1. Download and install numpy 1.20 from sources rather than using the currently-published 1.19.

@erictraut Tried this, and now pyright --verifytypes on pandas is taking a long time (over 5 minutes). Will open an issue in the pyright repo for that.

Dr-Irv commented 3 years ago

My suggestion is to manually go through, file-by-file, and migrate the type stub files to inline types. This will require some effort by maintainers who are experienced with typing. Especially early on, there might be some helpers that need to be migrated first before later files can integrate them. My hope is that once we've ironed out the process, we can engage pandas' pool of volunteer contributors to do the long tail of files.

I'm not sure the file-by-file method is the way to go. The pyright tool can give us some guidance.

I've been using the pyright --verifytypes to analyze where we are with pandas. It helps to pinpoint what things to work on first, because it outputs a message of the form "error: Type partially unknown for..." for each missing thing, and if the same class/function is used from other pandas classes/functions, it will appear multiple times. So I ordered them and started hacking away.

In master, it reported 18,161 "Type partially unknown" messages. The simple PR I submitted at #39348 reduced that number to 16,654. I have made changes locally to fix ExtensionArray and ExtensionDtype, which lowers it to 15,199. I'll submit a PR for that once #39348 is merged in.

So this is at least a way to find the places that are missing things. The next one I plan to look at is Index.

jakebailey commented 3 years ago

Note that the differences between the stubs in our repo and the ones in pandas aren't just completeness (or lack of unknowns, which we still have); there's also extra coverage and a few more overloads, so checking verifytypes won't quite be the same as actually checking for differences, I don't think.

ryanpeach commented 3 years ago

Have we considered using PEP 561 to do type hinting for pandas dataframes? For instance a dataframe could be represented as Annotated[pd.DataFrame, Columns[('col1', float), ('col2', int), ...], Shape[N, 3]] or something like that?

jakebailey commented 3 years ago

I assume you mean PEP 593 (which is where Annotated comes from).

For the most part, Annotated doesn't really have a meaning to type checkers unless you make heavy use of plugins to escape the type system and do fancy stuff (which means locking into a specific type checker for the more interesting features).
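To illustrate: Annotated wraps a real type plus arbitrary metadata that standard type checkers ignore; only the underlying type (int below) is checked.

```python
from typing import Annotated, get_type_hints

# checkers treat Positive exactly like int; the string is opaque metadata
Positive = Annotated[int, "must be > 0"]

def clamp(x: Positive) -> int:
    return max(x, 0)

# the metadata is only visible via runtime introspection
hints = get_type_hints(clamp, include_extras=True)
print(hints["x"].__metadata__)  # ('must be > 0',)
```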

You may be interested in PEP 646, which is likely to be ratified soon and is available in pyright and (I think) pyre.

MichaelKarpe commented 3 years ago

Dear pandas team, @jreback, @TomAugspurger, is there a timeline for integrating typing stubs into pandas? Thanks

jbrockmendel commented 3 years ago

There is an ongoing effort to improve annotations, but no specific timelines.

MarcoGorelli commented 2 years ago

Reckon it's OK to add a py.typed file for 1.4, even though annotations aren't complete yet?

jakebailey commented 2 years ago

If you add py.typed, that means the pandas lib should be treated as type complete and the canonical source of type info, and type checkers will prefer it over any other installed stubs (including stubs shipped by Pylance and various other stub projects). If the typing isn't complete yet, it may be a worse experience without much of a workaround (the preference is defined in PEP 561).

I know there are a lot of types for pandas still in Microsoft's python-type-stubs repo; I'm not sure where the effort to merge those in here for parity currently stands.

zero323 commented 2 years ago

@jakebailey

If you add py.typed, that means that the pandas lib should be treated as type complete and the canonical source of info, and type checkers will prefer it over any other installed stubs

This doesn't seem right. Since you mentioned PEP 561: it states that first in the resolution order is

  1. Stubs or Python source manually put in the beginning of the path. Type checkers SHOULD provide this to allow the user complete control of which stubs to use, and to patch broken stubs/inline types from packages. In mypy the $MYPYPATH environment variable can be used for this.

and inline hints have the second-lowest precedence (with only typeshed being lower).

  4. Inline packages - if there is nothing overriding the installed package, and it opts into type checking, inline types SHOULD be used.

So with inline packages you can still override any type hint; you just cannot depend on typeshed.

This can be easily tested (corresponding code can be found here)
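A sketch of that kind of override (paths and signature hypothetical): build a local stub tree and point MYPYPATH at it, so the patched stub wins over the package's inline hints.

```python
# build a local stub tree that overrides an installed package's inline types
from pathlib import Path

stub_dir = Path("local_stubs/pandas")  # hypothetical layout
stub_dir.mkdir(parents=True, exist_ok=True)
# deliberately "wrong" signature, just to see the override take effect
(stub_dir / "__init__.pyi").write_text("def read_csv(path: str) -> int: ...\n")

# then run the checker with the stub tree on the path, e.g.:
#   MYPYPATH=local_stubs mypy script.py
print((stub_dir / "__init__.pyi").read_text().strip())
```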

As to that:

be treated as type complete

partial should apply to standard packages as well

Regular packages within namespace packages in stub-package distributions are considered complete unless a py.typed with partial\n is included.

Dr-Irv commented 2 years ago

Reckon it's OK to add a py.typed file for 1.4, even though annotations aren't complete yet?

I'd like to discuss the idea I sent out via email about maintaining the stubs as a pandas-stubs project that can be maintained independently of whatever we put in the pandas project. We could start by pulling over the Microsoft stubs.

MarcoGorelli commented 2 years ago

Let's discuss tomorrow - there'd need to be a way of making sure it doesn't go out-of-date. Also, there's already https://github.com/VirtusLab/pandas-stubs