dgilland / pydash

The kitchen sink of Python utility libraries for doing "stuff" in a functional way. Based on the Lo-Dash Javascript library.
http://pydash.readthedocs.io
MIT License
1.28k stars 89 forks source link

Type chaining #182

Closed DeviousStoat closed 1 year ago

DeviousStoat commented 1 year ago

Here is my take on typing the chaining interface.

The idea is to make the chain class generic on the virtual current value (the one that would be computed at this stage) and generate a stub file for the chain class that has all the exposed functions of the library as methods. The self parameter is the chain class generic on the expected first parameter of the function and the return type is another chain class generic on the result of the function. This seems to work well.

Some possible improvement points:

coveralls commented 1 year ago

Coverage Status

Coverage: 100.0%. Remained the same when pulling a6dfe82736c6b44aded9d9dfc2317b7ca64303ff on DeviousStoat:type-chaining into e3d04c580cb86e6d189c8e1c587fe59eccf33c96 on dgilland:develop.

dgilland commented 1 year ago

Excellent work! I don't see any major issues in the code after a first-pass review.

To address your points:

The imports are hard coded in the generator, maybe we could try to dynamically fetch them. This would be less error prone. But we would also need to add all the type variables used. Maybe we could add all these in pydash.types and the modules would import from there. Then we would just have to import all from pydash.types in this script.

Where are the hard coded imports? I see a --imports parameter but I don't see where that's being passed in. Looks like import is an empty list otherwise?

Maybe add more typing tests or think of a clever way to test the stub file. Because it is cheating a bit, it uses Chain as self types but it is in a AllFuncs class which mypy doesn't like, hence the exclude.

Do you think the tests we have for chaining + all the other types has sufficient coverage? Wondering if there's enough confidence with what's there now or if something changes that breaks typing that it would get caught by the tests? The pydash API is pretty stable so I don't anticipate any major changes to argument types so seems like it's more of a concern of not taking other type usage into account.

Also maybe add some way to automate running the generator, maybe through CI, as this needs to be updated on every typing change

Is there a way to assert/check that the generate types are out of sync? If there's a way to include that as a test, then that would be preferable. Thinking something similar to the flake-black plugin that can check whether black needs to be run or not. So if there's not an easy way to statically check, then maybe something added to tasks:lint that could run the generator but output into a temp file and then perform a diff on the one in the repo. Open to other suggestions, though.

DeviousStoat commented 1 year ago

Where are the hard coded imports? I see a --imports parameter but I don't see where that's being passed in. Looks like import is an empty list otherwise?

I mean all the imports in the BASE_MODULE here. All the imports needed to make typing work in the stub file. The --imports parameter was used for testing. But it is not used any more. I removed it.

Do you think the tests we have for chaining + all the other types has sufficient coverage? Wondering if there's enough confidence with what's there now or if something changes that breaks typing that it would get caught by the tests? The pydash API is pretty stable so I don't anticipate any major changes to argument types so seems like it's more of a concern of not taking other type usage into account.

Since chaining simply reuses the functions' types, I didn't add too many typing checks for chaining, just enough to check that chaining methods works. But if the functions' types are right, chaining should be typed well too. However I don't think we have all typing paths checked in the typing tests. They could be added later if we find some typing problems to validate a fix. I think they should be enough to validate a change for the main use cases. To be honest even just running mypy should be able to catch most problems with argument types changes.

I removed the stub file from the excluded list of mypy, just ignored the misc error code. It should catch most problems in the generated stub, already made a few fixes from it.

Is there a way to assert/check that the generate types are out of sync? If there's a way to include that as a test, then that would be preferable. Thinking something similar to the flake-black plugin that can check whether black needs to be run or not. So if there's not an easy way to statically check, then maybe something added to tasks:lint that could run the generator but output into a temp file and then perform a diff on the one in the repo. Open to other suggestions, though.

I think your proposed solution is the easiest and it would work very well so I added it. A new chaining_type_update_required linting function which does what you said. Unfortunately I have to check the python version and only run it on python >= 3.9 because before that the ast module doesn't have the unparse function which is crucial for the stub file generation and since tox runs the lint from 3.7, we need to enable it only then. I don't think it really matters since the generation is meant to be run on the developer's machine beforehand and the stub file generated is itself compatible with python >= 3.7 which is checked by tox.