EntilZha / PyFunctional

Python library for creating data pipelines with chain functional programming
http://pyfunctional.pedro.ai
MIT License
2.41k stars 132 forks source link

add Typing for Stream and Sequence #162

Open weditor opened 3 years ago

weditor commented 3 years ago

add Typing for Stream and Sequence

weditor commented 3 years ago

about pylint errors I run travis.sh locally, found:

  1. coverage declined , because of some overload function.
  2. some pylint error, are all false positives.

what should I do next? waiting for maintainers to modify coverage and pylint rules, or do it myself, or do something else?

about typing hint

most typings works well with mypy and vscode( Pylanse as LSP),but not PyCharm。

in vscode, when I write ret = seq(["a", "b"]).map(lambda x: x*2) , vscode will know the map expect a function of str -> ...,then it treat lambda x: x*2 as str -> str ,then it will know ret is Sequence[str]

but in Pycharm, it will ignore the context, it can only see the lambda itself, then treat it as Any -> Any , so, there is no code intelligence when you write lambda, and cannot guess the type of ret 。The same problem will also appear in builtin map and filter

A possible solution for pycharm is to use cast: seq(['a', 'b']).map(cast(Callable[[str], str], lambda x: x*2)), it will help pycharm guess the return type, but still have no code intelligence in lambda itself.

There will be no such problems when using VScode(pylance),it's smart enough.

But there will be some problems when using mypy, that's all because of _wrap function in pipeline.py, in general, _wrap(iterable) will return Sequence , but not when _wrap(set)_wrap(dict) 。I don‘t want to simply use Union[T, Sequence], but , unfortunatly,I can't find a way to write its precise type declaration,finally, I use overload, just like:

@overload
def _wrap(v: set) -> set: ...
@overload
def _wrap(v: Iterable[T]) -> Sequence[T]: ... 

but, set and Iterable overlaped, they cannot overload, it's an error in mypy.

I don't know if those are what we wanted.

EntilZha commented 3 years ago

I commented on the other issue, its fine if coverage goes down as long as its reasonable. All the other checks should pass and it would be a good idea to add a mypy checker on unit tests to test the types themselves.

I can take a closer look at the pylint errors later today or tomorrow, its possible to disable these on a line by line basis, there should be examples in the pylint documentation as well as some scattered throughout the code. As long as its disabled on a line/function for a good reason, I'm fine with it in general.

For some of the TypeVars, it would be great to give these more readable or at least traditional names (e.g., things like A or B in https://www.scala-lang.org/api/2.12.9/scala/collection/Iterable.html).

I'll take a closer look at the other issues when I get some time. Thanks for the quick/great work!

weditor commented 3 years ago

@EntilZha do you mean _T_co = TypeVar("_T_co", covariant=True) ? The underscore at the beginning is to prevent others from importing. I see many stub files named their TypeVars like this.

eg: https://github.com/python/typeshed/search?p=1&q=typevar

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

codecov[bot] commented 3 years ago

Codecov Report

Merging #162 (b8f555a) into master (d757a90) will increase coverage by 0.13%. The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #162      +/-   ##
==========================================
+ Coverage   97.98%   98.11%   +0.13%     
==========================================
  Files          12       12              
  Lines        2237     2393     +156     
==========================================
+ Hits         2192     2348     +156     
  Misses         45       45              
Impacted Files Coverage Δ
functional/pipeline.py 98.57% <100.00%> (+0.49%) :arrow_up:
functional/streams.py 97.19% <100.00%> (+0.32%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update d757a90...b8f555a. Read the comment docs.

gbrennon commented 3 years ago

hey folks, how can I help u to merge this? i would love type hints on pyfunctional

EntilZha commented 3 years ago

It would be great to have a type checker included to test the types themselves. I’m not sure how, but having a test to check types of a basic pipeline would be great. I’ll take a second look at the names today as well.

The merge conflict should be easy to resolve, but would also need to be resolved.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

bmsan commented 2 years ago

@EntilZha regarding

It would be great to have a type checker included to test the types themselves. I’m not sure how, but having a test to check types of a basic pipeline would be great.

There is a package that does exactly that: https://github.com/agronholm/typeguard You can find some usage examples here: https://typeguard.readthedocs.io/en/latest/userguide.html

I am not sure how "heavy" the typeguard package is in terms of installation and also I am not aware of the runtime overhead induced, but it could be a nice addon.

I saw that some pip packages have flavors depending on what your are looking for. You could have pyfunctional that comes without typeguard and pyfunctional[typeguard] which has it enabled.

EntilZha commented 2 years ago

@bmsan I was thinking for type checking being part of the test/lint pipeline, not part of the code that the library itself runs. Basically looking to have a way to run automated tests to make sure that (1) the types added are correct and (2) if functions change, that it forces updated type annotations

bmsan commented 2 years ago

@EntilZha got it. Thanks! For (1) I think mypy could be used for this purpose. It does static type checking based on typing hints.

antonkulaga commented 2 years ago

I personally run beartype, it does good job of checking types in the runtime. Maybe you can just use beartype-d functions in the test and check if they crash? I am really looking forward for generics like seq[T]

EntilZha commented 2 years ago

Hi @antonkulaga @bmsan (and others). Just to avoid any miscommunication, I'd merge this PR (or a modification of it) if it added an automatic linting step to make sure that the current types work correctly and to ensure that future PRs don't accidentally break typing. Bonus points for attaching a screenshot of VS Code working correctly :).

I don't personally have time to add this and test it, but am happy to review/merge a PR that adds it. Do you think @weditor you could add mypy for static type checking as a lint step? Feel free to explore beartype for checking types during unit tests. It's not necessary, but could be nice.

If you happen to be busy, perhaps @antonkulaga or @bmsan could make a parallel PR that adds mypy or beartype to the lint step (without adding new type hints), and then it would be easy for me to rebase this PR on top of that and check it.

Thanks all for the input/comments/discussion and sorry for being a bit slow on this.

Drvanon commented 1 year ago

I created a fork (robindorstijn/PyFunctional) with a minimal commit to the travis configuration that should run mypy. Mypy does static analysis of types which confirms proper use of types.

EntilZha commented 1 year ago

@Drvanon You should be able to make a PR from that it trigger builds here I think?