caleb531 / automata

A Python library for simulating finite automata, pushdown automata, and Turing machines
https://caleb531.github.io/automata/
MIT License
339 stars 64 forks source link

[WIP] Type hints #131

Closed eliotwrobson closed 1 year ago

eliotwrobson commented 1 year ago

Work in progress branch to introduce type hints. Should probably wait until https://github.com/caleb531/automata/pull/121 and https://github.com/caleb531/automata/pull/129 get merged before doing the rest, since these will alter the type signatures that get used.

Also, need to add a mypy step to the CI workflow to check these annotations. Part of this PR (or at least the push to include type annotations) should be to run black (or similar autoformatter) over the library. The function signatures with the annotations get pretty messy and it's annoying to have to fix these by hand.

coveralls commented 1 year ago

Coverage Status

Coverage: 100.0%. Remained the same when pulling ae04a6345f5137a3c072f247fe088f1d61831f90 on eliotwrobson:type-hints into bbfe4fbd47409d6d4e431382fcb4c8ed5f4ce5cd on caleb531:develop.

caleb531 commented 1 year ago

@eliotwrobson A few thoughts and questions:

New merge conflict

It looks like the recent merge (from maybe #121) introduced a merge conflict in the nfa.py module. Just wanted to let you know

Type name conventions

Is suffixing type variables with T an appropriate convention for custom Python types? I do recognize that some languages (like C#) have conventions like the I prefix for interface names, but wasn't sure what is the case with Python types.

I mean, a T suffix would certainly make sense for custom types, because (for example) NFAStateT would look too much like an actual class.

Type annotations for everything else?

Your PR looks great so far, and thanks for adding type annotations to the PDAConfiguration and PDAStack classes. But if we're gonna embrace types in this library, every class/module should have type annotations.

How easy would it be to add types to the remaining DPDA, NPDA, DTM, NTM, and MNTM classes? It's perfectly okay to say "that's too much work / out of scope for me" or something—I can do them at the end of the day.

I guess I just ask because I've barely worked with types in Python and don't know that side of the stdlib very well, so you would probably be a lot faster than me at adding those annotations. But at the end of the day, I don't want to pressure you to do more than you're prefer to do.

Other things

I left you miscellaneous other comments in the PR, mostly small observations or questions I had around design desisions. Please review when you have time and let me know your thoughts!

eliotwrobson commented 1 year ago

@caleb531

Is suffixing type variables with T an appropriate convention for custom Python types? I do recognize that some languages (like C#) have conventions like the I prefix for interface names, but wasn't sure what is the case with Python types.

I'm actually not sure, from the other type annotations I've seen, there's not really a standard convention, I just do this sometimes to make it easier to tell what's a type and what's a class. But it's no trouble to change to/from this convention depending on what you prefer. Maybe there's a convention documented somewhere?

Your PR looks great so far, and thanks for adding type annotations to the PDAConfiguration and PDAStack classes. But if we're gonna embrace types in this library, every class/module should have type annotations.

Definitely agree, I only stopped at the NFA/DFA because I ran out of time the day I was doing this but wanted to put up the work in progress as soon as possible.

How easy would it be to add types to the remaining DPDA, NPDA, DTM, NTM, and MNTM classes? It's perfectly okay to say "that's too much work / out of scope for me" or something—I can do them at the end of the day.

Actually not hard, the most difficult files to annotate are the NFA/DFA because these are the longest.

I guess I just ask because I've barely worked with types in Python and don't know that side of the stdlib very well, so you would probably be a lot faster than me at adding those annotations. But at the end of the day, I don't want to pressure you to do more than you're prefer to do.

I'm happy to! (and intend on doing so as part of this PR before merging). But if you'd like to take a stab at it, you're certainly welcome to. If you want to work with types some in Python, I actually think adding annotations to a library like this is a good way to learn.

EDIT: Actually, I would encourage you to take a stab at some of this at least, since you'll need to set up typechecking as part of the CI job, and I think having some familiarity with the internals would be valuable 😃

caleb531 commented 1 year ago

@eliotwrobson Just a heads up—force-pushed to your branch a correction to the import order for the Self fix. Just wanted to let you know in case you already pulled my original (now lost-in-space) commit.

eliotwrobson commented 1 year ago

@caleb531 are you able to run mypy . locally without any errors being thrown? It seems to have issues with the Self type, but I think this might just be my machine for some reason.

caleb531 commented 1 year ago

are you able to run mypy . locally without any errors being thrown? It seems to have issues with the Self type, but I think this might just be my machine for some reason.

@eliotwrobson No issues on my end, apparently (using mypy 1.1.1).

> mypy --version
mypy 1.1.1 (compiled: yes)
Screenshot 2023-03-30 at 7 16 53 PM
eliotwrobson commented 1 year ago

@caleb531 ahhh, I think I have a really old mypy version, that must be the issue.

Forgot to mention in a previous comment: The next item related to this PR for me is going to be adding annotations to the regex library. Those are a little on the sophisticated side and might take me a while, so if you want to try your hand at annotating the other classes (TM, DTM, etc.) you're welcome to commit directly to my branch. I think that things will go much faster that way 😃

caleb531 commented 1 year ago

@eliotwrobson Yes, I'd be down for trying annotations for those other classes! It's starting to sound kinda fun, actually lol.

My only ask is that you fix the conflict however you see fit. I noticed your develop branch is very out-of-date with mine, so not sure if you want to merge into your feature branch, rebase, or what.

eliotwrobson commented 1 year ago

@caleb531 Will go ahead and rebase, should be pretty simple.

With respect to removing flake8-isort, here's a guide on how to set up black with other tools: https://black.readthedocs.io/en/stable/guides/using_black_with_other_tools.html

Basically, you need to set some minor configuration options, but flake8 plays really nice with black if you set up the plugin.

EDIT: Rebased, should be good to go!

caleb531 commented 1 year ago

@eliotwrobson A change I'm thinking of making in this PR, for better abstraction: instead of using the type str everywhere to represent the type of automaton symbols, let's assign that to a variable and use that variable everywhere. e.g.:

AutomatonSymbolT = str

This would make the vast majority of symbol-related type annotations a bit more agnostic to the specific type of symbols currently supported. So in the future, if we support another type of symbol, it's easy to extend all existing type annotations with a one-line change. I would also be happy to make this change, as long as I'm not stepping on your toes. Thoughts?

eliotwrobson commented 1 year ago

@caleb531 sounds like a great idea! I'm busy for the rest of today at least so feel free to go right ahead.

caleb531 commented 1 year ago

@eliotwrobson On second thought, not going to change the type for now because it's gonna get confusing/tricky to define higher-level semantics for everything represented by str in the library. For example, would the type for input_str change? Would we also be creating a new type to represent a word (e.g. AutomatonWordT)?

And then what about things like the regex parser? Looking through the parser code, I see many places where str is used as a type (as you've committed recently), but I'm not sure I can tell when str refers to a word vs. a symbol.

caleb531 commented 1 year ago

@eliotwrobson Also, I've finished merging in my formatting changes into your branch. I opted to do a merge instead of a rebase because it made the conflict resolution a little easier, but I tried my best to preserve all the type annotations you've added so far.

EDIT: Also, just a heads up that I've re-added the GitHub Action for linting. so the CI will complain at you if you push up anything that doesn't comply with flake8/black (although if you use VS Code, the black auto-formatting on save should be active again).

eliotwrobson commented 1 year ago

@caleb531 sounds good! I think it's fine to wait until we get a first round of type annotations before doing something more intricate.

eliotwrobson commented 1 year ago

@caleb531 now that the annotations are getting more complete, I think it's a good opportunity to start setting up type checking tests. I'm not sure how this is typically done for packages (not many packages actually have inline type hints), but I think just running mypy before running tests should be enough on CI.

To make the type stubs visible externally, I think we need to add a file as in this stack overflow post: https://stackoverflow.com/questions/62979389/is-there-a-best-practice-to-make-a-package-pep-561-compliant

caleb531 commented 1 year ago

@eliotwrobson Hm, I was looking at flake8-mypy as a potential option (so that we can have all static code analysis managed through flake8), but it seems there may be some version compatibility issues (or may not be, one of us just needs to take a closer look). But if mypy needs to be run separately, I would prefer it be part of the lint action, rather than tests.

I have a high-priority bug that I need to work on fixing for one of my other projects, so I don't have time to look into this at the moment. But let me know your thoughts, and I'll be happy to review anything you contribute further!

eliotwrobson commented 1 year ago

@caleb531 no worries! At some point I'll try and take a look at some other libraries with integrated types and see how they're configured. Though It looks like flake8-mypy hasn't been maintained for some time. I'm not exactly sure what the preferred configuration is for modern projects.

eliotwrobson commented 1 year ago

@caleb531 in the latest push, I've added a py.typed file as described here: https://blog.whtsky.me/tech/2021/dont-forget-py.typed-for-your-typed-python-package/

I didn't add type annotations to test cases, but I was wondering what else you'd like to see done before this gets merged. The changes here are pretty extensive, so I think it might make sense to do follow-up typing in the test files so that other branches don't get too out of sync with this one.

caleb531 commented 1 year ago

@eliotwrobson I agree that I don't want this PR to sit for too long, especially given the other PRs we want to wrap up. I think we are in a place where we can merge this one safely, and then follow up with additional type PRs (e.g. for better mypy integration, types in tests, etc.). But as I see it, because mypy isn't currently complaining about the lack of types in tests, I think it's fine to prepare this PR for final review / merge.

In other words, I don't have anything else for this PR. I am honestly ready for a final review / merge at this point.

eliotwrobson commented 1 year ago

@caleb531 agreed. Since this is going into the develop branch I think it's fine to table some of the discussion items since they were pretty minor anyway. My only major item was a few missing type annotations but that can be handled in a followup.