sveinse opened this issue 1 year ago
I'm fine with and definitely welcome type hinting, especially if it means that structured type info lands in the documentation as well.
Regarding `None` as "unspecified" or "not supported" etc., I think that is a very valid and concise expression. Even Python itself uses it as such, e.g. by returning `None` from a function not returning anything explicitly.

As for argument default values, I don't care much whether for example an empty string `""` or `None` is used, as long as the intention is clear. That means probably an `Optional` type hint in most cases, but I think reading `def frobnicate(foo: str, bar: str = None)` is very clear, and adding `Optional` to that only makes it longer without real benefit. IDE code parsers do usually understand that a keyword argument with a default value can be omitted. In this case, `def frobnicate(foo: str, bar: str = "")` is slightly less obvious, but also fine with me. Any code relying on `None` and `""` being different things here is ill-designed anyway.
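For illustration, a minimal comparison of the two spellings (note that PEP 484 deprecates the implicit-`Optional` form, and newer mypy releases flag it by default, so the explicit spelling is the safer choice):

```python
from typing import Optional

# Implicit form discussed above; PEP 484 deprecates it and newer
# type checkers reject it under default settings:
def frobnicate_implicit(foo: str, bar: str = None): ...

# Explicit form, accepted by all type checkers:
def frobnicate(foo: str, bar: Optional[str] = None) -> None: ...
```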
In your fork, why do you have the `from __future__ import annotations` thing? Isn't that a Python 2-only thing, which we're not aiming for in new code anymore?
`None` (or any other falsy expression) is a valid way to declare optional values, and it is used quite often in canopen. However, dealing with multiple or optional types also requires the code to handle all variants of the values in all places.

Let me give an example: `NmtBase` doesn't initialize `self.network` with a real value, but sets it to `None`. The proper value of `self.network` is injected later from e.g. `RemoteNode.associate_network()`. Functionally this is fine. Calling `NmtBase.send_command()` while `self.network` is `None` will crash the program. But the type checker complains about this construct, since calling `self.network.something()` when `self.network` is `None` is invalid. The principal question here becomes: is this how we want it? If it is `None`, let it crash? Let the type checker complain. Or is the type checker pointing out a valid weakness in the code?
```python
class NmtBase(object):
    def __init__(self, node_id: int):
        self.id = node_id
        self.network: Optional[Network] = None

    def send_command(self, code: int):
        super(NmtMaster, self).send_command(code)
        logger.info(
            "Sending NMT command 0x%X to node %d", code, self.id)
        assert self.network  # <-- Needed to instruct the type checker to ignore None
        self.network.send_message(0, [code, self.id])  # <-- Can't do this without checking self.network
```
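For comparison, a minimal sketch of the explicit-check alternative for the same method (the exception type and message are illustrative, not canopen's actual behavior); it fails cleanly at runtime and narrows the type for the checker just like the assert does:

```python
    def send_command(self, code: int) -> None:
        if self.network is None:
            raise RuntimeError("Node is not associated with a Network")
        # From here on, type checkers narrow self.network to Network
        self.network.send_message(0, [code, self.id])
```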
Using `from __future__ import annotations` gives postponed evaluation of type annotations. E.g.

```python
class NmtSlave(NmtBase):
    def __init__(self, node_id: int, local_node: LocalNode):    # With __future__ import annotations
    def __init__(self, node_id: int, local_node: "LocalNode"):  # Without
```

It has been optional since py3.7 and has not yet been made mandatory. The feature is described in PEP 563. See https://docs.python.org/3/library/__future__.html
Go for it! Static analysis will definitely find some issues, but hopefully the fixes are fairly easy like asserting that attributes are not None like in your example.
What are reasonable acceptance criteria for type annotations in canopen? What is the definition of done? Which tools should we aim for compliance with (and with what settings)?
I usually use mypy and pyright (the latter because I'm using VSCode) for type checking, and pylint and/or flake8 for general linting. As pointed out in #346, pylint is a commonly used tool.
I've been working a lot with type annotating canopen lately, as can be found in PR #360. It has taken a lot more time than I expected, but I hope it can be of use. canopen uses a lot of dynamic types in variables, and it took much work to figure that out. It is mostly done, but there are a few remaining errors to deal with. Please review the PR and give comments, large or small.
I've collected a series of comments and questions. I'm sorry for the length of this post, but it would help a lot if we could get some insight into this.
This is general type annotation advice: when using variables that can hold multiple types, e.g. `int`, `str` or `None`, it is important to always expect all variants of the variable in the code. The type checker has (rightfully so) problems understanding which type might be passed, and complains a lot if assumptions are made.

There is a lot of this in canopen, and this is what has taken the most time to resolve. It's also harder to code with, because one has to test everything all the time. I'm sure there are lots of `Optional` uses in canopen that can be removed, but it's not always clear if the optional value serves a function or is merely a convenience or habit.
Due to the dynamic nature of (canopen OD) objects (variable, array, record), and `.raw` possibly returning all sorts of object types, it is inherently difficult to properly type annotate SDO requests.
```python
# This creates errors in type checkers
var: int = sdo[0x6061].raw

# This is what's needed to do full type checking
obj = sdo[0x6061]                 # obj might be Variable, Array, Record
assert isinstance(obj, Variable)  # Limit type to Variable
data = obj.raw                    # data might be bool, int, float, str, bytes
assert isinstance(data, int)      # Limit type to int
var = data                        # Now it's guaranteed that var is int
```
Is the current design too flexible? It is impossible for the type checker to know that OD `0x6061` is a Variable (it might not be, if the object dictionary is misconfigured), nor can the type checker understand that `sdo[0x6061].raw` returns an int. What about having explicitly typed methods? I know it's not everyone's cup of tea, but it's explicit and it can be type checked. It can also be checked at run time, so if the OD is wrong, it will clearly give an error. I suppose the fundamental question is: what is expected to happen if the parameter isn't a `Variable`, or if the type returned by `.raw` is not int?
Some quick proposals as food for thought:

```python
sdo[0x6061]                # Only returns `Variable`. Fails if used on an `Array` or `Record`
sdo.get_container(0x1234)  # Returns `Array` or `Record`. Fails on `Variable`
sdo.get_variable(0x6061)   # Named methods for getting variable only
sdo.get_array(0x1234)
var.raw_int                # Only returns int. Fails if any other types are returned
```
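As an illustration of the `get_variable()` proposal above (a hypothetical method, not existing canopen API), such an accessor could combine a runtime check with a narrowed return type:

```python
def get_variable(self, index: int) -> Variable:
    """Return the Variable at index, raising if the OD entry is not one."""
    obj = self[index]
    if not isinstance(obj, Variable):
        raise TypeError(
            f"Object 0x{index:04X} is {type(obj).__name__}, not Variable")
    return obj
```

With that, `sdo.get_variable(0x6061).raw` is known to come from a `Variable`, while the broad type of `.raw` itself is what a typed accessor like `raw_int` would then address.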
The encoding of a parameter is stored in `objectdictionary.Variable.data_type`, and the encoding is done by `Variable.encode_raw`. However, the input type of the value to this function is very broad. I have added a runtime type check that will raise `TypeError` if there is a mismatch between `data_type` and the expected type.
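A rough sketch of such a runtime guard, assuming a lookup table from OD data-type codes to Python types (the constants are canopen's `objectdictionary` data-type codes; the table itself is illustrative):

```python
# Illustrative mapping from OD data-type codes to expected Python types
EXPECTED_TYPE = {
    BOOLEAN: bool,
    UNSIGNED8: int,
    REAL32: float,
    VISIBLE_STRING: str,
}

def encode_raw(self, value):
    expected = EXPECTED_TYPE.get(self.data_type)
    if expected is not None and not isinstance(value, expected):
        raise TypeError(
            f"Expected {expected.__name__} for data_type {self.data_type}, "
            f"got {type(value).__name__}")
    ...
```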
In `objectdictionary.import_eds()` there is a rather nasty attribute injection hack that inserts `__edsFileInfo` into the OD object for storage. This is considered very smelly, and I think we need to find a better way to save these data.
Are there any use cases where initialization of `LocalNode` or `RemoteNode` is not made with a `node_id` value? It is declared as `Optional`, which allows `None`, but a lot of logic in these classes crashes if `node_id` is `None`.
How does the `variable.Bits` class work with the corresponding `objectdictionary.Variable.decode_bits()` and `.encode_bits()`? I fail to understand the logic behind the item get/set and the way the slicing works (if at all). Nor are there any examples of this in use, so any help would be appreciated.
1. `pdo.base.PdoBase.__getitem__()` is a bit inconsistent. For some key values it returns a `Map` type, for others a `Variable`. `PdoBase` is a Mapping containing `Map`, so `__getitem__()` deviates from the expected type.
2. `pdo.base.Variable.get_data()` and `.set_data()` don't handle float datatypes at all.
3. What about `objectdictionary.Variable.decode_phys()` and `.encode_phys()`? Should `decode_phys()` return int or float?
4. `.phys` is intended to give a numerical value. However, the current implementation will pass through any other types as well, leaving the type very broad. Could this be changed such that it fails unless it's a numerical type?
5. `objectdictionary.eds._convert_variable()` doesn't deal with BOOLEAN. Is that intentional?

We should break this up in multiple smaller issues and not forget about it. At least we fixed (5) with #445
It's not forgotten per se, but more recent activity supersedes much of this. It's an "in-between" task for me in PR #360, and it needs an update.
Sorry for not considering this more thoroughly in the past year. Other things needed my attention more urgently, but I think we're on a good path to ramp up type annotations in canopen gradually. Some questions from your list above have been answered as part of different discussions, and frankly I had forgotten about them here. If still needed, let me go back and summarize my stance (and current project practice) for each point raised. Which of these do you think still need clarification, @sveinse?
The hard part is not to type annotate our library: it's deciding how to deal with the type inconsistencies it uncovers, particularly surrounding implicit use of `None`.
So, how do we approach type annotation in canopen in general? Adding them via PRs is straightforward. Can we agree on which type checker to use, e.g. mypy? If we do, should we implement a GitHub action step for it? Are typing errors considered PR-blocking, or are they more of a "nice to have"? How do we validate type annotations?
It is extremely easy to get carried away in a deep, deep rabbit hole when starting down the type annotation path. The type annotation language has limitations, and Python is far more flexible. I know we will hit a few of them in canopen. When these "inflexibilities" are encountered, it's easy for the code to get littered with what I've dubbed "type checker aids" (e.g. asserts and/or type ignores) rather than functional code. TL;DR: It's very important that we agree on how deep or complete we're willing to go with type checking.
FWIW, here are my thoughts:
> So, how do we approach type annotation in canopen in general?

This is a broad question, but regarding how to add them, I would definitely suggest using gradual typing over several PRs, and definitely don't fall into the trap of mixing bugfixes (discovered by type annotations) and adding annotations.
> Can we agree on which type checker to use, e.g. mypy?

+1 for mypy
> If we do, should we implement a GitHub action step for it?

Yes.
> Are typing errors considered PR-blocking, or are they more of a "nice to have"?

Make it a required check and fix things right away. Using `mypy`'s gradual typing feature makes this very convenient. You don't need to enable strict mode until typing is complete.
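For context on what gradual typing buys (generic mypy behavior, not canopen-specific): with default settings, mypy only checks the bodies of annotated functions, so untyped code keeps passing while typed code is verified:

```python
def annotated(x: int) -> int:
    return x + "boom"   # error: annotated function bodies are fully checked

def unannotated(x):
    return x + "boom"   # no error by default: untyped bodies are skipped
```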
> How do we validate type annotations?

Can you expand on this?
> [...] When these "inflexibilities" are encountered, it's easy for the code to get littered with what I've dubbed "type checker aids" (e.g. asserts and/or type ignores) rather than functional code.

IMO, it's ok to add `assert`s temporarily, but then coupled with an issue and/or a comment that explains how to solve the particular issue properly.
An approach I've found usable is:

1. Add a `mypy` CI check (gradual typing)
2. Iteratively add annotations and fix the issues they uncover
3. Move `mypy` from gradual typing to strict mode

Of course, each iteration of step 2) involves possibly a series of PRs.
> So, how do we approach type annotation in canopen in general?
>
> This is a broad question, but regarding how to add them, I would definitely suggest using gradual typing over several PRs, and definitely don't fall into the trap of mixing bugfixes (discovered by type annotations) and adding annotations.

Agreed.
> Can we agree on which type checker to use, e.g. mypy?
>
> +1 for mypy

+1 from me as well.
> Are typing errors considered PR-blocking, or are they more of a "nice to have"?
>
> Make it a required check and fix things right away. Using `mypy`'s gradual typing feature makes this very convenient. You don't need to enable strict mode until typing is complete.

I doubt we will ever get to strict mode in canopen. It deals with flexible data types, and the current design is very hard to describe with type hints. E.g. `ODVariable.encode_raw()` accepts a value. Its behavior is not determined by the argument type (`isinstance()`), but by another variable (`self.data_type`). No current type checker is able to infer the correct type from this logic, and it will produce errors.
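A condensed sketch of the pattern being described (simplified names; constants follow the CiA 301 data-type codes): the set of valid types for `value` depends on a runtime attribute, which no static checker can infer from the signature alone:

```python
BOOLEAN, VISIBLE_STRING = 0x01, 0x09  # OD data-type codes (CiA 301)

class ODVariable:
    def __init__(self, data_type: int) -> None:
        self.data_type = data_type

    def encode_raw(self, value: object) -> bytes:
        # Which types are valid for `value` is decided by self.data_type at
        # runtime; statically, only the broad `object` annotation is visible.
        if self.data_type == BOOLEAN:
            assert isinstance(value, bool)
            return b"\x01" if value else b"\x00"
        if self.data_type == VISIBLE_STRING:
            assert isinstance(value, str)
            return value.encode("ascii")
        raise TypeError(f"Unsupported data_type {self.data_type}")
```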
> How do we validate type annotations?
>
> Can you expand on this?

If we choose mypy, then that's the answer.
> I doubt we will ever get to strict mode in canopen. It deals with flexible data types, and the current design is very hard to describe with type hints.

One possible solution for such problems is to introduce new APIs and deprecate the old ones. Another is to make breaking behavioural changes. Yet another is to let the status quo win and never enable strict typing.
> So, how do we approach type annotation in canopen in general? Adding them via PRs is straightforward.

Yes, agreed that adding via PRs in chunks is review-friendly.
> Can we agree on which type checker to use, e.g. mypy?

Also +1 for mypy from my side. Therefore we have consensus, and mypy it shall be.
> If we do, should we implement a GitHub action step for it?

I don't care much about this point. Sure, it's nice to see, but IMO type checkers are most valuable during coding.
> Are typing errors considered PR-blocking, or are they more of a "nice to have"?

Given the nature of this API, I would definitely not try to enforce it. Python is not a type-safe language, and this library embraces that fact by design. Which is a good thing IMHO, as it makes the code easy to use and mirrors the design of the CANopen protocols themselves. Making sure the passed / returned objects are of a certain type should be done outside the library, by the application code that knows which objects it tries to access. Any wrapper functions for specific types should be added on top of the current generic API. It should be really easy and not much additional code. But it would enlarge our library API surface quite a bit, which I don't think is worthwhile just because of type checking.
> When these "inflexibilities" are encountered, it's easy for the code to get littered with what I've dubbed "type checker aids" (e.g. asserts and/or type ignores) rather than functional code. TL;DR: It's very important that we agree on how deep or complete we're willing to go with type checking.

I'd usually avoid those "type checker aids", rather treating the type hints as a helpful, best-effort option. But that needs to be decided on a case-by-case basis.
ISTM a gradual typing CI check would be a good compromise; you'd type up the parts of the code base that make sense, and a lenient mypy check should be able to help you keep those functional without being too much of a nuisance. I believe a CI check is important, as type annotations are susceptible to bit rot, like any other code snippet. Without CI checks, you risk that they become incorrect, which can end up being a problem for users of this library.
Accepted, since mypy seems to explicitly support such a use case. Go ahead and PR the check configuration if you like. I've never really touched all the CI and GitHub related parts.
What I'd like to do next is to add an absolutely minimal mypy configuration to canopen, which I'm working on now. mypy has excellent guides on how to approach existing projects: https://mypy.readthedocs.io/en/stable/existing_code.html#existing-code
mypy throws ~107 errors right off the bat with a minimal configuration. If we were to introduce this into a GH action as a blocker, we would have to address all of those issues first.

If we choose not to make them blocking in GH actions, I think there is no point in running them in GH at all! Who inspects the action logs unless they fail?

The only reason I see is if it supports pointing out type errors related only to the current PR, so that we can be sure a PR doesn't introduce new type check errors. I don't know enough about the "gradual typing" feature in mypy, though, to tell if that's possible.
I don't know if it's possible to set up partial mypy checking of PRs, like codecov does for code coverage. I know commercial tools, such as SonarCloud, do that. To be able to do so, we'd need two sets of configurations: one for the whole project (which needs to be lenient in the beginning) and one stricter for a PR. I don't know if that's possible with actions or mypy.
Well, if that gets hard to get right, let's skip on the CI integration for now. But if @erlend-aasland sees value in it, I'm not opposed to adding the CI action (without enforcing the check).
All of this is possible using GH and mypy. I can look in to it in a couple of days, unless @sveinse beats me to it.
@erlend-aasland I'm curious if you have a setup for running incremental mypy and with stricter rules than the overall project.
I've found the mypy settings that give no errors. We can use this and remove the error exceptions one by one while fixing the code base. Not sure if that's the best approach for doing this in steps, though.
```toml
[tool.mypy]
files = "canopen"
strict = false
ignore_missing_imports = true
disable_error_code = [
    "return-value",
    "var-annotated",
    "attr-defined",
    "union-attr",
    "annotation-unchecked",
    "operator",
    "arg-type",
    "assignment",
    "override",
    "index",
    "return",
    "has-type",
    "no-redef",
    "misc",
]
```
What good is a checker tool if all its warnings are silenced? I'd rather not silence them, but accept the fact that it shows a red light as long as there is something to fix. Remember, the goal is not the green light, but a working code-base with a decent amount of support for the users, so they can spot errors before the application hits them at runtime.
A silenced tool has no value, but it works as a strategy for enabling options incrementally, one by one, if we think the PR for getting all 101 errors fixed in one go is a bit too much.
But I want to see the errors while fixing them. Sorry, but silencing warnings is the opposite of right in my experience. Keep them visible, work on fixing bit by bit. It helps nobody to see a green light when it was reached by looking away.
Overall, I think this is getting too focused on the tools rather than the workpiece. Let's spend time (which isn't abundant) discussing what needs fixing instead of how to configure the tools.
I was under the assumption that we'd want them plugged into the CI/CD, and that we're starting from scratch, not using the tools yet. We need to either put the bar so low that everything passes and tighten it PR by PR, or do a rather large PR to get the level up to what we consider the acceptable minimum.
I repeat my opinion that there is no use for running mypy in CI/CD unless it fails the build. Otherwise it'll never get used.
...if we don't agree on that ambition, perhaps we should begin there.
I'm not pushing for mypy to be included in CI runs. But if we do, then it should show its true result, not just pass because we basically told it to lie to us. What's the benefit of having it if any possible warnings are hidden anyway? Newly introduced errors will not be noticed, and it gives a false sense of compliance seeing these checks pass although nothing is actually fixed yet. There will be no sign of progress either, because we start off with a green light.
I do use mypy locally already. But I find it's okay to see some warnings there while focusing on different work. Sure, they should be fixed at some point, but they shouldn't distract from ongoing work and the fact that we do have a working code-base even without perfect typing. Remember, it's just helpful pointers to avoid pitfalls caused by wrong assumptions.
Summing up, we should enforce CI checks once we have a state that passes without any silencing options. The road to that point either has them enabled but not enforced (accepting the red flag), or only running mypy locally. The former has the advantage that the analysis results are easy to look at without checking out the affected branch and running locally. If someone goes through that effort, they might as well fix what they see, since the IDE will already be open.
OOI what settings do you use where you only get warnings from mypy?
Sorry, I should have been more precise. I didn't mean to differentiate between errors and warnings. To me, both categories are just "warnings", because the code works nonetheless. So whenever I speak about warnings in type checking, I also mean errors.
I don't use any special mypy configuration, only what we have in `pyproject.toml`, which does not have anything mypy-related AFAICT. Which is fine as far as I'm concerned; the default checks should be what we target. Maybe `ignore_missing_imports` for some specific packages would make sense, but I can live with the message here as well, when in fact I don't have the required type information available.
@erlend-aasland do you have any experience with using mypy in GH actions? I know it's straightforward to have it run as an action job, and that it must pass without errors for the CI/CD flow to pass. But do you know of any system where the errors/warnings are fed back to the contributor?
I use `mypy` with GitLab at $WORK, and we've been using several `mypy` checks with GHA over at CPython[^1].
> I know it's straightforward to have it run as an action job, and that it must pass without errors for the CI/CD flow to pass.

Note that you can configure your GitHub project so it's not a required check. It does not need to pass without errors in order for the CI to be green.
> But do you know of any system where the errors/warnings are fed back to the contributor?

IMO, examining the CI logs is sufficient, but I'm not sure if I understood your question correctly.
[^1]: feel free to head over to the CPython repo and take a look
canopen's type annotation support is very rudimentary. My opinion is that type annotation is equally valuable to a project as unit testing: it helps determine the expected behavior and it uncovers errors. Type annotation is not just "sticking some annotations on things"; it imposes a stricter, perhaps more conscious, way of coding. A typical example from this project is the use of `None` as a default value. Type annotation checks will tell you that you need to check the value everywhere before using the actual value. See #513
We need to agree on what level of annotation support we aim for. E.g. mypy strict support is a very different ballgame than using mypy for development guidance only and allowing errors to remain. For reference: our current master gives 106 mypy errors in standard-rules mode and 630 errors in strict mode. Furthermore, I strongly believe that type annotations and functional changes go hand in hand, not separately, much like how functional changes and unit testing go together. E.g. PR #513
I've been spending a lot of my spare time on improving the type annotation support in canopen. I've tried a few different approaches through various PRs already, but none of them seems to resonate. They are too big. While I agree that we need to go through the discussions this unlocks, I'm really afraid that going very granular is going to slow progress to a crawl. Going for type annotation is a big job. Yet, it'll take a long time making bread from individual grains. So perhaps full type annotation support is not within our ambition?
I've come to a point where I don't know how to proceed, and there is certainly no point in wasting your time on PRs that won't get accepted. I'm sorry, but I'm strongly considering ceasing my type annotation efforts. Since we want to go very granular on the PRs, I'm genuinely afraid it'll take more effort than I'm able to offer up in my spare time.
> Yet, it'll take a long time making bread from individual grains. So perhaps full type annotation support is not within our ambition?

I OTOH think it is a reasonable goal. With a code base of this size, I'd expect it to take between 5 and 8 months, given the number of currently active contributors.
Thanks @sveinse for your efforts so far. I think they are definitely valuable and not wasted. The fact that so far it hasn't resonated much is from my perspective a misunderstanding / disagreement on how to approach the thing overall. Yes, #456 by @friederschueler was put down pretty quickly, unfortunately! Thanks as well for that effort, it was just too hard to review. We can still use it to see how @friederschueler had solved some problems, without re-inventing the wheel while doing smaller steps.
And for my part, I haven't had enough time to actually look at the recent suggestions. Too much going on in this project with better tests, fixes, etc. that I just deferred further typing efforts a little. But I do intend to get back to them of course. Yes, it will take a lot more time to get typing up to an acceptable level in this library.
Here are the next steps I had in mind:

- Add a GHA check (but not required!) that gives us the relevant log output from mypy, to actually have some common error and warning messages to discuss over.
- We SHOULD NOT add any elaborate mypy configuration to the project; just leave it running at default settings in the beginning. Maybe some settings just so it finds the required modules easily. If we further discover that some things cannot be solved in the code, any needed silencing settings will come in later as part of a PR, after discussion about the other (im)possible ways to solve the issue.
- This means ACCEPTING that a red light will be flashing for the foreseeable future in our GHA check. Which is the same that anyone gets by running mypy locally. And it's okay; it just shows publicly how much work is still needed to fix the existing type annotations / code.
- My personal first step PR would be to add a `-> None` to all class constructors which have some member annotations inside (see the sketch after this list). That will get rid of the warnings about non-annotated functions not being checked. And it will instead cause a lot more errors to be reported. Which is not only OK, but a Good Thing, because it raises visibility of what needs to be done.
- I'm not sure it makes sense to completely decouple added annotations from actual code fixes. Sure, that's a nice approach if your project is only to "add typing information". Whether you first fix the code or first add all annotations doesn't matter much - it always results in a situation where one part has to be essentially completed before meaningfully working on the other. Even if not all parts are committed / upstreamed yet.
- Wherever someone starts working on this topic, try to hit the "Create Pull Request" button before getting carried away. I know it's tempting to get rid of "all the warnings" before proposing the work for inclusion. But it's not necessary, and it results in endlessly hard to review mega-patches. So please don't try to reach perfection too early.
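A minimal sketch of that first-step change (class names arbitrary), relying on mypy's default behavior of skipping unannotated function bodies:

```python
from typing import Optional

class Network: ...  # stand-in for canopen.Network

class SomeNode:
    # Without "-> None", mypy treats __init__ as untyped and skips its
    # body; with it, the member annotations below are actually checked.
    def __init__(self, node_id: int) -> None:
        self.id = node_id
        self.network: Optional[Network] = None
```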
My vision of this process is to really embrace these facts: Right now we have about 5 or 10 percent "OK". The rest is missing (not annotated) or buggy (needs code adaptations). The more we annotate, the more there will be to fix. But that doesn't mean that any annotation must also immediately result in fixing all newly introduced warnings. They can very well be left piling up for some later PR or even someone else to fix them. But this way we will make progress, in small steps. There is a lot of space between our current state and whatever 100 percent we aim for. Let's reach it incrementally.
That means in essence: I don't want anything silenced, but to see the stuff we are talking about. I want discussions about concrete improvements, whether they just add annotations or also add fixes. But not too big a chunk; rather stop at some point early and let that be reviewed and merged. See #451, which went through with little discussion and didn't make anything perfect, but improved a small part in a comprehensible, non-controversial way. That's roughly the size I'd be comfortable reviewing regularly.
Does that sound like a plan? Who's up for bullet number 1 to add the ugly red light?
P.S. All of these changes should be of very low risk, as we will mostly add extra safety checks. As long as the code itself keeps working, just without these added precautions, we're not worse off than right now where stuff will simply explode at runtime when stepping on a type-confusion mine.
P.P.S. And I don't think we will target strict mode anytime soon, if ever. That would be a goal for a different API, not this one with its heritage and exploiting dynamic typing for ease of use as we currently do.
> - Add a GHA check (but not required!) that gives us the relevant log output from mypy, to actually have some common error and warning messages to discuss over.

I humbly disagree. I think it's a better approach to get the project up to a minimum level and begin tightening the reins from there. PR #512 is doing precisely that, where it starts at the mypy default minimum. The objective: get to green, like we do for any pytest checks.

How do we easily detect when new typing errors are introduced? With a test that normally passes, we'd see it immediately. If it's in a list of 100 other errors, it'll drown.
> Who's up for bullet number 1 to add the ugly red light?

Just running it as "not required" doesn't create an ugly red light, does it? AFAIK the only way to get a red light is to fail the job, and a red light will block any merge, won't it? Maybe it's possible to run it in a separate GHA workflow, and to update the branch protection rules NOT to require that job to succeed? Is this possible, @erlend-aasland?
> - We SHOULD NOT add any elaborate mypy configuration to the project; just leave it running at default settings in the beginning.

Yes, this is the intent of #512.
> - My personal first step PR would be to add a `-> None` to all class constructors which have some member annotations inside. That will get rid of the warnings about non-annotated functions not being checked. And it will instead cause a lot more errors to be reported. Which is not only OK, but a Good Thing, because it raises visibility of what needs to be done.

mypy has an excellent guide on how to migrate existing projects from default settings towards stricter ones: https://mypy.readthedocs.io/en/stable/existing_code.html. I'm following the approach it proposes with the start in #512. IMHO, going for untyped `-> None`s is not the way I'd start, though. It'll generate a massive amount of errors. The basic project should be passing first before we tackle that obstacle. ...but again, it boils down to the decision whether it's OK to commit changes that introduce mypy errors, and how to detect them.
> - I'm not sure it makes sense to completely decouple added annotations from actual code fixes. Sure, that's a nice approach if your project is only to "add typing information". Whether you first fix the code or first add all annotations doesn't matter much - it always results in a situation where one part has to be essentially completed before meaningfully working on the other. Even if not all parts are committed / upstreamed yet.

I can see that it would be valuable to do "case"-based improvements, e.g. the Network fix as proposed in PR #513. This is where annotations and functional improvements go hand in glove.
> - Wherever someone starts working on this topic, try to hit the "Create Pull Request" button before getting carried away. I know it's tempting to get rid of "all the warnings" before proposing the work for inclusion. But it's not necessary, and it results in endlessly hard to review mega-patches. So please don't try to reach perfection too early.

Well, I have some experience with the road ahead, and I erroneously thought that #512 was a slow and gentle start, because in the volume of changes to come it is small. Yet, as said, this PR is too broad and needs to be split up into pieces. #512 is definitely not a perfect fix. It's a starting point.
> My vision of this process is to really embrace these facts: Right now we have about 5 or 10 percent "OK". The rest is missing (not annotated) or buggy (needs code adaptations). The more we annotate, the more there will be to fix. But that doesn't mean that any annotation must also immediately result in fixing all newly introduced warnings. They can very well be left piling up for some later PR or even someone else to fix them. But this way we will make progress, in small steps. There is a lot of space between our current state and whatever 100 percent we aim for. Let's reach it incrementally.

Agreed. But I hope, not too slow.
> P.S. All of these changes should be of very low risk, as we will mostly add extra safety checks. As long as the code itself keeps working, just without these added precautions, we're not worse off than right now, where stuff will simply explode at runtime when stepping on a type-confusion mine.

This is why it is so great that the test coverage and unit tests are being improved in parallel! Yet, at some point we have to break an egg to make the omelet. If we are to fix and improve the library, we must be able to look forwards, not just curate the existing. And by this I don't mean that we should disregard stability, but we need to allow ourselves to be more forward-looking than history-gazing. My 2 cents.
> P.P.S. And I don't think we will target strict mode anytime soon, if ever. That would be a goal for a different API, not this one with its heritage and exploiting dynamic typing for ease of use as we currently do.

I agree. It'd be a stretch to get to strict. We'd have to accept far more functional changes than what we've deliberated so far.
I propose we do the following:
We are several people actively working on it, and that's really awesome. It's cool working on a project with many contributors 💪
Yes, we agree about the disagreement :-)
> How do we easily detect when new typing errors are introduced? With a test that normally passes, we'd see it immediately. If it's in a list of 100 other errors, it'll drown.

I see a logic error in your analysis. If the test normally passes because it is muted, then it will NOT raise a flag on new errors. They will simply be muted as well. The warnings are there to be seen and fixed, not to be hidden. Better to concentrate on getting that list of 100 errors shorter. But we need a list for starters.
> Just running it as "not required" doesn't create an ugly red light, does it?

What I mean should look like this: there are required and non-required checks. An optional one does get a red cross, but doesn't fail the CI run completely. (Source: https://github.com/syncthing/syncthing/pull/9175)
> ...but again, it boils down to the decision whether it's OK to commit changes that introduce mypy errors, and how to detect them.

My stance on that is very clear: it is OK (right now) because it is the status quo (many typing failures), but we do release working code nevertheless. We don't need to raise that bar from many failures to zero in one single step. Let there be errors, as they still do not affect the code functionality. It's just a linter.
As a reference, have you ever checked out what the Debian GNU/Linux distribution does with their `lintian` tool? The package tracking database shows warnings for many packages. That's good, because it shows people what improvements need to be worked on.
Regarding your proposal #512: sorry, I simply haven't looked at it yet. I need some time, which I didn't find today. That doesn't mean it's not considered at all. But it is more than my outlined first step, i.e. it goes too far yet. I just think this discussion is the bottom line of defining how we want to proceed, therefore I spent my time here instead of looking at the PR. To get the basics squared away and agreed upon.
Let's try the American approach in this case: move fast, break stuff. Whoever first finds an opportunity to dive into GitHub checks configuration, please proceed with adding an optional mypy check, as outlined above. If it gets too annoying or problematic, we can always just back out of it again. In the meantime / additionally, everyone is free to run mypy locally as they see fit, with or without any silencing options.
> > How do we easily detect when new typing errors are introduced? With a test that normally passes, we'd see it immediately. If it's in a list of 100 other errors, it'll drown.
>
> I see a logic error in your analysis. If the test normally passes because it is muted, then it will NOT raise a flag on new errors. They will simply be muted as well. The warnings are there to be seen and fixed, not to be hidden. Better to concentrate on getting that list of 100 errors shorter. But we need a list for starters.

No worries, I won't hold it against you that you haven't looked at the PR yet 👍. FTR, the PR doesn't mute anything. Nor does it suppress any errors; it uses the default settings of mypy. And the proposed PR (albeit too functional-y) fixes all of the open issues mypy addresses. When that PR is done, we will be compliant with and passing the mypy defaults. THEN we can start tightening the rope by enabling new settings that draw us towards strict(er) compliance.
> > Just running it as "not required" doesn't create an ugly red light, does it?
>
> What I mean should look like this: there are required and non-required checks.

Ah nice. Perfect.
> > ...but again, it boils down to the decision whether it's OK to commit changes that introduce mypy errors, and how to detect them.
>
> My stance on that is very clear: it is OK (right now) because it is the status quo (many typing failures), but we do release working code nevertheless. We don't need to raise that bar from many failures to zero in one single step. Let there be errors, as they still do not affect the code functionality. It's just a linter.

Type annotation is not just a linter; it's a methodology. It's a more restrictive application of Python. Conceptually it's the same as applying black to Python: black adds opinions on how things shall be formatted, and type annotations do the same.
> Regarding your proposal #512: sorry, I simply haven't looked at it yet. I need some time, which I didn't find today. That doesn't mean it's not considered at all. But it is more than my outlined first step, i.e. it goes too far yet. I just think this discussion is the bottom line of defining how we want to proceed, therefore I spent my time here instead of looking at the PR. To get the basics squared away and agreed upon.

PR #512 contains two things: a set of annotations and a set of functional changes. From the discussions, I agree it should be divided into smaller groups of changes. However, I stand by the goal and intended end result of implementing the essence of PR #512: getting mypy passing into the green with default settings.
> Let's try the American approach in this case: move fast, break stuff. Whoever first finds an opportunity to dive into GitHub checks configuration, please proceed with adding an optional mypy check, as outlined above. If it gets too annoying or problematic, we can always just back out of it again. In the meantime / additionally, everyone is free to run mypy locally as they see fit, with or without any silencing options.

Yes, we need the GHA in place here anyway.
> Type annotation is not just a linter; it's a methodology. It's a more restrictive application of Python. Conceptually it's the same as applying black to Python: black adds opinions on how things shall be formatted, and type annotations do the same.

Precisely, it concerns only the how, but not what the code does. That's what I meant.
> However, I stand by the goal and intended end result of implementing the essence of PR #512: getting mypy passing into the green with default settings.

Of course, I don't think the end result was ever disputed. Just the size of the first step(s).

What I want is to see the progress made, i.e. run mypy in GHA so everyone looks at the same output. Once that is public, any improvement can easily be measured against that baseline.
FTR, the project owners must configure GHA checks as not required in the project settings; I cannot do this with a PR. I can help with other CI related stuff, though.
Regarding pace; I don't see any reason to rush things. We'll get there eventually.
I'll be focusing on test coverage for now; IMO, that's the most important task.
I'd like to make a progress-tracking issue like #514 for type annotations, and I started outlining one. However, I stopped because I think we do not agree on its definition of done[^1], so stating the issue objective and its tasks is not yet possible. Moreover, if there is little interest and/or capacity to consider type annotation improvements in parallel with the test improvements, there's no point in doing this work now.
[^1]: The first goal should to my opinion be the state which PR #512 takes us to, but we might follow a different path to achieve it.
> We SHOULD NOT add any elaborate mypy configuration to the project; just leave it running at default settings in the beginning.

As a minimum, you will need to configure the minimum supported Python version for your annotations, so something like this:

```
python_version = 3.8
```
Good catch. This is a comment that should have been posted in #512. Luckily, adding this to the mypy config didn't change anything. Everything in that PR is compatible with 3.8. 😓
I closed #456 because it was a mess with too many changes, as we agreed. I am currently very busy in my day job, but I plan to submit some single PRs from this work.
I'd like to revisit one option, discussed previously in this thread, namely how to deal with all the optional members etc.
Instead of a blanket change sprinkling the code with (potentially) performance-degrading behavioural changes (`if ...: raise`), there is a simpler variant: use `assert`s. I'd say this is not more of a code smell than the blanket approach currently being suggested.
Pro:

- easy to `git grep` for these asserts to remove them later

Con: code smell (but so is the currently proposed alternative ⚖️)
Using `if ... raise`, OTOH, is a lot worse:

Con:

- harder to `git grep` for later cleanup

Pro: ?
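For concreteness, a small side-by-side of the two variants (class and message are illustrative); note that `assert` statements are stripped entirely when Python runs with `-O`, which is where the performance argument comes from:

```python
from typing import Optional

class Network:  # stand-in for canopen.Network
    def send_message(self, can_id: int, data: list) -> None: ...

class NmtBase:
    def __init__(self, node_id: int) -> None:
        self.id = node_id
        self.network: Optional[Network] = None

    def send_command_assert(self, code: int) -> None:
        # assert variant: narrows the Optional for the checker, easy to
        # grep for, and removed under `python -O`
        assert self.network is not None
        self.network.send_message(0, [code, self.id])

    def send_command_raise(self, code: int) -> None:
        # if/raise variant: always enforced at runtime, explicit error
        if self.network is None:
            raise RuntimeError("Node is not associated with a network")
        self.network.send_message(0, [code, self.id])
```

Both variants narrow `self.network` from `Optional[Network]` to `Network` for the type checker; they differ only in runtime behavior.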
I urge you to reconsider this, as I believe using `assert`s will make for smaller and more focused PRs, resulting in a faster pace and less review friction. Since everyone agrees `assert`s are a code smell, it should be an easy task to address them separately in focused PRs at a later point in time.
What do you think?
With reference to the discussion in #339 regarding type hints, I have started looking at the work needed for type annotating canopen. What are the overall ambitions with regards to implementing type hints? What level of hinting and static code analysis are we aiming for? What tool? mypy, flake8, pyright/pylance?
I'm a big proponent of type-annotated Python. It helps development, navigation in the editor works, and it shows the expected types for calls and operations, thereby reducing bugs and misunderstandings. I think adding typing to canopen is the right thing to do. The benefits outweigh the disadvantages.
Having a fully compliant type-annotated library will reduce the degrees of freedom in the codebase. You can do more "tricks" with Python than a linter is able to understand, and thus coding for it requires a degree of code conformity. canopen relies on a lot of duck typing and dynamic types (especially variables that conditionally contain `None`). This results in quite a few warnings and/or code constructs to avoid the warnings - and rightly so, as the code in many cases is not executing any type checks to verify the datatype before invoking an operation. Lastly, many of these linting tools are very opinionated on code structure, so we must expect some functional changes.

I've given a stab at implementing type annotation in canopen (see link below). I'm using mypy and pyright. After doing this, I think we will have to face a few discussions on the correct and intended behavior of a few details, e.g. the use of `None` as an optional value.

https://github.com/sveinse/canopen-asyncio/commit/4e8925e1be30fe6c3101e4f050579dbed989ecbb