Open arporter opened 9 months ago
@rupertford writes:
I don't know about the documentation either so can't comment. As for 2) using 3.7+ dataclasses seems to be reasonable rather than things that are not in the standard library.
Attrs
(https://www.attrs.org/en/stable/) has been around a while now and is presumably pretty stable. It offers the significant advantage of allowing the user to provide validation methods.
I don't know anything about dataclasses etc, so can't contribute much. I couldn't even find where we are using it :(
I don't know anything about dataclasses etc, so can't contribute much. I couldn't even find where we are using it :(
We're not yet :-) @oakleybrunt has PR #2321 which currently uses a dataclass but we need to decide if that's the option we like the best.
The documentation generated by Sphinx AutoAPI for Oakley's new class looks like: So the type information is included and we could therefore agree to drop it from the class docstring.
Doxygen produces the following: It would be good if the documentation of the parameters appeared in the "Static Public Attributes" section really.
You can document the parameters but it has to be done as comments beginning with ##
:
Such comments are a Doxygen thing though and are not picked up by AutoAPI.
If I switch to using attrs
define
decorator:
which is pretty much the same as before.
Unfortunately, the Doxygen version is less good:
Regarding the dataclass vs attrs vs pydantic discussion, I think any of the 3 are better than namedtupes due to them generating documentation but dataclass can not replace our typical usage of classes because it does not check type values at runtime. So:
@dataclass
class LFRicArgStencil:
name: str
obj = LFRicArgStencil()
obj.name = 3
is valid.
The best we can do is instead use: @dataclass(frozen=True)
So obj.name = 3
will not be allowed, but also not obj.name = "another_name"
However, if we want to substitute our classes, attrs allows:
@attr.define
class LFRicArgStencil:
name: str = attr.field(validate=_validate_name, on_setattr=_validate_name)
which in simple cases like this we don't need to implement the validation function as attrs comes with attr.validators.instance_of(str)
In general is a good idea to not allow state to be modified (but we do it often). Maybe for an initial step it is enough if we use dataclasses always with frozen=True for the cases that we forbid to mutate the state, and we can decide to use another library than simplifies our validation later on?
I guess if all we're doing is replacing namedtuple then having frozen=True
is no problem? I like the fact that pydantic provides a decorator that will type-check argument calls for you but that's heavier weight. (I'd always assumed that type hints got used for that kind of thing automatically otherwise what's the point?)
BTW, the doc produced for the pydantic equivalent is not so nice either:
I'd always assumed that type hints got used for that kind of thing automatically otherwise what's the point?
Unfortunately python is too dynamic to avoid it to trick it in some way, so type hint can not be used for optimizations or runtime guarantees.
They are only for static tools, like documentation generators, linters, ...
The documentation generated by Sphinx AutoAPI for Oakley's new class looks like
I was hoping we could also remove the duplicated type inside the docstring, something like in: https://sphinx-autoapi.readthedocs.io/en/latest/how_to.html#how-to-include-type-annotations-as-types-in-rendered-docstrings
The documentation generated by Sphinx AutoAPI for Oakley's new class looks like
I was hoping we could also remove the duplicated type inside the docstring, something like in: https://sphinx-autoapi.readthedocs.io/en/latest/how_to.html#how-to-include-type-annotations-as-types-in-rendered-docstrings
That looks good. Ideally we'd do away with building the Doxygen version and just stick with Sphinx AutoAPI but I still prefer the output of Doxygen :-)
But the doxygen still looks good for method signatures with type hints? Was your "##" comment specifically about dataclass fields?
But the doxygen still looks good for method signatures with type hints? Was your "##" comment specifically about dataclass fields?
Yes, it was just for class attributes.
Didn't we already had this problem before? I think I have seen some "#:" comments that I thought it was to add documentation to class variables. Did this only work for one documentation?
Yes, sorry, I didn't mean to imply it was a new problem. I was just hoping that moving to a dataclass
might help with it.
For now, we've agreed to go with the 'standard' dataclass
. We have also agreed that such classes no longer need separate documentation of the :type: of parameters because the documentation tools get this from the type hints. I've updated the interface-markup rules (https://github.com/stfc/PSyclone/wiki/InterfaceMarkup) accordingly.
Could you update the example to use the required type-hint and dataclass usage? I've just tried using type-hints, and I noticed that the created documentation looks like this:
(i.e. the type info is not in the parameters list anymore). Is that expected, and is this what we should be doing?
I also tried specifying a PSyclone class as return type:
def get_parse_tree(self) -> :py:class:`fparser.two.Fortran2003.Program`
... in various combinations (with adding ":" in different spots). What should I do here?
Also, it would be great to add the 'expected' use of dataclass in this example.
I am happy to update the wiki, if someone confirms that what I did was what is expected :)
(i.e. the type info is not in the parameters list anymore). Is that expected, and is this what we should be doing?
Yes, that's expected and correct (since it now appears in the routine signature).
Currently, we're only anticipating using dataclass
in place of namedtuple
.
The ":py:class:" stuff is Sphinx markup, not Python. I think this will help: https://stackoverflow.com/questions/44664040/type-hints-with-user-defined-classes
So probably just -> Fortran2003.Program
if Fortran2003
has been imported.
@sergisiso writes:
Since we are starting using dataclasses and python type hints, I think we should decide how we use them in the whole codebase going forward:
Regarding type hints. If we annotate method signatures and dataclasses with them, do they appear in the generated documentation?, and if they do, should we start removing them from the docstring (leaving just the descriptions there)?
Regarding dataclasses, in the python ecosystem there are some competing libraries (e.g. attrs, pydantic), see https://stefan.sofa-rockers.org/2020/05/29/attrs-dataclasses-pydantic/#:~:text=Data%20classes%20use%20the%20optional,prohibit%20this%20for%20good%20reason. We should choose the one more convenient for us, some allow to annotate which class fields to use in equality, or frozen classes to avoid state mutability, which would allow us to remove some code and manual checks. Also note that any of them do automatic run-time type checking. We need to decide if we use validators or post_init hooks to keep our current functionality.