Open PhilipMay opened 2 months ago
Also tagging @sgugger because we had this discussion on the linked issue.
Thanks for reporting!
I think it would be better to fix the type info in the docstring instead of removing it everywhere. wdyt @LysandreJik?
Hey @PhilipMay! We're very open to having PR fixes for these kind of issues indeed; it can't be exact everywhere but that's something that we can live with imo
We're very open to having PR fixes for these kind of issues indeed;
@LysandreJik and how exactly would you suggest to fix this? My suggestion would be to add proper typing when possible and then remove it from the docstring to avoid redundancies and contradictions. What do you think?
I am asking because @stevhliu suggests to "fix the type info in the docstring instead" (see above).
The two serve different purposes: the docstring is used in the documentation displayed on the website, whereas the exact type can be used by type checkers commonly found in IDEs
I don't think we want to remove one or the other, but identifying the types which are mismatching like you just did is worthwhile (and could be automated, if we identify it as a source of error worth automating)
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
the docstring is used in the documentation displayed on the website, whereas the exact type can be used by type checkers commonly found in IDEs
@LysandreJik Well, the exact type is also displayed on the documentation pages. Even if you do not specify and type in the docstring. This exactly creates the redundancy I would suggest to remove.
I don't think we want to remove one or the other,
Well, I would suggest to remove the type from the docstrings because the proper Python typing is also displayed on the doc pages.
cc @ArthurZucker as you're reworking the docstrings right now, WDYT?
Yeah, sounds good, I am relying on the typing anyways! See #33771
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
System Info
not relevant here
Who can help?
@stevhliu
Information
Tasks
examples
folder (such as GLUE/SQuAD, ...)Reproduction
Typing on some places is wrong and contradicting.
Examples:
The float type here is wrong:
https://github.com/huggingface/transformers/blob/78b2929c0554b79e0489b451ce4ece14d265ead2/src/transformers/models/mixtral/modeling_mixtral.py#L127
It should be
Union[torch.Tensor, int]
instead.As far as I know is that HF does not want to support MyPy - see https://github.com/huggingface/transformers/issues/17528#issuecomment-1145010337 - that is ok...
But nevertheless. Providing wrong type info is highly confusing and can cause problems in the code using transformers.
I also see contradicting type info. Example:
This line says
gate_logits: torch.Tensor
:https://github.com/huggingface/transformers/blob/78b2929c0554b79e0489b451ce4ece14d265ead2/src/transformers/models/mixtral/modeling_mixtral.py#L126
but the docstring says
Union[torch.Tensor, Tuple[torch.Tensor]
- see:https://github.com/huggingface/transformers/blob/78b2929c0554b79e0489b451ce4ece14d265ead2/src/transformers/models/mixtral/modeling_mixtral.py#L136
This is contradicting and confusing.
... and both is wrong. The correct type is:
Union[torch.Tensor, Tuple[torch.Tensor, None]
.I think there are many similar issues in the transformers lib. My suggestion is:
Remove the type info from all docstrings because it is redundant and needs additional afford to be maintained.
I would also suggest to use MyPy in some way to find (and fix) at least some type issues that can cause confusion.
Expected behavior
When type info is provided it should be correct and not contradicting the docstring.