mjpost / sacrebleu

Reference BLEU implementation that auto-downloads test sets and reports a version string to facilitate cross-lab comparisons
Apache License 2.0
1.03k stars 162 forks source link

Explicitly list the exported symbols #249

Closed dustalov closed 9 months ago

dustalov commented 9 months ago

Due to the lack of an explicit list of exported symbols in the __all__ variable, a strict mode of mypy and possibly other type checkers emit errors about implicit exports:

error: Module "sacrebleu" does not explicitly export attribute "CHRF"  [attr-defined]

This pull request addresses this problem by exporting all the symbols imported in the top-level __init__.py.

dustalov commented 9 months ago

Could you run the check-build workflow for this PR? I have not modified the check scripts, but did not run the action in my fork repository because I committed the code before I noticed that I had to approve the workflows.

martinpopel commented 9 months ago

The check-build is OK. @ozancaglayan and @mjpost: let us know if you have some comments, otherwise I will merge this PR tomorrow. I was not sure if some of the symbols are not internal or legacy, so that their usage should be deprecated (and after the deprecation period, we should remove them from this __init__.py). However, this question is orthogonal to this PR and I agree code using sacrebleu should not result in any mypy --strict errors caused by SacreBLEU.

@dustalov: Please delete also the # noqa: F401 comments. They won't be needed once all the symbols are explicitly exported (and thus used) in __all__.

@dustalov: Do you plan to run mypy also on sacrebleu (or only on your libraries which use sacrebleu)? If yes, we should perhaps also prevent another mypy error (of the same type) by replacing:

from .metrics import BLEU, CHRF, TER

with

from .metrics.bleu import BLEU
from .metrics.chrf import CHRF
from .metrics.ter import TER

Am I right?

dustalov commented 9 months ago

Thank you very much for merging this! Sorry for not responding quicker, my company runs an onsite gathering this week, which reduced my screen time dramatically.

I can submit the removal of # noqa: F401 comments in a separate PR, and I used mypy to check types of my code that uses SacreBLEU. My version passes the check well, but I have not tried it for the entire codebase. I think it should be a part of the CI process.

martinpopel commented 9 months ago

I understand, my time for SacreBLEU is also very limited. We could have another PR that would fix some/all of the errors/warnings in SacreBLEU reported by flake8 and mypy. In the end I've decided that such PR is not related to this PR, so I should not wait with merging.