Closed qinzzz closed 2 years ago
Let's ignore pylint for the specific line.
Alternatively, you can use an `lru_cache` with `maxsize=1` to lazily construct the regex objects:
import functools

@functools.lru_cache(1)
def _get_unicode_regex() -> UnicodeRegex:
    return UnicodeRegex()

def bleu_transformer_tokenize(...):
    uregex = _get_unicode_regex()
    ...
Also, it seems that the `regex` package supports the `\p{...}` Unicode property syntax, and it's already a dependency. I haven't profiled it, but it seems to me that compiling a regex string like that should be much faster, and we might not need the lazy-construct trick at all.
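A minimal sketch of what the `regex`-based alternative could look like, assuming the patterns only need the Unicode punctuation (`P`) and symbol (`S`) categories. The pattern names and the `tokenize` helper are illustrative, not taken from the PR, and the speed-up has not been measured:

```python
import regex  # third-party package, separate from the stdlib `re` module

# \p{P} and \p{S} match any character in the Unicode punctuation and
# symbol categories, so no explicit character list has to be built.
NONDIGIT_PUNCT_RE = regex.compile(r"([^\d])(\p{P})")
PUNCT_NONDIGIT_RE = regex.compile(r"(\p{P})([^\d])")
SYMBOL_RE = regex.compile(r"(\p{S})")


def tokenize(text: str) -> list:
    """Hypothetical helper: split off punctuation and symbols,
    but leave punctuation between two digits (e.g. "1,000") alone."""
    text = NONDIGIT_PUNCT_RE.sub(r"\1 \2 ", text)
    text = PUNCT_NONDIGIT_RE.sub(r" \1 \2", text)
    text = SYMBOL_RE.sub(r" \1 ", text)
    return text.split()
```

Because the patterns are short strings, compiling them at import time should be cheap, which is why the lazy construction might become unnecessary.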
Merging #349 (304398f) into master (d98c2c2) will increase coverage by 0.00%. The diff coverage is 100.00%.
@@           Coverage Diff           @@
##           master     #349   +/-   ##
=======================================
  Coverage   80.37%   80.37%
=======================================
  Files         135      135
  Lines       11243    11247    +4
=======================================
+ Hits         9036     9040    +4
  Misses       2207     2207
Impacted Files | Coverage Δ | |
---|---|---|
texar/torch/evals/bleu_transformer.py | 97.87% <100.00%> (+0.09%) | :arrow_up: |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d98c2c2...304398f.
Only create a `UnicodeRegex` object when the function `bleu_transformer_tokenize` is called.