WorksApplications / SudachiPy

Python version of Sudachi, a Japanese tokenizer.
Apache License 2.0

Exactly match the dict split info id #156

sorami closed this 3 years ago

sorami commented 3 years ago

Fixes #155

e.g., without the `$` anchor, the pattern matches just the leading `1` of the line:

In [1]: import re

In [2]: text = "1,名詞,数詞,*,*,*,*,イチ"

In [3]: re.match(r'U?\d+', text)
Out[3]: <re.Match object; span=(0, 1), match='1'>

In [4]: re.match(r'U?\d+$', text)
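
The last call returns `None`, so IPython prints no `Out` line. The difference can be sketched as a standalone script. This is only an illustration, not SudachiPy's actual code: `ID_PATTERN` and `is_split_info_id` are hypothetical names, and `re.fullmatch` is shown as a possible alternative to the trailing `$`, since `$` also matches just before a trailing newline.

```python
import re

# Hypothetical names for illustration; not taken from SudachiPy.
ID_PATTERN = re.compile(r"U?\d+")


def is_split_info_id(token: str) -> bool:
    """Return True only if the whole token is an optional 'U' followed by digits."""
    return ID_PATTERN.fullmatch(token) is not None


line = "1,名詞,数詞,*,*,*,*,イチ"

# Unanchored: re.match only requires a match at the start of the string.
prefix_match = re.match(r"U?\d+", line)

# Anchored with `$`: the whole line must be an id, so this is None.
anchored_match = re.match(r"U?\d+$", line)

# `$` still accepts a single trailing newline; fullmatch does not.
newline_match = re.match(r"U?\d+$", "12\n")

print(prefix_match.group())      # the leading id-like prefix
print(anchored_match)            # None
print(is_split_info_id("U12"))   # True
print(is_split_info_id(line))    # False
```

Whether the stricter `fullmatch` behavior is wanted depends on whether the input can carry a trailing newline; for already-split fields the `$` form from this pull request behaves the same.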