Closed chezou closed 3 years ago
Thank you for the report!
We will think about how we should manage the dictionary; one idea we have is to use a config file instead of a symlink.
Good to know. If you need someone to test on Windows, I'd be happy to help!
I eventually found a way to create a symlink without administrator privileges on Windows 10.
As of Python 3.8, os.symlink() can create a symlink from an unprivileged account if Developer Mode is enabled.
See also the note at: https://docs.python.org/3/library/os.html#os.symlink
Here is the result of the example with Python 3.8 on Windows 10.
C:\Users\chezo\source\sudachi-test
λ python
Python 3.8.0 (tags/v3.8.0:fa919fd, Oct 14 2019, 19:37:50) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from sudachipy import tokenizer
>>> from sudachipy import dictionary
>>>
>>> tokenizer_obj = dictionary.Dictionary().create()
>>>
>>> mode = tokenizer.Tokenizer.SplitMode.C
>>> [m.surface() for m in tokenizer_obj.tokenize("国家公務員", mode)]
['国家公務員']
>>>
>>> mode = tokenizer.Tokenizer.SplitMode.B
>>> [m.surface() for m in tokenizer_obj.tokenize("国家公務員", mode)]
['国家', '公務員']
>>>
>>> mode = tokenizer.Tokenizer.SplitMode.A
>>> [m.surface() for m in tokenizer_obj.tokenize("国家公務員", mode)]
['国家', '公務', '員']
>>>
>>> m = tokenizer_obj.tokenize("食べ", mode)[0]
>>>
>>> m.surface() # => '食べ'
'食べ'
>>> m.dictionary_form() # => '食べる'
'食べる'
>>> m.reading_form() # => 'タベ'
'タベ'
>>> m.part_of_speech() # => ['動詞', '一般', '*', '*', '下一段-バ行', '連用形-一般']
['動詞', '一般', '*', '*', '下一段-バ行', '連用形-一般']
>>>
>>>
>>> # Normalization
...
>>> tokenizer_obj.tokenize("附属", mode)[0].normalized_form()
'付属'
>>> # => '付属'
... tokenizer_obj.tokenize("SUMMER", mode)[0].normalized_form()
'サマー'
>>> # => 'サマー'
... tokenizer_obj.tokenize("シュミレーション", mode)[0].normalized_form()
'シミュレーション'
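For anyone who wants to check whether their account can create symlinks before trying this, here is a minimal sketch. It assumes only the standard library; `can_create_symlink` is a hypothetical helper name and is not part of SudachiPy. It creates a throwaway symlink in a temporary directory and reports whether os.symlink() succeeded, which on Windows should reflect the Developer Mode behavior described above.

```python
import os
import tempfile


def can_create_symlink() -> bool:
    """Return True if this process can create a symlink in a temp directory.

    Hypothetical helper for illustration; not part of SudachiPy.
    """
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target.txt")
        link = os.path.join(tmp, "link.txt")
        # Create a real file so the link has something to point at.
        with open(target, "w", encoding="utf-8") as f:
            f.write("probe")
        try:
            os.symlink(target, link)
        except (OSError, NotImplementedError):
            # On Windows, OSError is raised when the account lacks the
            # symlink privilege and Developer Mode is not enabled.
            return False
        return os.path.islink(link)


if __name__ == "__main__":
    print("symlink creation allowed:", can_create_symlink())
```

If this prints False on Windows 10 with Python 3.8 or later, enabling Developer Mode or running from an elevated prompt are the two options the Python documentation mentions.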
@chezou Good workaround! We'll resolve this problem for Windows and Python 3.5, but anyone who wants to use SudachiPy right now can follow this approach. Thanks, @chezou.
0.6.0 does not use symlinks anymore
SudachiPy doesn't work on Windows because Windows requires administrator privileges to create a symlink. It'd be nice if we could avoid using a symlink for the dictionary setting.