WorksApplications / SudachiPy

Python version of Sudachi, a Japanese tokenizer.
Apache License 2.0

Update SudachiDict_core and dartsclone version. #110

Closed by kanjirz50 4 years ago

kanjirz50 commented 4 years ago

It seems that dartsclone 0.6.0 doesn't support SudachiDict_core-20191030 (the latest dictionary).

Here is a traceback.

Traceback (most recent call last):
  File "/home/katakahashi/.pyenv/versions/sudachipy_env/bin/sudachipy", line 11, in <module>
    load_entry_point('SudachiPy', 'console_scripts', 'sudachipy')()
  File "/home/katakahashi/own_oss/SudachiPy/sudachipy/command_line.py", line 235, in main
    args.handler(args, args.print_usage)
  File "/home/katakahashi/own_oss/SudachiPy/sudachipy/command_line.py", line 170, in _command_tokenize
    dict_ = dictionary.Dictionary(config_path=args.fpath_setting)
  File "/home/katakahashi/own_oss/SudachiPy/sudachipy/dictionary.py", line 37, in __init__
    self._read_system_dictionary(config.settings.system_dict_path())
  File "/home/katakahashi/own_oss/SudachiPy/sudachipy/dictionary.py", line 66, in _read_system_dictionary
    dict_ = BinaryDictionary.from_system_dictionary(filename)
  File "/home/katakahashi/own_oss/SudachiPy/sudachipy/dictionarylib/binarydictionary.py", line 50, in from_system_dictionary
    args = cls._read_dictionary(filename)
  File "/home/katakahashi/own_oss/SudachiPy/sudachipy/dictionarylib/binarydictionary.py", line 45, in _read_dictionary
    lexicon = DoubleArrayLexicon(bytes_, offset)
  File "/home/katakahashi/own_oss/SudachiPy/sudachipy/dictionarylib/doublearraylexicon.py", line 42, in __init__
    self.trie.set_array(array, size)
  File "dartsclone/_dartsclone.pyx", line 16, in dartsclone._dartsclone.DoubleArray.set_array
  File "stringsource", line 646, in View.MemoryView.memoryview_cwrapper
  File "stringsource", line 347, in View.MemoryView.memoryview.__cinit__
BufferError: memoryview: underlying buffer is not writable
Exception ignored in: <bound method DoubleArrayLexicon.__del__ of <sudachipy.dictionarylib.doublearraylexicon.DoubleArrayLexicon object at 0x7f50d0eb77f0>>
Traceback (most recent call last):
  File "/home/katakahashi/own_oss/SudachiPy/sudachipy/dictionarylib/doublearraylexicon.py", line 54, in __del__
    del self.word_params
AttributeError: word_params
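
The second traceback (AttributeError: word_params) is a follow-on error: `__init__` raised before `word_params` was assigned, so `__del__` tried to delete an attribute that never existed. A minimal sketch of one way to make such a destructor tolerant of a half-constructed object is shown here. `SafeLexicon` and its `data` argument are hypothetical stand-ins, not the actual `DoubleArrayLexicon` code.

```python
class SafeLexicon:
    """Hypothetical stand-in for a class whose __init__ can fail early."""

    def __init__(self, data):
        # Bind the attribute before any step that might raise.
        self.word_params = None
        if not isinstance(data, (bytes, bytearray)):
            raise TypeError("data must be bytes or bytearray")
        self.word_params = bytes(data)

    def __del__(self):
        # pop() with a default never raises, even if __init__ did not run
        # far enough to create the attribute.
        self.__dict__.pop("word_params", None)


try:
    SafeLexicon(None)          # construction fails...
except TypeError as exc:
    print("init failed:", exc)  # ...but no secondary AttributeError follows
```

With this pattern only the original error is reported, which makes the real cause easier to spot.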

dartsclone 0.7 works with SudachiDict-core 20191030, so I updated the dartsclone and SudachiDict-core versions in requirements.txt and setup.py.
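
For context on the BufferError in the traceback: the message indicates that a writable buffer was requested from an object that only exposes a read-only one. The sketch below illustrates that general distinction in plain Python. It is not the dartsclone or SudachiPy code, and `is_writable` is a helper name made up for this example.

```python
def is_writable(buf):
    """Return True if buf exposes a writable buffer (illustrative helper)."""
    with memoryview(buf) as view:
        return not view.readonly


data = bytes(range(8))               # bytes objects are read-only buffers
print(is_writable(data))             # False

try:
    memoryview(data)[0] = 255        # writing through a read-only view fails
except TypeError as exc:
    print("write rejected:", exc)

copy = bytearray(data)               # a bytearray copy is writable
print(is_writable(copy))             # True
memoryview(copy)[0] = 255
print(copy[0])                       # 255
```

A consumer that insists on a writable buffer will therefore reject `bytes` (and other read-only sources) unless the data is copied first or the consumer accepts read-only buffers.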

izziiyt commented 4 years ago

Thanks for the PR!

I needed to update the SudachiPy side but couldn't, because I was busy with my own work and with further investigation of this problem.

Your change is almost sufficient for my recent update to dartsclone, but I'm concerned that your description differs from the problem we actually resolved.

You wrote the following, but it misses the point:

It seems that dartsclone 0.6.0 doesn't support SudachiDict_core-20191030 (the latest dictionary).

The problem and its resolution are described here: https://sudachi-dev.slack.com/archives/CBCF278AC/p1575185307012400

https://github.com/WorksApplications/SudachiPy/issues/99

Could you wait a moment while I make a PR? The code change may be almost the same, but I want to close out these problems with correct information. I'm very glad that you want to contribute to our community, but this problem is already being resolved and is just about to be summarized :disappointed_relieved:

We should share what we are working on by using issues...

izziiyt commented 4 years ago

@kanjirz50 But I have a question! Were you able to make SudachiPy 0.4.* work with dartsclone 0.6.0 and SudachiDict_core-20190927.tar.gz? If so, that is beyond my understanding...

kanjirz50 commented 4 years ago

@izziiyt Thanks for your reply!

Please don't worry about my PR. Sudachi is a great OSS project, so I want to contribute as much as possible. I appreciate your swift PR #111.

That Slack thread shows the same error message. Sorry that I couldn't find it on Slack.

I also checked SudachiPy 0.4.2 with dartsclone 0.6.0 and SudachiDict_core-20190927.tar.gz, but it didn't work in my environment.

Environment: Python 3.6.9, Ubuntu 16.04. Output of pip freeze:

Cython==0.29.14
dartsclone==0.6
sortedcontainers==2.1.0
SudachiDict-core==20190927
SudachiPy==0.4.2

If I update dartsclone to 0.7.0, it works.
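
Since the thread reports that 0.6 fails and 0.7.0 works, a simple guard can compare a version string against that minimum. The following is a simplified sketch that handles only plain dotted numeric versions. `parse_version` and `meets_minimum` are hypothetical helpers written for this example, and a real project would more likely pin the requirement in setup.py or use a dedicated version-parsing library.

```python
def parse_version(text):
    """Turn a dotted numeric version such as '0.7.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed, minimum):
    """Return True if installed >= minimum, treating '0.7' and '0.7.0' as equal."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad the shorter tuple with zeros so lengths match before comparing.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


print(meets_minimum("0.6", "0.7.0"))    # False: the failing combination
print(meets_minimum("0.7.0", "0.7.0"))  # True: the working combination
```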