hyunwoongko / pecab

Pecab: Pure python Korean morpheme analyzer based on Mecab
Apache License 2.0

Disk write required? #5

Closed cceyda closed 1 year ago

cceyda commented 1 year ago

I'm trying to use pecab in an AWS Lambda function. The file system the package is deployed to on Lambda is read-only, though, and because of the mode="r+" in the code below I get an error. Would it break anything if it were just mode="r"? (It seems to work fine for me, but I haven't tested it deeply.)

https://github.com/hyunwoongko/pecab/blob/2134e5d19bbf87e5bd791bfdbd389748cad64d4d/pecab/_tokenizer.py#L31-L36

I'm just using it as is, not using a user_dict.

from pecab import PeCab
pecab = PeCab()

This was the error... and for now switching to mode="r" seems to have worked

OpenBLAS WARNING - could not determine the L2 cache size on this system, assuming 256k
[ERROR] OSError: [Errno 30] Read-only file system: '/var/task/pecab/_resources/matrix.npy'
Traceback (most recent call last):
  File "/var/lang/lib/python3.8/imp.py", line 234, in load_module
    return load_source(name, filename, file)
  File "/var/lang/lib/python3.8/imp.py", line 171, in load_source
    module = _load(spec)
  File "<frozen importlib._bootstrap>", line 702, in _load
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 843, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/var/task/app.py", line 4, in <module>
    from utils import split_words,post_filter
  File "/var/task/utils.py", line 3, in <module>
    pecab = PeCab()
  File "/var/task/pecab/_pecab.py", line 13, in __init__
    self.tokenizer = Tokenizer(user_dict, split_compound)
  File "/var/task/pecab/_tokenizer.py", line 31, in __init__
    self.conn_costs = np.memmap(
  File "/var/task/numpy/core/memmap.py", line 228, in __new__
    f_ctx = open(os_fspath(filename), ('r' if mode == 'c' else mode)+'b')

Thanks for this library 💞

hyunwoongko commented 1 year ago

Oh, I think mode="r" is enough for now. I will modify that part! Thanks for letting me know :)
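
For readers following along, here is a minimal sketch of why mode="r" is sufficient for lookups. It does not use pecab's actual resource file: the path `costs.bin`, the dtype, and the shape are all made up for illustration. It only shows that a read-only `np.memmap` still supports indexing and simply rejects writes.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for a cost matrix file; NOT pecab's real matrix.npy.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "costs.bin")
np.arange(6, dtype=np.int16).tofile(path)

# mode="r" opens the underlying file read-only, so the process never needs
# write access to it (mode="r+" would open it for reading and writing).
costs = np.memmap(path, dtype=np.int16, mode="r", shape=(2, 3))

# Lookups behave exactly like a normal array.
value = int(costs[1, 2])

# The mapping is flagged as non-writable; assignments raise ValueError.
write_rejected = False
try:
    costs[0, 0] = 1
except ValueError:
    write_rejected = True

print(value, costs.flags.writeable, write_rejected)
```

If the code only ever reads from the mapped array, as with a cost lookup table, nothing is lost by dropping write access.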

hyunwoongko commented 1 year ago

@cceyda It's fixed in v1.0.8: https://github.com/hyunwoongko/pecab/releases/tag/v1.0.8 Could you upgrade the library using pip install pecab --upgrade?

cceyda commented 1 year ago

that was fast~ thank you!