v-nhandt21 / Vinorm

Python - NSW package for Vietnamese: Normalization system to convert numbers, abbreviations, and words that cannot be pronounced into syllables

[BUG] Error when using multiprocessing #9

Open phamkhactu opened 1 year ago

phamkhactu commented 1 year ago

Thanks for your great work,

When I use TTSnorm from multiple processes with a multiprocessing pool, I get this error:

[E] ucnv_toUChars: ICU Error "U_STRING_NOT_TERMINATED_WARNING"
[E] ucnv_toUChars: ICU Error "U_STRING_NOT_TERMINATED_WARNING"
568
249
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/tupk/anaconda3/envs/nlp/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/tupk/anaconda3/envs/nlp/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "data/mul_process_split_data.py", line 75, in split_data
    text_norm = TTSnorm(text)
  File "/home/tupk/anaconda3/envs/nlp/lib/python3.8/site-packages/vinorm/__init__.py", line 29, in TTSnorm
    text=fr.read()
  File "/home/tupk/anaconda3/envs/nlp/lib/python3.8/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 69: invalid start byte
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "data/mul_process_split_data.py", line 92, in <module>
    print(p.map(split_data, files))
  File "/home/tupk/anaconda3/envs/nlp/lib/python3.8/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/tupk/anaconda3/envs/nlp/lib/python3.8/multiprocessing/pool.py", line 771, in get
    raise self._value
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 69: invalid start byte

I hope you can look into this so that your package can become a superstar.

Best regards, Tu

thucth-qt commented 1 month ago

@phamkhactu did you manage to fix this issue? I have the same problem, so I would appreciate it if you could share a solution.

v-nhandt21 commented 1 month ago

Thank you for visiting. There are two solutions for this:

Hotfix:

Feel free to comment for more discussion :))
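
For readers hitting the same traceback, here is a minimal sketch of one possible workaround. It is not the maintainer's fix. It assumes the failure comes from worker processes sharing intermediate files, which the traceback suggests (TTSnorm fails inside `fr.read()`) but does not prove. The idea is to hand every pool worker the same lock and allow only one normalization call at a time. `normalize` is a hypothetical stand-in for `vinorm.TTSnorm`, and `init_worker` is a helper name invented for this example.

```python
import multiprocessing as mp

_lock = None  # set inside each worker process by init_worker


def init_worker(lock):
    """Store the shared lock in a module-level global inside each worker."""
    global _lock
    _lock = lock


def normalize(text):
    """Hypothetical stand-in for vinorm.TTSnorm; swap in the real call."""
    return text.lower()


def split_data(text):
    # Only one process at a time runs the normalizer, so any files it
    # writes and reads internally cannot be overwritten by another worker
    # (assumption: shared files are the cause of the decode error).
    with _lock:
        return normalize(text)


if __name__ == "__main__":
    lock = mp.Lock()
    # The lock is passed through initargs so that every worker inherits it.
    with mp.Pool(processes=2, initializer=init_worker, initargs=(lock,)) as pool:
        print(pool.map(split_data, ["Xin Chào", "Việt Nam"]))
```

The trade-off is that the normalization step no longer runs in parallel; only the work outside the `with _lock:` block still benefits from the pool.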