Closed AlonAizescu closed 5 years ago
My error message is: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 615: invalid start byte
What helped me was specifying the encoding in the open() call (line 27 of the file):
with open(filename, encoding="utf-8") as f:
instead of:
with open(filename) as f:
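For anyone who wants to see the effect in isolation, here is a minimal sketch. It assumes the vocabulary file is really UTF-8; the file name, the tokens, and the use of cp1255 as the "wrong" code page are made up for illustration and are not taken from bilm.

```python
import os
import tempfile

# Hypothetical vocabulary containing non-ASCII tokens.
tokens = ["<S>", "</S>", "café", "שלום"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "vocab.txt")

    # Write the file as UTF-8, the encoding we assume the real file uses.
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(tokens) + "\n")

    # Explicit encoding: the result does not depend on the platform locale.
    with open(path, encoding="utf-8") as f:
        loaded = [line.strip() for line in f]

    # Stand-in for a non-UTF-8 platform default: decoding the same bytes
    # with a legacy code page either raises or yields different text.
    try:
        with open(path, encoding="cp1255") as f:
            legacy = [line.strip() for line in f]
    except UnicodeDecodeError:
        legacy = None

print(loaded == tokens)
print(legacy == tokens)
```

Passing encoding= explicitly keeps the read independent of whatever default the machine happens to have, which is why the one-line change above can be enough.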
Is this solved? When I run the test script, I also get the Unicode decode error.
export LC_ALL=en_US.UTF-8
This solved the problem in my case.
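The export likely helps because open() without an encoding= argument falls back to the locale's preferred encoding, which LC_ALL influences on POSIX systems. On Windows the default generally comes from the system code page instead, so the export would not be expected to apply there. A small sketch for checking what a given machine will use:

```python
import codecs
import locale
import sys

# Encoding that open() falls back to when no encoding= is passed.
default_encoding = locale.getpreferredencoding(False)

# Normalised codec name as Python's codec registry knows it.
codec_name = codecs.lookup(default_encoding).name

print("open() default encoding:", default_encoding, "->", codec_name)
print("stdout encoding:", sys.stdout.encoding)
```

If this prints something other than a UTF-8 codec, a UTF-8 file with non-ASCII content can fail to decode unless the encoding is given explicitly.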
The country / language code would have to be different for each language, wouldn’t it?
Hi, I tried to run the "1 Billion Word Benchmark" example and I got the following error message:
C:\Users\Alon\Anaconda3\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Traceback (most recent call last):
  File "bilm-tf-master/bin/train_elmo.py", line 73, in <module>
    main(args)
  File "bilm-tf-master/bin/train_elmo.py", line 12, in main
    vocab = load_vocab(args.vocab_file, 50)
  File "C:\Users\Alon\Anaconda3\lib\site-packages\bilm-0.1.post5-py3.6.egg\bilm\training.py", line 1060, in load_vocab
    validate_file=True)
  File "C:\Users\Alon\Anaconda3\lib\site-packages\bilm-0.1.post5-py3.6.egg\bilm\data.py", line 117, in __init__
    super(UnicodeCharsVocabulary, self).__init__(filename, **kwargs)
  File "C:\Users\Alon\Anaconda3\lib\site-packages\bilm-0.1.post5-py3.6.egg\bilm\data.py", line 29, in __init__
    for line in f:
  File "C:\Users\Alon\Anaconda3\lib\encodings\cp1255.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0xff in position 1: character maps to <undefined>
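The traceback shows the file being decoded with cp1255, the machine's default code page, rather than UTF-8. To see where a file stops decoding under a given encoding, a small diagnostic like the one below may help. This is a sketch only: first_decode_error is a hypothetical helper, not part of bilm, and the byte strings are made-up samples.

```python
def first_decode_error(data: bytes, encoding: str):
    """Return (position, byte value) of the first undecodable byte, or None."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc.start, data[exc.start]
    return None


# Made-up sample: 0xff has no mapping in cp1255, as in the error above.
print(first_decode_error(b"a\xff", "cp1255"))

# Made-up sample: 0x8b cannot start a UTF-8 sequence.
print(first_decode_error(b"ab\x8b", "utf-8"))

# Valid UTF-8 decodes cleanly, so there is nothing to report.
print(first_decode_error("café".encode("utf-8"), "utf-8"))
```

To use it on a real file, read the file in binary mode (open(path, "rb").read()) and try the encodings you suspect.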