Open dougc333 opened 5 years ago
PyTorch version: '1.0.0.dev20181114'
The modification above fixed the UnicodeDecodeError.
Hi dougc333
I am still getting a UnicodeDecodeError after modifying line 42 in data.py. Can you tell me how to switch to PyTorch version '1.0.0.dev20181114'?
Traceback (most recent call last):
  File ".\working_03_gui_QA.py", line 230, in <module>
    main()
  File ".\working_03_gui_QA.py", line 226, in main
    d=guiclass(root)
  File ".\working_03_gui_QA.py", line 51, in __init__
    self._c=InferSentClass()
  File "C:\Users\zhant\OneDrive\Desktop\2911\System7\Sent_embed.py", line 28, in __init__
    model.build_vocab_k_words(K=100000)
  File "C:\Users\zhant\OneDrive\Desktop\2911\InferSent\models.py", line 146, in build_vocab_k_words
    self.word_vec = self.get_w2v_k(K)
  File "C:\Users\zhant\OneDrive\Desktop\2911\InferSent\models.py", line 124, in get_w2v_k
    for line in f:
  File "C:\Users\zhant\Anaconda3\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 7674: character maps to <undefined>
Thanks in advance
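For context, the error in the traceback can be reproduced without InferSent. This is only a minimal sketch, and it assumes the vector file is UTF-8 and contains a character such as the right double quotation mark (U+201D), whose UTF-8 encoding ends in byte 0x9d. The cp1252 codec seen in the traceback has no mapping for that byte. The helper name try_decode is made up for illustration.

```python
# U+201D (right double quotation mark) encodes to b'\xe2\x80\x9d' in UTF-8.
raw = "\u201d".encode("utf-8")


def try_decode(data, encoding):
    """Return the decoded text, or the error message if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return "failed: " + str(exc)


# cp1252 is the codec shown in the traceback; it cannot map byte 0x9d.
cp1252_result = try_decode(raw, "cp1252")
# The same bytes decode cleanly when read as UTF-8.
utf8_result = try_decode(raw, "utf-8")

print(cp1252_result)
print(utf8_result)
```

If this is what is happening, the fix is to tell open() which encoding the file uses instead of relying on the platform default.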
Use encoding="utf-8" in two files and three places:
- Change this
with open(self.w2v_path) as f:
to
with open(self.w2v_path, encoding="utf-8") as f:
line 110, inside get_w2v
in models.py
- Change this
with open(self.w2v_path) as f:
to
with open(self.w2v_path, encoding="utf-8") as f:
line 123, inside get_w2v_k
in models.py
- Change this
with open(glove_path) as f:
to
with open(glove_path, encoding="utf-8") as f:
line 42, inside get_glove
in data.py
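The pattern behind all three changes is the same: pass encoding="utf-8" to open() so the read does not depend on the platform default. Here is a rough sketch of that pattern on a small stand-in file. The function load_vectors and the sample file are hypothetical and are not the InferSent code; the "word v1 v2 ..." line layout is assumed.

```python
import os
import tempfile


def load_vectors(path, wanted_words):
    """Hypothetical helper: read 'word v1 v2 ...' lines, keeping only wanted words."""
    vectors = {}
    # Explicit encoding, so the result is the same on Windows, Mac and Linux.
    with open(path, encoding="utf-8") as f:
        for line in f:
            word, rest = line.rstrip("\n").split(" ", 1)
            if word in wanted_words:
                vectors[word] = [float(x) for x in rest.split()]
    return vectors


# Tiny stand-in for a word-vector file, including one non-ASCII token.
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", suffix=".txt", delete=False
) as tmp:
    tmp.write("hello 0.1 0.2\n\u201d 0.3 0.4\n")
    sample_path = tmp.name

vectors = load_vectors(sample_path, {"hello", "\u201d"})
os.remove(sample_path)
print(vectors)
```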
For me (on Mac), the file was /Users/user1/Library/Python/2.7/lib/python/site-packages/backports/configparser/__init__.py.
I was using the pip version of apache-airflow on Python 2.7.
In that file, update the read function to def read(self, filenames, encoding="utf-8"): initially it will be something like def read(self, filenames, encoding=None):
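As a possible alternative to editing the installed package, the read method already accepts an encoding argument, so the encoding can be passed at the call site when you control that call. The sketch below uses the standard-library configparser on Python 3 with a made-up config file. Whether apache-airflow offers a way to pass the encoding through is not something this thread establishes.

```python
import configparser
import os
import tempfile

# Made-up config file with a non-ASCII value, written as UTF-8.
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", suffix=".cfg", delete=False
) as tmp:
    tmp.write("[core]\nnote = caf\u00e9\n")
    cfg_path = tmp.name

parser = configparser.ConfigParser()
# read() returns the list of files it managed to parse.
read_ok = parser.read(cfg_path, encoding="utf-8")
note = parser["core"]["note"]
os.remove(cfg_path)

print(read_ok)
print(note)
```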
Use encoding="utf-8" in two files and three places:
- Change this
with open(self.w2v_path) as f:
to
with open(self.w2v_path, encoding="utf-8") as f:
line 110, inside get_w2v
in models.py
- Change this
with open(self.w2v_path) as f:
to
with open(self.w2v_path, encoding="utf-8") as f:
line 123, inside get_w2v_k
in models.py
- Change this
with open(glove_path) as f:
to
with open(glove_path, encoding="utf-8") as f:
line 42, inside get_glove
in data.py
Where can I find these files?
Modify line 42 of data.py from with open(glove_path) as f: to with open(glove_path, encoding="utf-8") as f: