dmmiller612 / lecture-summarizer

Lecture summarization with BERT
https://arxiv.org/abs/1906.04165

unicode error #1

Open articstranger opened 5 years ago

articstranger commented 5 years ago

Hi, I've been trying out the program, but I run into a Unicode error with certain files. I tried following the advice in the error message and set the encoding in several places, but it doesn't seem to work. Do you know where I should change it?

The error message is as follows:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/bin/lecture-summarizer", line 10, in <module>
    sys.exit(run())
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/lecture_summarizer.py", line 173, in run
    factoryargs.action()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/lecture_summarizer.py", line 52, in __call__
    self.run()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/lecture_summarizer.py", line 79, in run
    to_upload = self.__get_lecture_content()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/lecture_summarizer.py", line 65, in __get_lecture_content
    req = requests.post(url, all_data)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/requests/api.py", line 116, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/urllib3/connectionpool.py", line 603, in urlopen
    chunked=chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/urllib3/connectionpool.py", line 355, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1244, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1289, in _send_request
    body = _encode(body, 'body')
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 170, in _encode
    (name.title(), data[err.start:err.end], name)) from None
UnicodeEncodeError: 'latin-1' codec can't encode character '\u2013' in position 476: Body ('–') is not valid Latin-1. Use body.encode('utf-8') if you want to send it encoded in UTF-8.
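
For anyone hitting the same traceback: the last frame fails because http.client encodes a str body as Latin-1, which has no en dash (U+2013). The sketch below reproduces that without any network call and shows the workaround the error message suggests. It assumes the body is a plain Python str; `encode_body`, `url`, and the Content-Type header are hypothetical and not taken from lecture-summarizer.

```python
def encode_body(text: str) -> bytes:
    """Encode a request body as UTF-8 so it is sent as bytes, not str."""
    return text.encode("utf-8")


# Sample text with an en dash (U+2013), the character named in the error.
body = "pages 10\u201312"

# A str body gets encoded as Latin-1 by http.client, which cannot
# represent U+2013; this mirrors the failing step in the traceback.
try:
    body.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# Encoding up front yields bytes, which are sent as-is.
payload = encode_body(body)

# With requests, the bytes would be passed in place of the str, e.g.:
#   requests.post(url, data=payload,
#                 headers={"Content-Type": "text/plain; charset=utf-8"})
# (url and the header value are assumptions for illustration only)
```

Whether the receiving service expects UTF-8 is a separate question; the charset header above is only one plausible way to signal it.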

dmmiller612 commented 5 years ago

I could look at enforcing Unicode, but if you have a non-Unicode item, it may cause issues.
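
One possible way to handle that concern is to decode the input defensively before sending it. This is a rough sketch under the assumption that the lecture content is available as raw bytes and that "non-Unicode item" means bytes that are not valid UTF-8; `to_utf8_body` is a hypothetical name, not something in the project.

```python
def to_utf8_body(raw: bytes) -> bytes:
    """Return a UTF-8 request body from raw file bytes.

    Bytes that are not valid UTF-8 are replaced with U+FFFD instead of
    raising, so the result can always be encoded as UTF-8 again.
    """
    text = raw.decode("utf-8", errors="replace")
    return text.encode("utf-8")


# Valid UTF-8 input (including an en dash) passes through unchanged.
clean = to_utf8_body("a\u2013b".encode("utf-8"))

# A Latin-1 encoded file (0xE9 for "e" with acute accent) is not valid
# UTF-8; the offending byte becomes the replacement character.
lossy = to_utf8_body(b"caf\xe9")
```

The trade-off is that replacement silently loses the original character, so logging or rejecting such files may be preferable in some setups.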