thp / urlwatch

Watch (parts of) webpages and get notified when something changes via e-mail, on your phone or via other means. Highly configurable.
https://thp.io/2008/urlwatch/

Unicode Encode Error #224

Closed. kangaroo57 closed this issue 6 years ago

kangaroo57 commented 6 years ago

I'm getting an encoding error when checking this address: http://odessa.web2ua.com/ I'm currently not using any filters. The site is in Russian, but as far as I can see everything is being read as UTF-8, so I'm at a loss. Does anyone have any ideas for me?

Is this similar to Issue 51?

Full error log:
Traceback (most recent call last):
  File "/usr/local/bin/urlwatch", line 111, in <module>
    urlwatch_command.run()
  File "/usr/local/lib/python3.5/dist-packages/urlwatch/command.py", line 212, in run
    self.urlwatcher.close()
  File "/usr/local/lib/python3.5/dist-packages/urlwatch/main.py", line 96, in close
    self.report.finish()
  File "/usr/local/lib/python3.5/dist-packages/urlwatch/handler.py", line 128, in finish
    ReporterBase.submit_all(self, self.job_states, duration)
  File "/usr/local/lib/python3.5/dist-packages/urlwatch/reporters.py", line 92, in submit_all
    subclass(report, cfg, job_states, duration).submit()
  File "/usr/local/lib/python3.5/dist-packages/urlwatch/reporters.py", line 315, in submit
    print(self._red(line))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 41-47: ordinal not in range(128)
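For anyone reading along, the last line of the traceback can be reproduced outside urlwatch. This is only a minimal sketch, not urlwatch code: `can_encode` is a hypothetical helper and the sample string is an assumed stand-in for the Cyrillic text on the page. It shows that the same text fails with the ascii codec but encodes fine as UTF-8, which suggests the problem is the output encoding rather than how the page is read.

```python
def can_encode(text, encoding):
    """Return True if `text` can be encoded with the given codec (hypothetical helper)."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        # Same exception class that print() raised in the traceback
        return False


# Assumed sample: the word "Odessa" in Cyrillic, written with escapes
sample = "\u041e\u0434\u0435\u0441\u0441\u0430"

print(can_encode(sample, "ascii"))   # ascii has no Cyrillic characters
print(can_encode(sample, "utf-8"))   # UTF-8 can represent them
```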

kangaroo57 commented 6 years ago

After doing some more research today, I've found a way to solve this issue. One response to Issue 51 suggested running urlwatch with this command: env LC_ALL=en_US.UTF-8 urlwatch

This was only a stopgap: all non-ASCII characters were displayed as question marks. To truly fix the problem I had to change my system locale settings (Ubuntu 16.04). I found a fix after some searching, and this answer describes what I did to change my locales. I think Windows systems don't have this issue, but I'm not sure.
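In case it helps someone diagnose the same thing, here is a small sketch for checking which encoding Python picks up from the locale. It is not part of urlwatch; `describe_output_encoding` and `ascii_with_replacement` are hypothetical names, and the second function only illustrates one way question marks can appear (encoding with `errors="replace"`), which I am assuming, not confirming, is related to what I saw.

```python
import locale
import sys


def describe_output_encoding():
    """Report the encodings Python derives from the current environment (hypothetical helper)."""
    return {
        # Encoding used by print(); may be None if stdout has been replaced
        "stdout": getattr(sys.stdout, "encoding", None),
        # Encoding implied by the locale settings (LC_ALL, LANG, ...)
        "preferred": locale.getpreferredencoding(False),
    }


def ascii_with_replacement(text):
    """Encode to ASCII, substituting '?' for every character ASCII cannot represent."""
    return text.encode("ascii", errors="replace").decode("ascii")


print(describe_output_encoding())
# Assumed sample: "Odessa" in Cyrillic, six characters
print(ascii_with_replacement("\u041e\u0434\u0435\u0441\u0441\u0430"))
```

If "preferred" reports an ASCII-based encoding such as ANSI_X3.4-1968, the locale is probably not set to a UTF-8 one, and running this under env LC_ALL=en_US.UTF-8 should show whether that locale is actually available on the system.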