jdlorimer / incremental-reading

Anki add-on providing incremental reading features
https://ankiweb.net/shared/info/935264945
ISC License

remove breaking decode call #90

Closed (jyore closed this 5 years ago)

jyore commented 5 years ago

I started experimenting with this add-on, and it seems that on a Mac, pages that are not already encoded in UTF-8 fail to import and throw an exception.

I tested the add-on with this site, which is Shift_JIS encoded: http://hukumusume.com/douwa/betu/world/07/22.htm

On load, this error is produced.

Error 
An error occurred. Please start Anki while holding down the shift key, which will temporarily disable the add-ons you have installed. 
If the issue only occurs when add-ons are enabled, please use the Tools>Add-ons menu item to disable some add-ons and restart Anki, repeating until you discover the add-on that is causing the problem. 
When you've discovered the add-on that is causing the problem, please report the issue on the add-ons section of our support site. 
Debug info:
Anki 2.1.7 (a6c34fd7) Python 3.6.7 Qt 5.12.0 PyQt 5.11.3
Platform: Mac 10.14
Flags: frz=True ao=True sv=1

Caught exception:
  File "/Users/jyore/Library/Application Support/Anki2/addons21/935264945/importer.py", line 105, in importWebpage
    webpage = self._fetchWebpage(url)
  File "/Users/jyore/Library/Application Support/Anki2/addons21/935264945/importer.py", line 52, in _fetchWebpage
    html = urlopen(url, context=context).read().decode('utf-8')
<class 'UnicodeDecodeError'>: 'utf-8' codec can't decode byte 0x83 in position 23: invalid start byte
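
The failure can be reproduced without any network access. The snippet below is a minimal sketch: the `raw` bytes are a made-up stand-in for the response body, not the actual page content. Shift_JIS katakana start with the lead byte 0x83, which is not a valid UTF-8 start byte, so a strict UTF-8 decode raises the same kind of error as in the traceback.

```python
# Hypothetical stand-in for the bytes returned by urlopen(url).read();
# the real page is much longer, but any Shift_JIS katakana triggers the issue.
raw = "<title>テスト</title>".encode("shift_jis")

try:
    # Same strict call as in _fetchWebpage: assumes the page is UTF-8.
    raw.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# Reports the offending byte and its position in the input.
print(error)

# Decoding with the codec the page actually uses succeeds.
decoded = raw.decode("shift_jis")
print(decoded)
```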

After inspecting the code and testing, I think the strict decode call should simply be removed. BeautifulSoup already converts the content to Unicode during parsing (if possible), so strictly decoding the document before passing it in is unnecessary. This is basically what the non-Mac branch (the else statement) already does.
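
As a rough illustration of that point (not the add-on's actual code), the sketch below hands undecoded bytes straight to BeautifulSoup. The sample HTML and the choice of the `html.parser` backend are assumptions made for this example; the add-on may use a different parser. It requires the `bs4` package.

```python
from bs4 import BeautifulSoup

# Hypothetical Shift_JIS page that declares its charset in a <meta> tag,
# standing in for the bytes returned by urlopen(url).read().
raw = (
    "<html><head>"
    '<meta http-equiv="Content-Type" content="text/html; charset=Shift_JIS">'
    "<title>テスト</title></head>"
    "<body><p>こんにちは</p></body></html>"
).encode("shift_jis")

# Pass the raw bytes without calling .decode(); BeautifulSoup works out
# the encoding itself and exposes the parsed text as ordinary str values.
soup = BeautifulSoup(raw, "html.parser")

title = soup.title.string
paragraph = soup.p.get_text()

# original_encoding records which encoding BeautifulSoup settled on.
print(title, paragraph, soup.original_encoding)
```

Detection is best effort: a page with no charset declaration, or a wrong one, may still come out garbled, which matches the "if possible" caveat above.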

After removing the decode call, the import works fine.

jdlorimer commented 5 years ago

Thanks so much. I don't test on Mac, so I never would have picked this up. I also noticed an issue with how that website is being imported on other platforms, which I'll fix shortly.

jyore commented 5 years ago

no problem. glad i could help :) good work on the add-on so far. i am looking forward to using it more in the future.