tmilovanov / wisecreator

Utility for adding Word Wise information to non-Amazon books

UnicodeDecodeError #18

Closed (kintul closed this issue 3 years ago)

kintul commented 5 years ago

Hi,

Please see the error below, which occurs when running the Python file:

F:\enable_wordwise>python main.py Train_To_Pakistan_-_Khushwant_Singh.mobi
[.] Checking dependenices
[.] Converting mobi 2 mobi to generate ASIN
[.] Getting ASIN
[.] Getting rawml content of the book
[.] Collecting words
[.] Count of words: 43758
Traceback (most recent call last):
  File "main.py", line 396, in <module>
    main()
  File "main.py", line 359, in main
    f = f.read().decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 11506: invalid continuation byte

It seems to be a problem with senses.csv. I would appreciate a quick look.

zpcc commented 5 years ago

I think the script will work if your senses.csv is encoded in UTF-8.
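For anyone who wants to check this locally, here is a minimal sketch of one way to test whether senses.csv is valid UTF-8 and rewrite it if not. The function name `reencode_to_utf8` is hypothetical and not part of wisecreator. The fallback encoding `cp1252` is only a guess: byte 0xe9 is "é" in that Windows code page, but nothing in this thread confirms how the file was actually saved, so keep a backup before running it.

```python
from pathlib import Path


def reencode_to_utf8(path, fallback_encoding="cp1252"):
    """Rewrite the file at `path` as UTF-8 if it is not already valid UTF-8.

    Returns True if the file was rewritten, False if it was already UTF-8.
    `fallback_encoding` is an assumption about the original encoding.
    """
    raw = Path(path).read_bytes()
    try:
        # If this succeeds, the file is already valid UTF-8; leave it alone.
        raw.decode("utf-8")
        return False
    except UnicodeDecodeError:
        pass
    # Decode with the assumed legacy encoding, then write back as UTF-8.
    # Bytes are written directly so line endings are left unchanged.
    text = raw.decode(fallback_encoding)
    Path(path).write_bytes(text.encode("utf-8"))
    return True
```

Calling `reencode_to_utf8("senses.csv")` from the folder that contains the file should be enough. If the guessed encoding is wrong, accented characters will come out garbled, so spot-check a few lines afterwards.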

You can also try replacing your senses.csv with this one (sha1: aaa0f29ab232bd7ae9411c3897826eaac8a66e5d).
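To confirm that a downloaded replacement matches the checksum quoted above, something like the following sketch could be used. It relies only on Python's standard `hashlib`; the helper name `sha1_of_file` and the constant `EXPECTED_SHA1` are illustrative names, not part of wisecreator.

```python
import hashlib

# Checksum quoted in the comment above.
EXPECTED_SHA1 = "aaa0f29ab232bd7ae9411c3897826eaac8a66e5d"


def sha1_of_file(path, chunk_size=65536):
    """Return the hexadecimal SHA-1 digest of the file at `path`."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If `sha1_of_file("senses.csv") == EXPECTED_SHA1` evaluates to true, the file on disk is byte-for-byte the one referred to here.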

tmilovanov commented 3 years ago

Fixed in v1.1