olety / epwing2kindle

Converting EPWING dictionaries into something you can actually use (mobi)
GNU General Public License v3.0

Trouble installing requirements #1

Closed eao closed 6 years ago

eao commented 6 years ago

Hi! I am Xavier22 from the koohii thread, thanks so much for making this. I am not sure if it's okay to post technical support questions in the issues section, but I was trying to install the requirements and was not able to. I just got this screen here:

[image: screenshot of the failed requirements installation]

Do you have any idea what I might be doing wrong?

olety commented 6 years ago

Hi! 😄 Thanks for posting the original question in that thread, it gave me some pointers on where to start with this.

Hmm, maybe Python is having trouble with the fact that your Windows is in Japanese? I'm using a UNIX-based OS so I'm not really sure about this issue.

I'll try to install a windows VM and try to replicate this bug myself, but in the meanwhile, could you try running the following commands?

python3 -m pip install --upgrade pandas
python3 -m pip install --upgrade tqdm

or just

pip3 install --upgrade pandas
pip3 install --upgrade tqdm

Both forms should install the same packages, and since I don't think the Python 3.7 update should've broken anything, they should work. If they also fail, could you please post the output/screenshots here? Thanks!

eao commented 6 years ago

Thank you for the reply! Turns out it wasn't an issue with Japanese Windows (although that came into play later) but just an issue with dependencies. After fiddling around with installing Visual Studio and pandas for a while I got it working.

But then I ran into an issue with the Tab to OPF (tab2opf) step. It couldn't encode the character '\ufffd' but it mentioned it was trying to use cp932, which is Shift-JIS encoding, so I figured it was a problem with using Windows Japanese locale. So I switched back to English (US). But then, I got the same problem with a slight difference:

Reading keys: 534703keys [00:11, 47845.41keys/s]
Writing html: 0files [00:00, ?files/s]
Traceback (most recent call last):
  File "tab2opf.py", line 378, in <module>
    ndicts = writekeys(defns, name)
  File "tab2opf.py", line 298, in writekeys
    writekey(to, key, defns[key])
  File "tab2opf.py", line 271, in writekey
    to.write(thing[1])
  File "C:\Users\Xavier\AppData\Local\Programs\Python\Python37\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\ufffd' in position 56: character maps to <undefined>

Same issue, same character, different position if that matters, but now it is using cp1252 which I believe is the default 8-bit character encoding for English Windows. So it seems the program uses the system's default encoding for creating the relevant files, but neither cp932 nor cp1252 work correctly. Or, I still don't have the dependencies installed correctly and it's a problem with that, I'm not sure.
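
A minimal sketch of the failure described above, and of one way around it. It assumes the output file is opened with Python's built-in `open()`; the helper `write_utf8` and the file name are hypothetical and are not taken from tab2opf.py. Neither cp1252 nor cp932 has a mapping for U+FFFD, so encoding fails under both locales, while an explicit `encoding="utf-8"` does not depend on the locale at all.

```python
import os
import tempfile

# A definition containing U+FFFD, the character named in the traceback.
text = "definition with a replacement character: \ufffd"

# Try the two locale encodings mentioned in the thread.
failed_codecs = []
for codec in ("cp1252", "cp932"):
    try:
        text.encode(codec)
    except UnicodeEncodeError as err:
        failed_codecs.append(codec)
        print(f"{codec}: {err.reason}")


def write_utf8(path, content):
    """Hypothetical helper: write text with an explicit encoding."""
    with open(path, "w", encoding="utf-8") as to:
        to.write(content)


with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, "mydict0.html")
    write_utf8(out_path, text)
    # Read the file back to confirm the character survived.
    with open(out_path, encoding="utf-8") as check:
        round_trip = check.read()

print(round_trip == text)
```

Without the `encoding` argument, `open()` in text mode falls back to the locale's preferred encoding, which is why switching the Windows locale changed the codec named in the error but not the error itself.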

(Also maybe worth mentioning: in both Japanese and English (US) locale, the JSON to Tab (yomi2tab) step produced a .tab file without throwing any errors, but they were clearly different files, with the cp932 one being 176 MB and the cp1252 one being 227 MB. So that step also is affected by the default system encoding.)
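
One possible reason for the size difference, sketched under the assumption that the two .tab files hold the same text in different encodings (the thread does not confirm this): the same Japanese string takes a different number of bytes depending on the codec. The sample word is only an illustration.

```python
# The same two-character word encoded with two different codecs.
sample = "辞書"  # "dictionary"

sizes = {codec: len(sample.encode(codec)) for codec in ("cp932", "utf-8")}

for codec, size in sizes.items():
    print(f"{codec}: {size} bytes")
```

Kanji and kana take two bytes each in cp932 and three bytes each in UTF-8, so a mostly Japanese file would be noticeably larger in UTF-8.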

Anyway, よりしく I suppose. 😛

olety commented 6 years ago

I'll try to make it so everything uses utf-8 and let's see how it'll run then 😄 Btw, can you specify what you did to install pandas properly so that a future lurker that might have the same problem will be able to solve it as well?

よろしくね ^_^

olety commented 6 years ago

Please try to run it with the updated version and paste the output here if it fails 😊

eao commented 6 years ago

It works! Got it on my Kindle working great (and on Android, thanks to the instructions here). Thanks so much.

A few small things I should mention:

  1. The log reports that the JSON to Tab (yomi2tab) step still writes in the system default encoding (cp1252 on English (US) Windows); see below. Since everything encodes properly in the end this isn't an issue, but I am not sure whether it is actually encoding in cp1252 or not.
INFO | 06:08:05 | line 234 | <module> | Saving the results to <_io.TextIOWrapper name='mydict.tab' mode='w' encoding='cp1252'>...
INFO | 06:08:08 | line 237 | <module> | Successfully saved the results to <_io.TextIOWrapper name='mydict.tab' mode='w' encoding='cp1252'>, quitting the program.
  2. I was not able to run a few commands like "python3 yomi2tab.py -o mydict.tab yomi_output/" as-is; at least on my system, "python3" was not recognized as an internal or external command. The following worked though:
py -3 yomi2tab.py -o mydict.tab "C:\Users\Xavier\Downloads\epwing2kindle-master\yomi-output"
  3. The Tab to OPF (tab2opf) section generates a file named "mydict.opf" whereas the next section references "dict.opf". Obviously it is trivial to just add "my" in but I thought I would mention it.
  4. Despite producing a working file, the kindlegen section spits out multiple warnings:

Warning(htmlprocessor):W27002: HTML attribute specified in content is not supported by Kindle readers. Removing the attribute: 'onclick' in file: C:\Users\Xavier\Downloads\epwing2kindle-master\opf\mydict0.html
Warning(parser8):W26001: Index not supported for enhanced mobi.
Info(parser8):I12001: Enhanced mobi generation suppressed.
Info(prcgen):I1037: Mobi file built with WARNINGS!

I am not sure if this is intended and can just be ignored or if it's something with my system.

Anyway, that's about it. Thanks again.

eao commented 6 years ago

As for installing the requirements, I installed Python 3.7 (x64); Build Tools for Visual Studio 2017, with the Windows 10 and Windows 8.1 SDKs checked during installation since some page said to; and Anaconda 5.2. Anaconda may have caused more problems than it was worth, but I was having a lot of trouble using pip and wheel and stuff to install pandas, and I managed to do it with the Anaconda Prompt. I don't remember the exact steps I took, however.

olety commented 6 years ago

Yeah, I think that I also had those warnings from kindlegen but ignored them since the dictionary worked just fine. I'll update the readme with the Windows-specific commands you've provided and also change the encoding for saving the file in yomi2tab (pandas is supposed to use utf-8 by default but I guess when it comes to cross-platform it's best to specify all of them explicitly 😅)
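
A rough sketch of what "specifying the encoding explicitly" can look like when saving a tab-separated file. It uses only the standard library instead of pandas so that it runs anywhere; `save_tab`, the sample rows, and the file name are hypothetical and do not come from yomi2tab.py.

```python
import os
import tempfile


def save_tab(path, rows):
    """Hypothetical helper: write key/definition pairs as UTF-8 tab-separated text."""
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        for key, definition in rows:
            handle.write(f"{key}\t{definition}\n")
        # The handle reports the encoding it was opened with.
        return handle.encoding


rows = [("辞書", "dictionary"), ("言葉", "word")]

with tempfile.TemporaryDirectory() as tmp:
    tab_path = os.path.join(tmp, "mydict.tab")
    used_encoding = save_tab(tab_path, rows)
    # Read the raw bytes and decode them to confirm they are valid UTF-8.
    with open(tab_path, "rb") as raw:
        decoded = raw.read().decode("utf-8")

print(used_encoding)
```

Passing `encoding="utf-8"` at the point where the file is opened keeps the result the same on Japanese Windows, English Windows, and UNIX-like systems, and a handle opened this way reports utf-8 when it is logged.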

Anyway, thanks for such a detailed answer, I'll reference this issue in the readme as well! And congrats on getting it working, have fun looking up PA on your Kindle 👍

olety commented 6 years ago

Actually, it seems like the output file encoding was indeed utf-8, and it just displays as the system's default encoding in the log :) I'll fix that in a new update.