Closed by eao 6 years ago
Hi! Thanks for posting the original question in that thread; it gave me some pointers on where to start with this.
Hmm, maybe Python is having trouble with the fact that your Windows is in Japanese? I'm using a UNIX-based OS, so I'm not really sure about this issue.
I'll try to install a Windows VM and replicate this bug myself, but in the meantime, could you try running the following commands?
python3 -m pip install --upgrade pandas
python3 -m pip install --upgrade tqdm
or just
pip3 install --upgrade pandas
pip3 install --upgrade tqdm
They should install the same packages, and since I don't think the Python 3.7 update should have broken anything, they should work. If they also fail, could you please post the output/screenshots here? Thanks!
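If it's unclear whether the packages ended up in the same interpreter that runs the scripts, a quick check along these lines may help. This is only a sketch using the standard library; `missing_modules` is a hypothetical helper name and not part of the project.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the names from `names` that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Show which interpreter is being checked; pip may have installed
    # the packages into a different Python installation.
    print("Interpreter:", sys.executable)
    missing = missing_modules(["pandas", "tqdm"])
    print("Missing:", ", ".join(missing) if missing else "none")
```

Running it with the same command used for the scripts (for example `python3` or `py -3`) shows whether that particular interpreter can see both packages.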
Thank you for the reply! Turns out it wasn't an issue with Japanese Windows (although that came into play later) but just an issue with dependencies. After fiddling around with installing Visual Studio and pandas for a while I got it working.
But then I ran into an issue with the Tab to OPF (tab2opf) step. It couldn't encode the character '\ufffd' but it mentioned it was trying to use cp932, which is Shift-JIS encoding, so I figured it was a problem with using Windows Japanese locale. So I switched back to English (US). But then, I got the same problem with a slight difference:
Reading keys: 534703keys [00:11, 47845.41keys/s]
Writing html: 0files [00:00, ?files/s]
Traceback (most recent call last):
  File "tab2opf.py", line 378, in <module>
    ndicts = writekeys(defns, name)
  File "tab2opf.py", line 298, in writekeys
    writekey(to, key, defns[key])
  File "tab2opf.py", line 271, in writekey
    to.write(thing[1])
  File "C:\Users\Xavier\AppData\Local\Programs\Python\Python37\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\ufffd' in position 56: character maps to <undefined>
Same issue, same character, different position (if that matters), but now it is using cp1252, which I believe is the default 8-bit character encoding for English Windows. So it seems the program uses the system's default encoding when creating the relevant files, but neither cp932 nor cp1252 works correctly. Or I still don't have the dependencies installed correctly and that's the problem; I'm not sure.
(Also maybe worth mentioning: in both Japanese and English (US) locale, the JSON to Tab (yomi2tab) step produced a .tab file without throwing any errors, but they were clearly different files, with the cp932 one being 176 MB and the cp1252 one being 227 MB. So that step also is affected by the default system encoding.)
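For anyone following along, the encoding failure can be reproduced without the dictionary data. A minimal sketch, assuming only the standard library: U+FFFD (the Unicode replacement character) has no mapping in cp1252 or cp932, so encoding it with either codec fails, while UTF-8 succeeds. `can_encode` is a hypothetical helper for illustration, not something from tab2opf.

```python
import locale


def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    # open() without an explicit encoding normally falls back to this
    # locale-dependent default, which differs between systems.
    print("Default encoding:", locale.getpreferredencoding(False))
    for enc in ("cp1252", "cp932", "utf-8"):
        print(enc, can_encode("\ufffd", enc))
```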
Anyway, γγγγ I suppose. :stuck_out_tongue:
I'll try to make everything use UTF-8, and we'll see how it runs then. Btw, can you describe what you did to install pandas properly, so that a future lurker with the same problem can solve it as well?
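A minimal sketch of what passing the encoding explicitly could look like. The `write_lines` helper and the file name are made up for illustration and are not the actual tab2opf code; the point is only that `encoding="utf-8"` removes the dependence on the system locale.

```python
import os
import tempfile


def write_lines(path, lines):
    """Write `lines` to `path` as UTF-8, regardless of the system locale."""
    # newline="\n" keeps line endings identical on Windows and UNIX.
    with open(path, "w", encoding="utf-8", newline="\n") as out:
        for line in lines:
            out.write(line + "\n")


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "example.html")
    # Includes U+FFFD, the character that cp1252 and cp932 could not encode.
    write_lines(path, ["日本語", "\ufffd"])
    with open(path, encoding="utf-8") as src:
        print(src.read().splitlines())
```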
γγγγγ ^_^
Please try to run it with the updated version and paste the output here if it fails.
It works! Got it working great on my Kindle (and on Android, thanks to the instructions here). Thanks so much.
A few small things I should mention:
INFO | 06:08:05 | line 234 | <module> | Saving the results to <_io.TextIOWrapper name='mydict.tab' mode='w' encoding='cp1252'>...
INFO | 06:08:08 | line 237 | <module> | Successfully saved the results to <_io.TextIOWrapper name='mydict.tab' mode='w' encoding='cp1252'>, quitting the program.
py -3 yomi2tab.py -o mydict.tab "C:\Users\Xavier\Downloads\epwing2kindle-master\yomi-output"
The Tab to OPF (tab2opf) section generates a file named "mydict.opf" whereas the next section references "dict.opf". Obviously it is trivial to just add "my" in but I thought I would mention it.
Despite producing a working file, the kindlegen section spits out multiple warnings:
Warning(htmlprocessor):W27002: HTML attribute specified in content is not supported by Kindle readers. Removing the attribute: 'onclick' in file: C:\Users\Xavier\Downloads\epwing2kindle-master\opf\mydict0.html
Warning(parser8):W26001: Index not supported for enhanced mobi.
Info(parser8):I12001: Enhanced mobi generation suppressed.
Info(prcgen):I1037: Mobi file built with WARNINGS!
I am not sure if this is intended and can just be ignored or if it's something with my system.
Anyway, that's about it. Thanks again.
As for installing the requirements, I installed: Python 3.7 (x64); Build Tools for Visual Studio 2017 (I checked the Windows 10 and Windows 8.1 SDKs during installation, since some page said to); and Anaconda 5.2. Anaconda may have caused more problems than it was worth, but I was having a lot of trouble using pip and wheel and so on to install pandas, and I managed to do it with the Anaconda Prompt. I don't remember the exact steps I took, however.
Yeah, I think I also had those warnings from kindlegen but ignored them, since the dictionary worked just fine. I'll update the readme with the Windows-specific commands you've provided and also change the encoding for saving the file in yomi2tab (pandas is supposed to use utf-8 by default, but I guess when it comes to cross-platform it's best to specify all of them explicitly).
Anyway, thanks for such a detailed answer, I'll reference this issue in the readme as well! And congrats on getting it working, have fun looking up PA on your Kindle.
Actually, it seems like the output file encoding was indeed utf-8, and it just displays as the system's default encoding in the log :) I'll fix that in a new update.
Hi! I am Xavier22 from the koohii thread, thanks so much for making this. I am not sure if it's okay to post technical support questions in the issues section, but I was trying to install the requirements and was not able to. I just got this screen here:
Do you have any idea what I might be doing wrong?