pettarin / penelope

Penelope is a multi-tool for creating, editing and converting dictionaries, especially for eReader devices
MIT License

"Unable to find vcvarsall.bat" during "pip install penelope" under Windows 7 #16

Closed jg2944 closed 8 years ago

jg2944 commented 8 years ago

Hello, I got the "Unable to find vcvarsall.bat" error message during "pip install penelope" under Windows 7.

Nevertheless, the "pip list" command shows penelope (3.1.2.0) in the list.

Is it serious, doctor? ;-) Or can this error message produced during the Penelope installation be ignored?

thanks in advance

I have done a quick test to convert an ES-to-EN StarDict dictionary into Bookeen format and got some errors (see below). I just wanted to know whether the previous installation error message could be related to them.

C:\dictio\python -m penelope -i stardict-spanish-english-2.4.2.zip -j stardict -f es -t en -p bookeen -o output

[INFO] Reading input file(s)...
[INFO] Reading input file(s)... done
[INFO] Writing output file(s)...
Traceback (most recent call last):
  File "C:\PYTHON27\lib\runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "C:\PYTHON27\lib\runpy.py", line 72, in _run_code
    exec code in run_globals
  File "C:\PYTHON27\lib\site-packages\penelope\__main__.py", line 146, in <module>
    main()
  File "C:\PYTHON27\lib\site-packages\penelope\__main__.py", line 133, in main
    output_paths = write_dictionary(dictionary, arguments)
  File "C:\PYTHON27\lib\site-packages\penelope\dictionary.py", line 103, in write_dictionary
    return penelope.format_bookeen.write(dictionary, args, args.output_file)
  File "C:\PYTHON27\lib\site-packages\penelope\format_bookeen.py", line 227, in write
    sql_cursor.execute("insert into T_DictIndex values (?,?,?,?,?)", sql_tuple)
  File "C:\PYTHON27\lib\site-packages\penelope\collation_default.py", line 28, in collate_function
    b2 = string2.encode("utf-8").lower()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 12: ordinal not in range(128)

thanks in advance

pettarin commented 8 years ago

These are two different issues.

  1. "Unable to find vcvarsall.bat" means that pip cannot find the MS C compiler for Python, which it needs to compile one of the dependencies (lxml or marisa-trie). If you do not plan to read or write XML or Kobo format, you can ignore the error. Otherwise, you need to download this: https://www.microsoft.com/en-us/download/details.aspx?id=44266 and run pip in the special command prompt provided by it.
  2. It looks like your shell and/or input file is not UTF-8. If the former, you can try giving the following command:
> set PYTHONIOENCODING=UTF-8
> python -m penelope -i stardict-spanish-english-2.4.2.zip -j stardict -f es -t en -p bookeen -o output

before executing Penelope. If the latter, you need to provide the encoding of the input file with the --input-file-encoding flag. For example:

> python -m penelope -i stardict-spanish-english-2.4.2.zip -j stardict -f es -t en -p bookeen -o output --input-file-encoding latin1
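
To help decide which of the two cases applies, here is a minimal sketch for checking which encoding a chunk of bytes decodes under. It is written for Python 3 (the thread itself uses Python 2.7), and `guess_encoding` is a hypothetical helper for illustration, not part of Penelope. Note that byte 0xc3 is the usual lead byte of a two-byte UTF-8 sequence (for example "é" is C3 A9 and "ñ" is C3 B1), so seeing it in the error may hint that the data is already UTF-8.

```python
# Illustrative only: guess_encoding is a hypothetical helper, not part of Penelope.
def guess_encoding(data, candidates=("ascii", "utf-8", "latin-1")):
    """Return the first candidate encoding that decodes `data` without error.

    latin-1 accepts every byte sequence, so it must come last: it acts
    as a fallback rather than as a positive identification.
    """
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


# "espa\u00f1ol" is "español"; encoded as UTF-8 it contains the byte 0xc3.
sample = "espa\u00f1ol".encode("utf-8")
print(guess_encoding(sample))
```

If the bytes read from the dictionary files only decode as latin-1, passing `--input-file-encoding latin1` as described above is the option to try.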
jg2944 commented 8 years ago

Thanks a lot, Alberto! By the way, any idea where I can find Spanish or Spanish/French dictionaries that can be used as input files for Penelope?

Thanks again,
Joel

pettarin commented 8 years ago

Searching for "stardict spanish dictionary" in Google returns several hits. I have no idea about their quality or copyright status.

dan3000 commented 8 years ago

Hey mate,

First of all, I'd like to thank you for this great software. I've got the same problem as jg2944:

Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
...
  File "/usr/lib/python2.7/gzip.py", line 34, in open
    return GzipFile(filename, mode, compresslevel)
  File "/usr/lib/python2.7/gzip.py", line 136, in __init__
    self._write_gzip_header()
  File "/usr/lib/python2.7/gzip.py", line 181, in _write_gzip_header
    self.fileobj.write(fname + '\000')
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 1: ordinal not in range(128)

I've tried your suggestions but I still get this output. No problem when generating a .csv file

Edit:

the command I'm giving is: python -m penelope -i /mypath/Babylon_English_French/Babylon_English_French.zip -j stardict -f fr -t en -p kobo -o frenchkobo

pettarin commented 8 years ago

Did you try:

$ export PYTHONIOENCODING=UTF-8
$ python -m penelope etc. etc.

? (Note: set is for Windows, export for Linux/OS X. Setting PYTHONIOENCODING will override your shell encoding, just for Python.)

From what I see in the call trace, the error seems to occur when the gzip module writes a file whose name probably contains some non-ASCII characters. That may happen if you are running in a console with a non-UTF-8 encoding.
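
To make that failure mode concrete, here is a small Python 3 sketch (the thread itself uses Python 2.7) that locates the non-ASCII characters in a name and shows the ASCII codec rejecting it. `find_non_ascii` is a hypothetical helper, and the sample name is made up for illustration; it is not taken from the failing dictionary.

```python
# Illustrative only: find_non_ascii is a hypothetical helper, and the
# sample name below is made up, not taken from the failing dictionary.
def find_non_ascii(name):
    """Return (position, character) pairs for characters outside ASCII."""
    return [(i, ch) for i, ch in enumerate(name) if ord(ch) > 127]


sample_name = "b\u00e9.html"  # "bé.html": U+00E9 sits at position 1
print(find_non_ascii(sample_name))

# Encoding such a name with the ASCII codec fails in the same way as
# the UnicodeEncodeError reported in the traceback.
try:
    sample_name.encode("ascii")
except UnicodeEncodeError as exc:
    print(exc)
```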

BTW, to find out what encoding Python is currently using, you can run:

$ python
>>> import sys
>>> sys.stdin.encoding
'UTF-8' (or something else)
>>> sys.stdout.encoding
'UTF-8' (or something else)
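
The same check can be scripted. The sketch below is one possible way to do it, not something Penelope provides: `report_encodings` is a hypothetical helper that also reports the file system and locale encodings, which may matter when the failure involves a file name rather than console output.

```python
import locale
import sys


# Illustrative only: report_encodings is a hypothetical helper.
def report_encodings():
    """Collect the encodings Python uses for console I/O, file names, and the locale."""
    return {
        # getattr guards against stdin/stdout being replaced by objects
        # that have no `encoding` attribute.
        "stdin": getattr(sys.stdin, "encoding", None),
        "stdout": getattr(sys.stdout, "encoding", None),
        "filesystem": sys.getfilesystemencoding(),
        "locale": locale.getpreferredencoding(False),
    }


if __name__ == "__main__":
    for name, value in sorted(report_encodings().items()):
        print("%-10s %s" % (name, value))
```

Running it before and after setting PYTHONIOENCODING shows whether the variable actually changed the console encodings.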
dan3000 commented 8 years ago

Thank you for your super fast reply :)

You're right, I'm on Linux. I did the export, but nothing changed. Yes, it's just the last step that doesn't work, mate :/

The python output is:

>>> import sys
>>> sys.stdin.encoding
'UTF-8'
>>> sys.stdout.encoding
'UTF-8'

I've also tried the options --input-file-encoding latin and --input-file-encoding ascii, without success. Should I edit the script perhaps?

pettarin commented 8 years ago

If I cannot reproduce the issue, I cannot say what is going wrong.

Can you mail me the input dictionary (or a link to Dropbox/Drive/Box to it)?

dan3000 commented 8 years ago

You're right, here it goes: DELETED LINK

pettarin commented 8 years ago

@dan3000 thank you. I deleted your link, as I am not sure the dictionary is 100% copyright free.

Nevertheless, on my laptop I do not get your error:

$ python -m penelope -i bef.zip -j stardict -f en -t fr -p kobo -o dicthtml-en-fr.zip
[INFO] Reading input file(s)...
[INFO] Reading input file(s)... done
[INFO] Writing output file(s)...
[INFO] Writing output file(s)... done
[INFO] The following file(s) have been created:
[INFO]   dicthtml-en-fr.zip

Please send me an email, I will send you the file for you to test on your Kobo.

pettarin commented 8 years ago

Closing this issue, to avoid polluting it. Feel free to open another issue.