numbbo / coco

Numerical Black-Box Optimization Benchmarking Framework
https://numbbo.github.io/coco

[Bug report] UnicodeDecodeError when installing Python version #2025

Open MrDomani opened 2 years ago

MrDomani commented 2 years ago

When installing the Python version of coco via step 3 (running python do.py run-python in the console), cmd produces the following output:

AML ['code-experiments/src/coco_random.c', 'code-experiments/src/coco_suite.c', 'code-experiments/src/coco_observer.c', 'code-experiments/src/coco_archive.c', 'code-experiments/src/coco_runtime_c.c'] -> code-experiments/build/python/cython/coco.c
EXPAND code-experiments/build/python/cython/coco.c.in to code-experiments/build/python/cython/coco.c
EXPAND code-experiments/src/coco.h to code-experiments/build/python/cython/coco.h
COPY code-experiments/src/bbob2009_testcases.txt -> code-experiments/build/python/bbob2009_testcases.txt
COPY code-experiments/src/bbob2009_testcases2.txt -> code-experiments/build/python/bbob2009_testcases2.txt
COPY code-experiments/build/python/README.md -> code-experiments/build/python/README.txt
EXPAND code-experiments/build/python/setup.py.in to code-experiments/build/python/setup.py
PYTHON setup.py install in code-experiments\build\python
Traceback (most recent call last):
  File "do.py", line 1032, in <module>
    main(sys.argv[1:])
  File "do.py", line 1001, in main
    elif cmd == 'build-python': build_python(package_install_option = package_install_option)
  File "do.py", line 326, in build_python
    python(join('code-experiments', 'build', 'python'), ['setup.py', 'install']
  File "D:\Programy\coco\code-experiments\tools\cocoutils.py", line 144, in python
    output = check_output_with_print(verbose, full_command, stderr=STDOUT, env=os.environ,
  File "D:\Programy\coco\code-experiments\tools\cocoutils.py", line 43, in check_output_with_print
    output = check_output(*popenargs, **kwargs)
  File "E:\anaconda\lib\subprocess.py", line 415, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "E:\anaconda\lib\subprocess.py", line 495, in run
    stdout, stderr = process.communicate(input, timeout=timeout)
  File "E:\anaconda\lib\subprocess.py", line 1015, in communicate
    stdout = self.stdout.read()
  File "E:\anaconda\lib\encodings\cp1250.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x88 in position 2099: character maps to <undefined>

I tried reinstalling Anaconda, uninstalling Anaconda and running with pure Python, and executing from the system cmd, PowerShell and their Anaconda counterparts. I also changed the system language to English (normally it is Polish). The same bug appears every time.

System:

nikohansen commented 2 years ago

FTR, 0x88 seems to be the ê character. I don't quite understand why we would see this character even with Polish locale.

A possible quick fix: replace line 145 of coco\code-experiments\tools\cocoutils.py

                              universal_newlines=True)

with

                              universal_newlines=True, encoding=sys.stdout.encoding)

If this doesn't work, you could try encoding='utf-8'.
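A side note on the suggestion above: a minimal sketch of what passing an explicit encoding to check_output does, assuming a hypothetical child process that emits one byte (0x88) that cp1250 cannot map. The errors="replace" argument is an addition for illustration and was not tried in this thread; it trades a crash for a U+FFFD placeholder character in the captured output.

```python
import subprocess
import sys

# Hypothetical child process: writes raw bytes, including 0x88,
# which has no mapping in cp1250.
child = [sys.executable, "-c",
         "import sys; sys.stdout.buffer.write(b'ok \\x88 done')"]

# With an explicit encoding, check_output decodes the child's output itself.
# errors="replace" (an assumption, not part of the proposed fix) substitutes
# U+FFFD for undecodable bytes instead of raising UnicodeDecodeError.
output = subprocess.check_output(child, stderr=subprocess.STDOUT,
                                 universal_newlines=True,
                                 encoding="cp1250", errors="replace")
print(repr(output))
```

Without errors="replace", the same call would raise the UnicodeDecodeError reported in the first comment.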

MrDomani commented 2 years ago

Tried both; each gives a slightly different error. (In the meantime I have reinstalled Anaconda.)

(base) D:\Programy\coco>python do.py run-python
AML ['code-experiments/src/coco_random.c', 'code-experiments/src/coco_suite.c', 'code-experiments/src/coco_observer.c', 'code-experiments/src/coco_archive.c', 'code-experiments/src/coco_runtime_c.c'] -> code-experiments/build/python/cython/coco.c
EXPAND code-experiments/build/python/cython/coco.c.in to code-experiments/build/python/cython/coco.c
EXPAND code-experiments/src/coco.h to code-experiments/build/python/cython/coco.h
COPY code-experiments/src/bbob2009_testcases.txt -> code-experiments/build/python/bbob2009_testcases.txt
COPY code-experiments/src/bbob2009_testcases2.txt -> code-experiments/build/python/bbob2009_testcases2.txt
COPY code-experiments/build/python/README.md -> code-experiments/build/python/README.txt
EXPAND code-experiments/build/python/setup.py.in to code-experiments/build/python/setup.py
PYTHON setup.py install in code-experiments\build\python
Traceback (most recent call last):
  File "do.py", line 1032, in <module>
    main(sys.argv[1:])
  File "do.py", line 1009, in main
    elif cmd == 'run-python': run_python(also_test_python, package_install_option = package_install_option)
  File "do.py", line 335, in run_python
    build_python(package_install_option=package_install_option)
  File "do.py", line 326, in build_python
    python(join('code-experiments', 'build', 'python'), ['setup.py', 'install']
  File "D:\Programy\coco\code-experiments\tools\cocoutils.py", line 144, in python
    output = check_output_with_print(verbose, full_command, stderr=STDOUT, env=os.environ,
  File "D:\Programy\coco\code-experiments\tools\cocoutils.py", line 43, in check_output_with_print
    output = check_output(*popenargs, **kwargs)
  File "D:\anaconda3\lib\subprocess.py", line 415, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "D:\anaconda3\lib\subprocess.py", line 495, in run
    stdout, stderr = process.communicate(input, timeout=timeout)
  File "D:\anaconda3\lib\subprocess.py", line 1015, in communicate
    stdout = self.stdout.read()
  File "D:\anaconda3\lib\codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbe in position 1612: invalid start byte

nikohansen commented 2 years ago

The check_output doc says that "Text mode is triggered by setting any of text, encoding, errors or universal_newlines.", that is, the problem should not be a bytes-to-string conversion on the coco side.
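To illustrate the point about text mode, here is a small sketch assuming a hypothetical child process that prints a byte (0xbe) that is not valid UTF-8 on its own. It suggests that the decoding, and therefore the failure, happens inside check_output rather than in the calling code.

```python
import subprocess
import sys

# Hypothetical child process emitting a byte that is invalid as UTF-8.
child = [sys.executable, "-c",
         "import sys; sys.stdout.buffer.write(b'caf\\xbe')"]

# Binary mode (no text, encoding, errors or universal_newlines): bytes come back.
raw = subprocess.check_output(child)

# Text mode, triggered here by `encoding`: check_output decodes internally
# and raises if the bytes do not fit the requested encoding.
try:
    subprocess.check_output(child, encoding="utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

print(type(raw).__name__, decode_failed)
```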

A desperate idea: use encoding='polish'?

MrDomani commented 2 years ago

Tried it. Unfortunately, it throws an error even faster, with the last message 'unknown encoding: polish'.
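For reference, whether Python knows an encoding name can be checked up front with codecs.lookup. The helper name is_known_encoding below is made up for this sketch.

```python
import codecs


def is_known_encoding(name):
    """Return True if Python has a codec registered under this name."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


for candidate in ("polish", "cp1250", "1250", "cp1252", "utf-8"):
    print(candidate, is_known_encoding(candidate))
```

'polish' is not a codec name, while 'cp1250' and its alias '1250' are.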

nikohansen commented 2 years ago

Python unicode issues are infamously annoying.

Another desperate idea: change the locale setting? I have:

import locale
locale.getlocale()
(None, 'UTF-8')

What do you get?

MrDomani commented 2 years ago

They sure are. Here's what I get:

('Polish_Poland', '1250')

nikohansen commented 2 years ago

This suggests trying '1250' or (equivalently) locale.getlocale()[1] as encoding in line 145 of coco\code-experiments\tools\cocoutils.py, or possibly using locale.setlocale to change the localization.
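As a related check, the sketch below prints the encoding Python reports as the locale's preferred one. In the Python versions seen in this thread, this is typically what text-mode pipes use when no encoding argument is given, which would explain why cp1250.py shows up in the first traceback. Treat that link as an assumption rather than a diagnosis.

```python
import codecs
import locale

# Encoding reported for the current locale, without changing locale settings.
preferred = locale.getpreferredencoding(False)

# Normalized codec name that Python resolves it to (e.g. 'cp1250' or 'utf-8').
codec_name = codecs.lookup(preferred).name

print(preferred, codec_name)
```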

MrDomani commented 2 years ago

Still no good news. Putting '1250' results in the same error as in my first comment. Setting locale.setlocale(locale.LC_ALL, locale.getlocale()[0]+'.utf-8') at the start of do.py and encoding='utf-8' in check_output_with_print(...) returns the same error as in my second message. The position of the wrong byte changes, though.

brockho commented 2 years ago

Hi. One crazy thought (of an active Windows user without much notion of what's going on :-) ): could the error come from the fact that the coco code and the Anaconda installation are on different hard disks? I recently had a hiccup when trying to use a git checkout from an external hard disk, which worked well on the internal one...

MrDomani commented 2 years ago

Right now they're both on the same partition of the same disk. Thanks for the idea, though.

nikohansen commented 2 years ago

Another idea: remove the universal_newlines and encoding arguments in line 145 of coco\code-experiments\tools\cocoutils.py altogether. This way, the output is supposed to be bytes rather than str, so hopefully no decoding is applied at all and hence it cannot fail.

MrDomani commented 2 years ago

We've got a breakthrough. Running it without the universal_newlines and encoding arguments finishes without error, and running example_experiment_for_beginners.py afterwards works fine.

I did some thinking and checked the encodings of all files modified by the command python do.py run-python. Most were UTF-8, but some (including those in code-experiments/tools/__pycache__, which I suspect were generated by check_output) were in ANSI encoding. Stack Overflow told me to try the Windows-1252 encoding. So I restored the folder with the coco repo, added universal_newlines=True, encoding='cp1252' in line 145 of cocoutils.py, ran the command python do.py run-python... and it worked. Again, running the example experiment for beginners works fine.

I guess that 1) the simplest solutions are the best and 2) all hail StackOverflow. Have a good one.
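A quick way to see why cp1252 got through where cp1250 and utf-8 did not is to decode the two bytes from the error messages (0x88 and 0xbe) in isolation. This is only a sketch on single bytes, not on the actual setup.py output, which is not available here.

```python
# The two offending bytes reported in the tracebacks of this thread.
samples = {"0x88": b"\x88", "0xbe": b"\xbe"}

results = {}
for label, raw in samples.items():
    for enc in ("cp1250", "cp1252", "utf-8"):
        try:
            results[(label, enc)] = raw.decode(enc)
        except UnicodeDecodeError:
            # None marks a byte the codec cannot decode.
            results[(label, enc)] = None

for key, value in sorted(results.items()):
    print(key, repr(value))
```

Byte 0x88 is unmapped in cp1250 but maps to a character in cp1252, and neither byte is valid UTF-8 by itself. Note that cp1252 also leaves a few bytes unmapped (0x81, for instance), so it may not be a universal fix.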

nikohansen commented 2 years ago

Cool, to make this work for everybody we probably want to get rid of the universal_newlines then, and this should not have other adverse effects?

nikohansen commented 2 years ago

Without the universal_newlines argument we see an error on the Jenkins linux test, failing to concatenate bytes with str:

PYTHON  setup.py install in code-experiments/build/python
Traceback (most recent call last):
  File "/builds/disk/builds/workspace/Coco-test2-linux/dotype/build-python/label/numbbo-ubuntu-12p04-i386/pyversion/python3/code-experiments/tools/cocoutils.py", line 151, in python
    output = check_output_with_print(verbose, full_command, stderr=STDOUT, env=os.environ)
  File "/builds/disk/builds/workspace/Coco-test2-linux/dotype/build-python/label/numbbo-ubuntu-12p04-i386/pyversion/python3/code-experiments/tools/cocoutils.py", line 43, in check_output_with_print
    output = check_output(*popenargs, **kwargs)
  File "/usr/lib/python3.2/subprocess.py", line 522, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'setup.py', 'install']' returned non-zero exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "do.py", line 1032, in <module>
    main(sys.argv[1:])
  File "do.py", line 1001, in main
    elif cmd == 'build-python': build_python(package_install_option = package_install_option)
  File "do.py", line 327, in build_python
    + package_install_option, custom_exception_handler=install_error)
  File "/builds/disk/builds/workspace/Coco-test2-linux/dotype/build-python/label/numbbo-ubuntu-12p04-i386/pyversion/python3/code-experiments/tools/cocoutils.py", line 159, in python
    exception_handled = custom_exception_handler(e)
  File "do.py", line 245, in install_error
    formatted_message.append("| " + line.ljust(75) + " |")
TypeError: Can't convert 'bytes' object to str implicitly

https://ci.inria.fr/numbbo/job/Coco-test2-linux/dotype=build-python,label=numbbo-ubuntu-12p04-i386,pyversion=python3/539/console
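One possible way around the TypeError above, sketched under the assumption that the error handler may receive either bytes or str: normalize the captured output to str before formatting it. The names as_text and box_lines are hypothetical and do not exist in do.py; the box format only mirrors the line shown in the traceback.

```python
def as_text(value, encoding="utf-8"):
    """Return value as str; bytes are decoded leniently so this never raises
    UnicodeDecodeError (undecodable bytes become U+FFFD)."""
    if isinstance(value, bytes):
        return value.decode(encoding, errors="replace")
    return value


def box_lines(output, width=75):
    """Format each line of the output like the box in install_error."""
    return ["| " + line.ljust(width) + " |"
            for line in as_text(output).splitlines()]


# Works the same for bytes (binary mode) and str (text mode) output.
for line in box_lines(b"error: command failed\nexit status 1", width=30):
    print(line)
```

This keeps check_output in binary mode, so no decoding can fail inside subprocess, while the message formatting still operates on str.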