spacemanspiff2007 / sphinx-exec-code

Run python code in sphinx and display the output
Apache License 2.0

Unicode characters are not supported in code output #10

Open daviddoret opened 1 year ago

daviddoret commented 1 year ago

Unicode strings are not supported by sphinx-exec-code in code output. Building the Sphinx documentation in PyCharm raises a UnicodeEncodeError and the code block fails to execute.

Here is a minimal example to reproduce the issue. In my code I use Unicode math symbols extensively.

The rst file:

.. exec_code::

   print('This is a beautiful unicode character: \u265E.')

The console output of the Sphinx build:

Running Sphinx v7.1.2
checking bibtex cache... out of date
parsing bibtex file C:\REDACTED.bib... parsed 8 entries
building [mo]: targets for 0 po files that are out of date
writing output...
building [html]: targets for 34 source files that are out of date
updating environment: [new config] 34 added, 0 changed, 0 removed
ERROR: print('This is a beautiful unicode character: \u265E.') <--
ERROR:
ERROR: Traceback (most recent call last):
ERROR:   File "test.rst", line 6
ERROR:   File "C:\Users\REDACTED\AppData\Local\Programs\Python\Python311\Lib\encodings\cp1252.py", line 19, in encode
ERROR:     return codecs.charmap_encode(input,self.errors,encoding_table)[0]
ERROR:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR: UnicodeEncodeError: 'charmap' codec can't encode character '\u265e' in position 39: character maps to <undefined>

Extension error:
Could not execute code!

Process finished with exit code 0

By the way, I love sphinx-exec-code, thank you for this excellent sphinx extension!

spacemanspiff2007 commented 1 year ago

Does manually setting exec_code_set_utf8_encoding in the configuration change anything?
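For readers following along, setting the option manually would look roughly like the conf.py fragment below. This is only a sketch: the option name comes from the question above, while the boolean value and the surrounding `extensions` entry are assumptions about a typical setup.

```python
# conf.py (fragment), a minimal sketch of a typical Sphinx configuration.

# Load the extension; the module name is assumed to be "sphinx_exec_code".
extensions = ["sphinx_exec_code"]

# Option named in this thread. Assumption: it takes a boolean, and True
# asks the extension to force UTF-8 when it runs the code.
exec_code_set_utf8_encoding = True
```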

daviddoret commented 1 year ago

No, it does not. I should have mentioned that I tried this as well.

Initially I thought that the issue was related to sphinx-exec-code's dependency on Pygments. But then I used Pygments directly, and it processes all Unicode characters without problems.

spacemanspiff2007 commented 1 year ago

Which OS are you on? Win7, 10 or 11? The Python subprocess correctly returns the Unicode values, but not all shells support Unicode characters. Maybe the issue is somewhere there.

daviddoret commented 1 year ago

Microsoft Windows 11 Professionnel, Version 10.0.22621 Build 22621.

spacemanspiff2007 commented 1 year ago

Hmm, unfortunately I cannot reproduce the issue. I even have tests in place that ensure that Unicode works under Windows.

What happens when you try a build outside of PyCharm? Have you set the PyCharm console to utf-8?

daviddoret commented 1 year ago

Yes, my current project outputs a lot of Unicode in the PyCharm console and it works smoothly. I think the problem is not Unicode per se but a subset of Unicode characters. Good idea, I should try a build outside PyCharm to see how it goes. I apologize, but I am under time constraints and must postpone this test. My current workaround is to not use the Sphinx extension: I generate all code outputs into files with a batch script and include them in the Sphinx documentation with a file inclusion directive. Thank you anyway for your much appreciated help.
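The "subset of Unicode characters" observation matches the cp1252 codec named in the traceback: cp1252 can encode some non-ASCII characters but not others. The check below is a small sketch of that behaviour; the helper name `encodable` and the sample characters other than U+265E are chosen purely for illustration.

```python
def encodable(text: str, encoding: str = "cp1252") -> bool:
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Characters that cp1252 contains, and the one from the report that it lacks.
samples = {
    "latin small e with acute": "\u00e9",
    "euro sign": "\u20ac",
    "black chess knight": "\u265e",
}
results = {name: encodable(char) for name, char in samples.items()}

# Encoding the full line from the report fails at the knight character.
line = "This is a beautiful unicode character: \u265E."
try:
    line.encode("cp1252")
    failed_at = None
except UnicodeEncodeError as exc:
    failed_at = exc.start  # index of the first character that cannot be encoded
```

Here `failed_at` lines up with the "position 39" reported in the traceback, which suggests the failure happens when the printed line is encoded with cp1252 rather than inside Pygments or Sphinx itself.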

spacemanspiff2007 commented 1 year ago

I specifically tried your example and it worked fine for me.

This is how I execute the Python code. From that you can create a minimal example that reproduces the issue. Maybe you can play around a bit with the subprocess.run arguments and find a fix?
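A standalone reproduction along those lines might look like the sketch below. It is not the extension's actual code: the helper `run_snippet` is hypothetical, and it uses the standard `PYTHONIOENCODING` environment variable to choose the child interpreter's stdout encoding, so the cp1252 failure can be simulated on any platform.

```python
import os
import subprocess
import sys

# The line from the report. The escape is doubled so that the child
# process receives the literal text \u265E in its source code.
SNIPPET = "print('This is a beautiful unicode character: \\u265E.')"


def run_snippet(code: str, io_encoding: str) -> subprocess.CompletedProcess:
    """Run code in a child interpreter whose stdio uses io_encoding."""
    env = dict(os.environ)
    # PYTHONIOENCODING overrides the encoding of the child's stdin/stdout/stderr.
    env["PYTHONIOENCODING"] = io_encoding
    return subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,  # collect raw bytes from stdout and stderr
        env=env,
    )


ok = run_snippet(SNIPPET, "utf-8")
bad = run_snippet(SNIPPET, "cp1252")

print("utf-8 :", ok.returncode, ok.stdout.decode("utf-8").strip())
print("cp1252:", bad.returncode, bad.stderr.decode("cp1252", "replace").strip().splitlines()[-1])
```

If the cp1252 run fails with the same UnicodeEncodeError while the utf-8 run succeeds, that points at the encoding the child process picks for its output, which is the kind of setting worth experimenting with in the subprocess.run arguments or environment.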