Closed dimpase closed 4 years ago
perhaps Python 3.6 needs an extra punch in form of
--- a/src/sage/docs/conf.py
+++ b/src/sage/docs/conf.py
@@ -36,6 +36,7 @@ plot_pre_code = """
# Set locale to prevent having commas in decimal numbers
# in tachyon input (see https://github.com/sagemath/sage-prod/issues/28971)
import locale
+locale.setlocale(locale.LC_ALL, '')
locale.setlocale(locale.LC_NUMERIC, 'C')
def sphinx_plot(graphics, **kwds):
import matplotlib.image as mpimg
could you try this, without what's suggested in comment:51 ?
There is a comment
According to POSIX, a program which has not called setlocale(LC_ALL, '') runs using the
portable 'C' locale. Calling setlocale(LC_ALL, '') lets it use the default locale as defined by the LANG variable. Since we do not want to interfere with the current locale setting we thus emulate the behavior in the way described above.
in https://docs.python.org/3.6/library/locale.html which makes me think it might be what's needed.
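The behaviour quoted from the Python docs can be tried out in isolation. The following is only a minimal sketch, not the docbuilder's code: the helper name `numeric_locale_demo` is hypothetical, and the inner `try`/`except` is an assumption to cover `locale.Error`, which `setlocale(LC_ALL, '')` raises when the environment names a locale the system does not provide.

```python
import locale


def numeric_locale_demo():
    """Adopt the user's default locale, then pin LC_NUMERIC back to 'C'.

    Returns the decimal separator that is in effect afterwards.
    """
    saved = locale.setlocale(locale.LC_ALL)  # query only, changes nothing
    try:
        try:
            # '' means: use the locale named by LANG / LC_* in the environment.
            locale.setlocale(locale.LC_ALL, '')
        except locale.Error:
            # The environment asks for a locale this system lacks;
            # keep whatever locale is currently active.
            pass
        # Force the portable numeric conventions regardless of the user's locale.
        locale.setlocale(locale.LC_NUMERIC, 'C')
        return locale.localeconv()['decimal_point']
    finally:
        # Put the process back the way we found it.
        locale.setlocale(locale.LC_ALL, saved)


if __name__ == "__main__":
    print(repr(numeric_locale_demo()))
```

Whatever `LANG` says, the numeric category ends up in the `C` locale, which is the point of the existing `LC_NUMERIC` line in `conf.py`.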
The name `C.UTF-8` isn't sufficient; you have to check for the others too. On Gentoo:
$ locale -a
C
C.utf8
POSIX
en_US
en_US.iso88591
en_US.utf8
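A rough sketch of what "checking for the others" could look like, working on the text that `locale -a` prints. Everything here is illustrative: `find_c_utf8` and `_fold` are hypothetical names, the candidate list is just the spellings named in this discussion (`C.UTF-8`, `C.utf8`, and the bare `UTF-8`), and folding case and hyphens is a simplifying assumption about how codeset spellings differ.

```python
# Output of `locale -a` on the Gentoo machine quoted above.
GENTOO_LOCALES = """C
C.utf8
POSIX
en_US
en_US.iso88591
en_US.utf8"""

# Spellings mentioned in this thread; not an exhaustive list.
CANDIDATES = ("C.UTF-8", "C.utf8", "UTF-8")


def _fold(name):
    """Fold case and drop hyphens so 'C.UTF-8' and 'C.utf8' compare equal."""
    return name.lower().replace("-", "")


def find_c_utf8(locale_a_output, candidates=CANDIDATES):
    """Return the installed spelling of a UTF-8 'C' locale, or None."""
    installed = {}
    for line in locale_a_output.splitlines():
        name = line.strip()
        if name:
            installed[_fold(name)] = name
    for cand in candidates:
        key = _fold(cand)
        if key in installed:
            # Hand back the spelling the system itself reports.
            return installed[key]
    return None


if __name__ == "__main__":
    print(find_c_utf8(GENTOO_LOCALES))
```

On a system that lists only `C` and `POSIX`, the function returns `None`, and the caller would have to fall back to plain `C`.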
Replying to @dimpase:
could you try this, without what's suggested in comment:51 ?
I get the same error after `make doc-clean && make`. In fact, adding a line `print('AAAAAAAAAAAAA')` at this place does not print any AAAAAAAAA... on my screen, so I wonder if that code is really executed before the error occurs.
AAAAA, indeed (headbang on kbd) here is the fix, I think
--- a/src/doc/en/reference/conf_sub.py
+++ b/src/doc/en/reference/conf_sub.py
@@ -20,7 +20,7 @@ ref_src = os.path.join(SAGE_DOC_SRC, 'en', 'reference')
ref_out = os.path.join(SAGE_DOC, 'html', 'en', 'reference')
# We use the main document's title, if we can find it.
-rst_file = open('index.rst', 'r')
+rst_file = open('index.rst', 'r', encoding='utf-8')
rst_lines = rst_file.read().splitlines()
rst_file.close()
Indeed, that's one place in our home-baked docbuilder that opens UTF-8-encoded files without specifying the encoding, and Python 3.6 does not like it.
There are also many more `open()` calls missing `encoding='utf-8'` on source files in the docbuilder, and perhaps on other text files too. All of these lead to the `UnicodeDecodeError: 'ascii' codec can't decode byte...` errors.
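To make the failure mode concrete, here is a small self-contained sketch (not taken from the docbuilder; `read_text` and `demo` are hypothetical names). It writes a UTF-8 file and reads it back twice, once with `encoding='ascii'`, which stands in for what `open()` falls back to when the locale's preferred encoding is ASCII, and once with an explicit `encoding='utf-8'`.

```python
import os
import tempfile


def read_text(path, encoding):
    """Read a whole text file using an explicit encoding."""
    with open(path, 'r', encoding=encoding) as f:
        return f.read()


def demo():
    # U+00B1 (plus-minus sign) is encoded in UTF-8 as the bytes 0xc2 0xb1.
    text = "x \u00b1 y\n"
    fd, path = tempfile.mkstemp(suffix=".rst")
    try:
        with os.fdopen(fd, 'w', encoding='utf-8') as f:
            f.write(text)
        try:
            # Stand-in for open() without encoding= under an ASCII locale.
            read_text(path, 'ascii')
            ascii_error = None
        except UnicodeDecodeError as exc:
            ascii_error = exc
        # With the encoding spelled out, the locale no longer matters.
        return ascii_error, read_text(path, 'utf-8')
    finally:
        os.remove(path)


if __name__ == "__main__":
    error, content = demo()
    print(error)
    print(repr(content))
```

The ASCII read fails on the first non-ASCII byte, with the same kind of message as in the build logs, while the explicit UTF-8 read returns the text unchanged.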
Should we just abandon the idea of supporting random system Python 3.6 installations? They are just broken in this way...
Worse, there are files with lots of utf-8 stuff, sitting in `r"""..."""` multiline comments, e.g. `src/sage/algebras/lie_algebras/lie_algebra_element.pyx`, and they cause problems too.
this is with Ubuntu 18.04 native Python 3.6:
[dochtml] [polynomia] WARNING: error while formatting arguments for sage.rings.polynomial.multi_polynomial_ideal.MPolynomialIdeal_singular_repr.complete_primary_decomposition: 'ascii' codec can't decode byte 0xc2 in position 4774: ordinal not in range(128)
[dochtml] [polynomia] WARNING: error while formatting arguments for sage.rings.polynomial.multi_polynomial_ideal.MPolynomialIdeal_singular_repr.genus: 'ascii' codec can't decode byte 0xc2 in position 4774: ordinal not in range(128)
[dochtml] [polynomia] WARNING: error while formatting arguments for sage.rings.polynomial.multi_polynomial_ideal.MPolynomialIdeal_singular_repr.primary_decomposition_complete: 'ascii' codec can't decode byte 0xc2 in position 4774: ordinal not in range(128)
[dochtml] [polynomia] /home/dima/sage/dev/sage/local/lib/python3.6/site-packages/sage/rings/polynomial/multi_polynomial_ideal.py:docstring of sage.rings.polynomial.multi_polynomial_ideal.MPolynomialIdeal.degree_of_semi_regularity:8: WARNING: Inline interpreted text or phrase reference start-string without end-string.
[dochtml] [polynomia] /home/dima/sage/dev/sage/local/lib/python3.6/site-packages/sage/rings/polynomial/multi_polynomial_ideal.py:docstring of sage.rings.polynomial.multi_polynomial_ideal.MPolynomialIdeal.degree_of_semi_regularity:9: WARNING: Block quote ends without a blank line; unexpected unindent.
[dochtml] [polynomia] /home/dima/sage/dev/sage/local/lib/python3.6/site-packages/sage/rings/polynomial/multi_polynomial_ideal.py:docstring of sage.rings.polynomial.multi_polynomial_ideal.MPolynomialIdeal.degree_of_semi_regularity:23: WARNING: Inline interpreted text or phrase reference start-string without end-string.
[dochtml] [polynomia] /home/dima/sage/dev/sage/local/lib/python3.6/site-packages/sage/rings/polynomial/multi_polynomial_ideal.py:docstring of sage.rings.polynomial.multi_polynomial_ideal.MPolynomialIdeal.degree_of_semi_regularity:24: WARNING: Block quote ends without a blank line; unexpected unindent.
[dochtml] [polynomia] /home/dima/sage/dev/sage/local/lib/python3.6/site-packages/sage/rings/polynomial/multi_polynomial_ideal.py:docstring of sage.rings.polynomial.multi_polynomial_ideal.MPolynomialIdeal_singular_repr.associated_primes:4: WARNING: Inline interpreted text or phrase reference start-string without end-string.
[dochtml] [polynomia] /home/dima/sage/dev/sage/local/lib/python3.6/site-packages/sage/rings/polynomial/multi_polynomial_ideal.py:docstring of sage.rings.polynomial.multi_polynomial_ideal.MPolynomialIdeal_singular_repr.associated_primes:7: WARNING: Block quote ends without a blank line; unexpected unindent.
[dochtml] [polynomia] /home/dima/sage/dev/sage/local/lib/python3.6/site-packages/sage/rings/polynomial/multi_polynomial_ideal.py:docstring of sage.rings.polynomial.multi_polynomial_ideal.MPolynomialIdeal_singular_repr.primary_decomposition:4: WARNING: Inline interpreted text or phrase reference start-string without end-string.
[dochtml] [polynomia] /home/dima/sage/dev/sage/local/lib/python3.6/site-packages/sage/rings/polynomial/multi_polynomial_ideal.py:docstring of sage.rings.polynomial.multi_polynomial_ideal.MPolynomialIdeal_singular_repr.primary_decomposition:6: WARNING: Block quote ends without a blank line; unexpected unindent.
[dochtml] [polynomia] WARNING: error while formatting arguments for sage.rings.polynomial.polynomial_element.Polynomial.hamming_weight: 'ascii' codec can't decode byte 0xc3 in position 8166: ordinal not in range(128)
[dochtml] [polynomia] WARNING: error while formatting arguments for sage.rings.polynomial.polynomial_element.Polynomial.inverse_series_trunc: 'ascii' codec can't decode byte 0xc3 in position 8166: ordinal not in range(128)
[dochtml] [polynomia] WARNING: error while formatting arguments for sage.rings.polynomial.polynomial_element.Polynomial.is_one: 'ascii' codec can't decode byte 0xc3 in position 8166: ordinal not in range(128)
[dochtml] [polynomia] WARNING: error while formatting arguments for sage.rings.polynomial.polynomial_element.Polynomial.is_term: 'ascii' codec can't decode byte 0xc3 in position 8166: ordinal not in range(128)
[dochtml] [polynomia] WARNING: error while formatting arguments for sage.rings.polynomial.polynomial_element.Polynomial.is_zero: 'ascii' codec can't decode byte 0xc3 in position 8166: ordinal not in range(128)
[dochtml] [polynomia] WARNING: error while formatting arguments for sage.rings.polynomial.polynomial_element.Polynomial.list: 'ascii' codec can't decode byte 0xc3 in position 8166: ordinal not in range(128)
[dochtml] [polynomia] WARNING: error while formatting arguments for sage.rings.polynomial.polynomial_element.Polynomial.number_of_terms: 'ascii' codec can't decode byte 0xc3 in position 8166: ordinal not in range(128)
[dochtml] [polynomia] WARNING: error while formatting arguments for sage.rings.polynomial.polynomial_element.Polynomial.truncate: 'ascii' codec can't decode byte 0xc3 in position 8166: ordinal not in range(128)
[dochtml] [polynomia] WARNING: error while formatting arguments for sage.rings.polynomial.polynomial_element.Polynomial_generic_dense.is_term: 'ascii' codec can't decode byte 0xc3 in position 8166: ordinal not in range(128)
[dochtml] [polynomia] WARNING: error while formatting arguments for sage.rings.polynomial.polynomial_element.Polynomial_generic_dense.list: 'ascii' codec can't decode byte 0xc3 in position 8166: ordinal not in range(128)
[dochtml] [polynomia] WARNING: error while formatting arguments for sage.rings.polynomial.polynomial_element.Polynomial_generic_dense.truncate: 'ascii' codec can't decode byte 0xc3 in position 8166: ordinal not in range(128)
[dochtml] [polynomia] WARNING: error while formatting arguments for sage.rings.polynomial.polynomial_element.generic_power_trunc: 'ascii' codec can't decode byte 0xc3 in position 8166: ordinal not in range(128)
[dochtml] [polynomia] WARNING: error while formatting arguments for sage.rings.polynomial.polynomial_element.Polynomial._mul_trunc_: 'ascii' codec can't decode byte 0xc3 in position 8166: ordinal not in range(128)
[dochtml] [polynomia] The inventory files are in local/share/doc/sage/inventory/en/reference/polynomial_rings.
[dochtml] Error building the documentation.
[dochtml] Traceback (most recent call last):
[dochtml] File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
[dochtml] "__main__", mod_spec)
[dochtml] File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
[dochtml] exec(code, run_globals)
[dochtml] File "/home/dima/sage/dev/sage/local/lib/python3.6/site-packages/sage_setup/docbuild/__main__.py", line 2, in <module>
[dochtml] main()
[dochtml] File "/home/dima/sage/dev/sage/local/lib/python3.6/site-packages/sage_setup/docbuild/__init__.py", line 1730, in main
[dochtml] builder()
[dochtml] File "/home/dima/sage/dev/sage/local/lib/python3.6/site-packages/sage_setup/docbuild/__init__.py", line 344, in _wrapper
[dochtml] getattr(get_builder(document), 'inventory')(*args, **kwds)
[dochtml] File "/home/dima/sage/dev/sage/local/lib/python3.6/site-packages/sage_setup/docbuild/__init__.py", line 570, in _wrapper
[dochtml] self._build_everything_except_bibliography(lang, format, *args, **kwds)
[dochtml] File "/home/dima/sage/dev/sage/local/lib/python3.6/site-packages/sage_setup/docbuild/__init__.py", line 556, in _build_everything_except_bibliography
[dochtml] build_many(build_ref_doc, non_references)
[dochtml] File "/home/dima/sage/dev/sage/local/lib/python3.6/site-packages/sage_setup/docbuild/__init__.py", line 296, in build_many
[dochtml] _build_many(target, args, processes=NUM_THREADS)
[dochtml] File "/home/dima/sage/dev/sage/local/lib/python3.6/site-packages/sage_setup/docbuild/utils.py", line 291, in build_many
[dochtml] raise worker_exc.original_exception
[dochtml] OSError: WARNING: error while formatting arguments for sage.rings.polynomial.multi_polynomial_ideal.MPolynomialIdeal_singular_repr.complete_primary_decomposition: 'ascii' codec can't decode byte 0xc2 in position 4774: ordinal not in range(128)
if anyone wants to sort all this out, be my guest.
Please push your branch if you started to make changes using utf-8 encodings. I will take a look tomorrow, Europe time.
Because `build_many` catches the error, it is difficult to debug. The following change makes it possible to see the real error:
diff --git a/src/sage_setup/docbuild/__init__.py b/src/sage_setup/docbuild/__init__.py
index 0841f429e7..c9e08214f3 100644
--- a/src/sage_setup/docbuild/__init__.py
+++ b/src/sage_setup/docbuild/__init__.py
@@ -292,6 +292,8 @@ def build_many(target, args):
     Thin wrapper around `sage_setup.docbuild.utils.build_many` which uses the
     docbuild settings ``NUM_THREADS`` and ``ABORT_ON_ERROR``.
     """
+    for arg in args:
+        target(arg)
     try:
         _build_many(target, args, processes=NUM_THREADS)
     except BaseException as exc:
which is:
[dochtml] File "/home/slabbe/GitBox/sage/local/lib/python3.6/site-packages/sage_setup/docbuild/__init__.py", line 296, in build_many
[dochtml] target(arg)
[dochtml] File "/home/slabbe/GitBox/sage/local/lib/python3.6/site-packages/sage_setup/docbuild/__init__.py", line 79, in build_ref_doc
[dochtml] getattr(ReferenceSubBuilder(doc, lang), format)(*args, **kwds)
[dochtml] File "/home/slabbe/GitBox/sage/local/lib/python3.6/site-packages/sage_setup/docbuild/__init__.py", line 763, in _wrapper
[dochtml] for module_name in self.get_all_included_modules():
[dochtml] File "/home/slabbe/GitBox/sage/local/lib/python3.6/site-packages/sage_setup/docbuild/__init__.py", line 927, in get_all_included_modules
[dochtml] for module in self.get_modules(filename):
[dochtml] File "/home/slabbe/GitBox/sage/local/lib/python3.6/site-packages/sage_setup/docbuild/__init__.py", line 1009, in get_modules
[dochtml] lines = f.readlines()
[dochtml] File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
[dochtml] return codecs.ascii_decode(input, self.errors)[0]
[dochtml] UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 2661: ordinal not in range(128)
But indeed, I see that adding an `encoding='utf-8'` at this place is not sufficient.
It is not sufficient, and I get the codec errors on a file where I cleaned all non-ascii away (sic!).
I am giving up on this. I think the code is too new for Python 3.6, and I don't think we should invest time in a version that will be EOL by the end of 2021.
People who only have Python 3.6 can build Python.
We already have some code in Sage that was taken from Python 3.7, in src/sage/misc/sageinspect.py, which is used in docbuilding, perhaps it is too new, e.g.
def formatannotation(annotation, base_module=None):
    """
    This is taken from Python 3.7's inspect.py; the only change is to
    add documentation.
    ...
Code from `sageinspect.py` is used in "formatting arguments", the error shown in comment:60.
I think I will surrender too. I posted the changes that need to be made on the branch u/slabbe/30053 to get Python 3.6 to work. Like you, I now get the error
[dochtml] OSError: WARNING: error while formatting arguments for
sage.rings.polynomial.multi_polynomial_ideal.MPolynomialIdeal_singular_repr.complete_primary_decomposition:
'ascii' codec can't decode byte 0xc2 in position 4774: ordinal not in range(128)
this ticket helps if one is on python 3.x for x>6. As to 3.6, well, it's apparently hard, as Sage code and perhaps sphinx are too new for it.
Dependencies: #30551
On python >= 3.7, `LC_ALL=C` is sufficient, because python itself will attempt to find a UTF-8 version of that locale.
(It may be only a semantic difference, but then the non-existence of `C.UTF-8` on some systems becomes Not Our Problem.)
this patch sets LC_ALL to a meaningful value. Do you want it always to be `C`?
Replying to @dimpase:
this patch sets LC_ALL to a meaningful value. Do you want it always to be `C`?
Yes, if we're going to throw python-3.6 under the bus in this release.
`LC_ALL=C` is the standard choice, and python >= 3.7 already treats `LC_ALL=C` magically. That's what "PEP 538 -- Coercing the legacy C locale to a UTF-8 based locale" is about.
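One way to see what a given interpreter actually does under `LC_ALL=C` is to start a child interpreter with that setting and ask it. This is a hedged sketch, not part of the patch: `probe_under_c_locale` and `PROBE` are hypothetical names, the probe needs Python 3.7 or newer (it reads `sys.flags.utf8_mode`), and stripping `LANG`, `LC_*` and the `PYTHON*` overrides from the child environment is an assumption made to get a clean reading. On a POSIX system one would expect UTF-8 rather than ASCII to be reported (PEP 538 and the related PEP 540 UTF-8 mode both target the legacy C locale), but the sketch only reports what it finds.

```python
import json
import os
import subprocess
import sys

# Code run in the child interpreter; prints one JSON object.
PROBE = (
    "import json, locale, sys;"
    "print(json.dumps({"
    "'utf8_mode': sys.flags.utf8_mode,"
    "'preferred': locale.getpreferredencoding(False),"
    "'filesystem': sys.getfilesystemencoding()}))"
)

# Variables that would override or mask the effect of LC_ALL=C.
_OVERRIDES = ("PYTHONUTF8", "PYTHONCOERCECLOCALE", "PYTHONIOENCODING")


def probe_under_c_locale():
    """Run the current interpreter with LC_ALL=C and report its encodings."""
    env = {
        key: value
        for key, value in os.environ.items()
        if key != "LANG" and not key.startswith("LC_") and key not in _OVERRIDES
    }
    env["LC_ALL"] = "C"
    result = subprocess.run(
        [sys.executable, "-c", PROBE],
        env=env,
        stdout=subprocess.PIPE,
        check=True,
        universal_newlines=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    print(probe_under_c_locale())
```

Running the same probe with a Python 3.6 executable is not possible as written, since `sys.flags.utf8_mode` does not exist there.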
Description changed:
---
+++
@@ -3,14 +3,6 @@
There are also some other locale problems that show up in doctests
(for example https://groups.google.com/d/msg/sage-release/spalYgXKr-4/ZVsbgHIlAgAJ)
-
-And a failure building the documentation on `ubuntu-bionic-standard` (using `/usr/bin/python3.6`, https://github.com/mkoeppe/sage/runs/1106251169):
-
-```
- [dochtml] UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 2661: ordinal not in range(128)
- [dochtml] Full log file: logs/dochtml.log
-Makefile:1876: recipe for target 'doc-html' failed
-```
Let's refocus this ticket (blocker for Sage 9.2) on solving the issues for Python 3.7+
To me Python 3.6 support is a "wishlist" item, but not a blocker. I have created a separate ticket for it: #30576
what are the 3.7+ problems left here? I presume all of them have plain C locale, so it all should work, no?
Replying to @dimpase:
what are the 3.7+ problems left here? I presume all of them have plain C locale, so it all should work, no?
None, but AFAIK there were none before we started messing with the locale, either.
If no one is going to fix the python-3.6 problems, we should just drop it from the list of supported system pythons and revert this to `LC_ALL=C`. That's simpler and leaves us with one fewer locale problem to debug in the future.
I wouldn't close the door on python 3.6 yet, so let's get this patch in please.
Replying to @mkoeppe:
I wouldn't close the door on python 3.6 yet, so let's get this patch in please.
It still isn't sufficient for python-3.6, even if someone does fix the other problems. The name `C.UTF-8` isn't standard because... it's not standardized yet. On Gentoo, it's `C.utf8`, and elsewhere apparently (per the PEP) it's called simply `UTF-8`. All of them should be checked/tried, and we should really only be doing it if python-3.6 is in use. That way, when we drop python-3.6, we're left with just `LC_ALL=C` again.
Also keep in mind that if no one actually fixes the other problems with python-3.6, the door is closed regardless.
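The "only for python-3.6" idea could be sketched roughly as below. This is an illustration under stated assumptions, not the patch on this ticket: `choose_lc_all` is a hypothetical helper, `available` stands for whatever list of installed locale names the caller has collected (for example from `locale -a`), and matching is by exact spelling for simplicity.

```python
import sys

# Spellings of a UTF-8 "C" locale named in this thread, in the order to try.
UTF8_C_LOCALE_NAMES = ("C.UTF-8", "C.utf8", "UTF-8")


def choose_lc_all(version_info=None, available=()):
    """Pick a value for LC_ALL.

    Python >= 3.7: always plain 'C'.
    Older Python: the first UTF-8 'C' locale that is installed, else 'C'.
    """
    if version_info is None:
        version_info = sys.version_info
    if tuple(version_info[:2]) >= (3, 7):
        return "C"
    for name in UTF8_C_LOCALE_NAMES:
        if name in available:
            return name
    return "C"


if __name__ == "__main__":
    print(choose_lc_all((3, 6), ["C", "C.utf8", "POSIX"]))
    print(choose_lc_all((3, 8), ["C", "C.utf8", "POSIX"]))
```

Once 3.6 is dropped, the whole function collapses to the constant `'C'`, which is the state described above.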
there is no harm in adding this, anyway, thus, over to bots.
Reviewer: Matthias Koeppe, Dima Pasechnik
Changed dependencies from #30551 to none
I've also removed #30551 as a dependency - after all whether Python 3.6 is (somewhat) supported (IMHO docbuilding is the main problem there) can be sorted out later.
Replying to @mkoeppe:
Yes, I know, that's why I opened the separate ticket #30576 for that.
The stuff on ticket #30576 isn't what I was referring to...
Replying to @dimpase:
there is no harm in adding this, anyway, thus, over to bots.
The harm is that now we have additional locale problems to debug for a change that doesn't solve anything. With python-3.7, there are now three different configurations:
- `C.UTF-8` locale exists and is used.
- `C` is used.
- `C` on a machine that could have a utf8 locale.

Why, when `LC_ALL=C` unconditionally works fine and had been in there for ages?
If someone wants to fix python-3.6, we should add these hacks only for python-3.6, and we should try the alternate names `C.utf8` and `UTF-8`, too.
Branch pushed to git repo; I updated commit sha1 and set ticket back to needs_review. New commits:
ff6f4a6 | assume Python 3.7 or better |
OK, so how about this?
I think that's the best solution, possibly combined with dropping 3.6 from the list of system pythons that we accept. Portability is nice and all, but we're wasting a lot of time trying to support an evolutionary dead-end with py3.6.
Fine with me - then please also adjust python3's spkg-configure...
Branch pushed to git repo; I updated commit sha1. New commits:
64c3a8a | only test python3, to be 3.7 or 3.8 |
Replying to @mkoeppe:
Fine with me - then please also adjust python3's spkg-configure...
fixed, as discussed - only test `python3`, to be either 3.7 or 3.8.
Please review
Changed branch from u/dimpase/build/careful_with_C_UTF8 to u/mkoeppe/build/careful_with_C_UTF8
I have made an adjustment. I don't like mixing the change discussed in #30546 into this ticket.
New commits:
be47518 | build/pkgs/python3/spkg-configure.m4: Use 'python >= 3.7'; keep checking for 'python3.8, python3.7' |
Rationale: The changed search behavior would lead to confusing reports on sage-release.
Changed author from Dima Pasechnik to Dima Pasechnik, Matthias Koeppe
Description changed:
---
+++
@@ -1,11 +1,5 @@
In #29033 in `build/bin/sage-spkg` LC_ALL was changed to C.UTF-8
However, not all systems have it.
-
-There are also some other locale problems that show up in doctests
-(for example https://groups.google.com/d/msg/sage-release/spalYgXKr-4/ZVsbgHIlAgAJ)
-
-
-
See also:
Description changed:
---
+++
@@ -4,3 +4,9 @@
See also:
- #22659
+
+Follow-up:
+- #30586 macOS: Doctest failures in some locales
+- #30576 Python 3.6: Fix locale/encoding issues, then re-enable Python 3.6
+
+
Description changed:
---
+++
@@ -1,6 +1,19 @@
In #29033 in `build/bin/sage-spkg` LC_ALL was changed to C.UTF-8
However, not all systems have it.
+There are a few more places that explicitly set locale variables (`LANG` and `LC_*`):
+
+```
+build/bin/sage-site: export LANG=C # to ensure it is possible to scrape out non-EN locale warnings
+build/bin/sage-spkg: export LC_ALL=C;
+build/pkgs/gap/spkg-check.in:export LC_CTYPE=en_US.UTF-8
+build/pkgs/pari/spkg-install.in:LANG=C
+build/tox.ini: LC_ALL = C
+docker/Dockerfile:ENV LC_ALL C.UTF-8
+src/sage/docs/conf.py:locale.setlocale(locale.LC_NUMERIC, 'C')
+src/sage/interfaces/gap.py: sage: gap = Gap(env={'LC_CTYPE': 'en_US.UTF-8'})
+src/sage/interfaces/jmoldata.py: env['LC_ALL'] = 'C'
+```
See also:
- #22659
Description changed:
---
+++
@@ -1,19 +1,5 @@
In #29033 in `build/bin/sage-spkg` LC_ALL was changed to C.UTF-8
However, not all systems have it.
-
-There are a few more places that explicitly set locale variables (`LANG` and `LC_*`):
-
-```
-build/bin/sage-site: export LANG=C # to ensure it is possible to scrape out non-EN locale warnings
-build/bin/sage-spkg: export LC_ALL=C;
-build/pkgs/gap/spkg-check.in:export LC_CTYPE=en_US.UTF-8
-build/pkgs/pari/spkg-install.in:LANG=C
-build/tox.ini: LC_ALL = C
-docker/Dockerfile:ENV LC_ALL C.UTF-8
-src/sage/docs/conf.py:locale.setlocale(locale.LC_NUMERIC, 'C')
-src/sage/interfaces/gap.py: sage: gap = Gap(env={'LC_CTYPE': 'en_US.UTF-8'})
-src/sage/interfaces/jmoldata.py: env['LC_ALL'] = 'C'
-```
See also:
- #22659
Changed branch from u/mkoeppe/build/careful_with_C_UTF8 to be47518
Description changed:
---
+++
@@ -5,6 +5,7 @@
- #22659
Follow-up:
+- #30008 Fix sage-system-python
- #30586 macOS: Doctest failures in some locales
- #30576 Python 3.6: Fix locale/encoding issues, then re-enable Python 3.6
In #29033 in `build/bin/sage-spkg` LC_ALL was changed to C.UTF-8. However, not all systems have it.

See also:
- #22659

Follow-up:
- #30008 Fix sage-system-python
- #30586 macOS: Doctest failures in some locales
- #30576 Python 3.6: Fix locale/encoding issues, then re-enable Python 3.6
CC: @antonio-rojas @mkoeppe @slel @orlitzky @kiwifb @embray @fchapoton
Component: build
Author: Dima Pasechnik, Matthias Koeppe
Branch: be47518
Reviewer: Matthias Koeppe, Dima Pasechnik
Issue created by migration from https://trac.sagemath.org/ticket/30053