Closed prehensilecode closed 7 years ago
Hi,
Thanks for your bug report. It was a bit of a challenge to maintain both Python 2 and 3 support manually. We discovered issues such as this one you found while we built out the automated test suite. Now that our Travis build system is complete and automated Python 2&3 builds, we don't expect this bug to surface again.
https://travis-ci.org/deads/paratext
Python 2.x's PyString_XXX
makes no assumptions about the underlying encoding of the data, while PyUnicode_XXX
assumes the data is a sequence of valid codepoints. In Python 2.7, the internal string does not need to be a valid sequence of Unicode codepoints (e.g. u"\uDCFF"
works in Python 2, but not 3). In Python 3.x, I believe no Unicode string can be constructed unless it is sequence of valid codepoints. In the case where the data is loaded from a file and no assumption can be made about its encoding, one must be a bit more delicate. For Python 3, PyBytes_XXX
can be used for this purpose.
Can you repull from master and verify that you are no longer experiencing this issue? Henceforth, all proposed pull requests will have to pass all tests on both Python 2 and Python 3, unless there is a good reason for allowing a test to fail.
Best, Damian
System is RHEL 6.4 with gcc 4.8.1 and SWIG 3.0.10 and Python 3.5.0 installed.
Firstly, python/setup.py is not Python3 clean: it uses print statements rather than print() functions. After running it through 2to3, the build fails with a few occurences of this error:
Modifying src/python/numpy_helper.hpp to replace "PyString_FromStringAndSize()" with "PyUnicode_FromStringAndSize()" makes the build succeed.
Perhaps something like this is required to support both Python2 and Python3.