emblsaxs / saxsview

Load and display data from many SAXS-related data formats
GNU General Public License v3.0

Files in GB 18030 encoding not read from Python #48

Open plmnnk opened 4 years ago

plmnnk commented 4 years ago

https://www.sasbdb.org/media/intensities_files/SASDCH9.dat This file is in GB 18030 encoding (the Chinese government standard). The data can be read using libsaxsdocument from C code (autorg, primus) but not from Python code (sasbdb). Once the file is converted to UTF-8, all applications read it correctly.
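As a stopgap, the UTF-8 conversion mentioned above can be done in Python before the file is handed to the reader. This is a minimal sketch, not part of saxsview: the function name `to_utf8` is hypothetical, and the source encoding is assumed to be GB 18030 whenever the bytes are not already valid UTF-8.

```python
def to_utf8(data: bytes, source_encoding: str = "gb18030") -> bytes:
    """Return `data` re-encoded as UTF-8.

    Hypothetical helper: bytes that already decode as UTF-8 are returned
    unchanged; anything else is assumed to be in `source_encoding`.
    """
    try:
        data.decode("utf-8")
        return data  # already valid UTF-8, nothing to do
    except UnicodeDecodeError:
        # Assumption: the only non-UTF-8 encoding we expect is GB 18030.
        return data.decode(source_encoding).encode("utf-8")


if __name__ == "__main__":
    # Made-up header line with a Chinese label, not taken from SASDCH9.dat.
    raw = "样品 0.01 123.4 5.6\n".encode("gb18030")
    print(to_utf8(raw).decode("utf-8"), end="")
```

The converted bytes can then be written to a temporary file and opened as usual. This only works around the symptom; it does not explain why the C and Python code paths differ.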

chatcannon commented 4 years ago

Unicode handling changed between Python 2 and Python 3. Which version of Python has the bug?

plmnnk commented 4 years ago

On 16.02.20 06:18, Chris Kerr wrote:

> Unicode handling changed between Python 2 and Python 3. Which version of Python has the bug?

That's Python 3.

chatcannon commented 4 years ago

Given that the pysaxsdocument.c module still uses the C saxs_document_read function, the cause must be that Python changes some setting which affects the behaviour of the C library functions used by libsaxsdocument.
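The thread does not say which setting is involved. One candidate worth checking is the process locale, since C library text functions depend on `LC_CTYPE` and number parsing depends on `LC_NUMERIC`. The sketch below only reports what the Python process currently sees, so the values can be compared with the environment in which autorg or primus run; the function name `describe_text_settings` is hypothetical, and the locale hypothesis is an assumption, not a confirmed diagnosis.

```python
import locale
import sys


def describe_text_settings() -> dict:
    """Collect locale and encoding settings visible to this Python process."""
    return {
        # Passing None queries the current setting without changing it.
        "LC_CTYPE": locale.setlocale(locale.LC_CTYPE, None),
        "LC_NUMERIC": locale.setlocale(locale.LC_NUMERIC, None),
        "preferred_encoding": locale.getpreferredencoding(False),
        "filesystem_encoding": sys.getfilesystemencoding(),
    }


if __name__ == "__main__":
    for name, value in describe_text_settings().items():
        print(f"{name}: {value}")
```

If `LC_CTYPE` reported here differs from what a plain C program sees in the same shell, that would support the idea that the interpreter's startup configuration changes how libsaxsdocument reads the file.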