brainvisa / axon

Brainvisa main GUI

add encoding for open in python3 in documentation generator #41

Closed Hboni closed 4 years ago

Hboni commented 4 years ago

The process generateDocumentation in the Tools toolbox doesn't seem to work in python3 for me.

The first error is:

Traceback (most recent call last):
  File "/casa/build/python/brainvisa/processes.py", line 3380, in _processExecution
    result = process.execution(self)
  File "/casa/build/brainvisa/toolboxes/tools/processes/documentation/generateDocumentation.py", line 427, in execution
    write_category_html=self.write_category_html)
  File "/casa/build/brainvisa/toolboxes/tools/processes/documentation/generateDocumentation.py", line 277, in generateHTMLProcessesDocumentation
    translators[l] = neuroConfig.Translator(l)
  File "/casa/build/python/brainvisa/configuration/neuroConfig.py", line 952, in __init__
    self.translations = self.getTranslations()
  File "/casa/build/python/brainvisa/configuration/neuroConfig.py", line 965, in getTranslations
    translations.update(readMinf(file)[0])
  File "/casa/build/python/soma/minf/api.py", line 299, in readMinf
    exceptions=exceptions))
  File "/casa/build/python/soma/minf/api.py", line 222, in iterateMinf
    start = source.read(5)
  File "/casa/build/python/soma/bufferandfile.py", line 119, in read
    result = self.__buffer + self.__file.read(size - buffer_size)
  File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 291: ordinal not in range(128)

Passing 'utf-8' as the encoding parameter when the translation files are read fixes it. Then another error appears for nearly every file:

Traceback (most recent call last):
  File "/casa/build/brainvisa/toolboxes/tools/processes/documentation/generateDocumentation.py", line 287, in generateHTMLProcessesDocumentation
    generateHTMLDocumentation(pi, translators, context, ontology)
  File "/casa/build/brainvisa/toolboxes/tools/processes/documentation/generateDocumentation.py", line 189, in generateHTMLDocumentation
    print('<h2>' + tr.translate('Parameters') + '</h2>', file=f)
UnicodeEncodeError: 'ascii' codec can't encode character '\xe8' in position 9: ordinal not in range(128)

As before, passing an encoding parameter when opening the HTML file fixes it. To keep compatibility with Python 2, I added a check of the Python version so that the encoding parameter is only used in Python 3.
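
A version check of the kind described above might look like the following minimal sketch. The helper names `open_text` and `roundtrip` are hypothetical and are not taken from the actual patch; the sample string is only an illustration of text containing a non-ASCII character such as '\xe8'.

```python
import os
import sys
import tempfile


def open_text(path, mode='r'):
    """Open a text file, forcing UTF-8 on Python 3 only.

    Hypothetical helper for illustration; not the code from the patch.
    """
    if sys.version_info[0] >= 3:
        # Python 3: the built-in open() accepts an encoding argument, so the
        # result no longer depends on the locale's preferred encoding.
        return open(path, mode, encoding='utf-8')
    # Python 2: the built-in open() has no encoding parameter.
    return open(path, mode)


def roundtrip(text):
    """Write text to a temporary file and read it back through open_text()."""
    fd, path = tempfile.mkstemp(suffix='.html')
    os.close(fd)
    try:
        with open_text(path, 'w') as f:
            f.write(text)
        with open_text(path) as f:
            return f.read()
    finally:
        os.remove(path)
```

Under Python 3, `roundtrip('<h2>Param\xe8tres</h2>')` should return the same string even when the locale encoding is ASCII, because the encoding is stated explicitly.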

During my search, I also found a difference between my BV Python 3 and BV Python 2 setups: locale.getpreferredencoding() returns UTF-8 in Python 2 and ANSI_X3.4-1968 in Python 3. I was wondering whether it is better to set the encoding in the code (as I did), or whether we can change something in the distro to change the preferred encoding value?
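
For anyone who wants to compare two setups, a small diagnostic along these lines may help. It is only a sketch: the function name `encoding_report` is made up for this example. ANSI_X3.4-1968 is the canonical name of plain ASCII, which is what a C locale typically reports.

```python
import locale
import os
import sys


def encoding_report():
    """Collect the settings that influence the default text encoding."""
    return {
        'python': sys.version.split()[0],
        # Locale-related environment variables; LC_ALL overrides LC_CTYPE,
        # which in turn overrides LANG.
        'LC_ALL': os.environ.get('LC_ALL'),
        'LC_CTYPE': os.environ.get('LC_CTYPE'),
        'LANG': os.environ.get('LANG'),
        # Python 3's open() falls back to this value in text mode when no
        # encoding argument is given.
        'preferred': locale.getpreferredencoding(False),
        'filesystem': sys.getfilesystemencoding(),
    }


if __name__ == '__main__':
    for key, value in sorted(encoding_report().items()):
        print('{0}: {1}'.format(key, value))
```

Running it in both environments and comparing the `LANG` and `preferred` lines should show whether the difference comes from the environment rather than from the Python version.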

ylep commented 4 years ago

The question is, what environment are you using to run your Python 3 version? You should check that the LANG environment variable is set. We do it in base images of casa-distro containers, e.g.:

https://github.com/brainvisa/casa-distro/blob/12ccfee5993dd507e5c90ea6639a15b2e5c34541/share/docker/casa-test/ubuntu-16.04/Dockerfile#L5-L6

Hboni commented 4 years ago

My Python 3 setup is on Ubuntu 18, and LANG is only set to C. The link shows the Dockerfile for Ubuntu 16, but I don't see the same option in the Ubuntu18 Dockerfile.

Hboni commented 4 years ago

> If at all possible, it would be better to avoid an explicit check of the Python version, and use the same code under Python 2 and Python 3. I would suggest to try io.open, which is the open function of Python 3 backported to Python 2. io.open(filePath, 'r', encoding='utf-8') should work under both versions.

I agree that we should avoid the Python version check. io.open() seems to work; I just need to refactor the generateDocumentation process so that it also works in Python 2 (string/unicode problems).
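
As a rough illustration of the io.open approach, here is a minimal sketch that should behave the same under Python 2 and Python 3. The function names `write_html` and `read_text` are hypothetical and do not come from generateDocumentation.py. The `unicode_literals` import is one way to deal with the string/unicode problem mentioned above, since in text mode io.open only accepts unicode text on Python 2.

```python
from __future__ import unicode_literals

import io
import os
import tempfile


def write_html(path, title):
    """Write a small HTML fragment as UTF-8, independent of the locale."""
    with io.open(path, 'w', encoding='utf-8') as f:
        # In text mode io.open requires unicode on Python 2 (str on
        # Python 3); unicode_literals makes the literals below unicode.
        f.write('<h2>' + title + '</h2>\n')


def read_text(path):
    """Read a UTF-8 text file and return its content as unicode text."""
    with io.open(path, 'r', encoding='utf-8') as f:
        return f.read()


def demo():
    """Round-trip a title containing a non-ASCII character."""
    fd, path = tempfile.mkstemp(suffix='.html')
    os.close(fd)
    try:
        write_html(path, 'Param\xe8tres')
        return read_text(path)
    finally:
        os.remove(path)
```

The design point is that the encoding is fixed in the code, so no version check is needed and the result does not change with LANG.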

ylep commented 4 years ago

> I don't see the same option in the Ubuntu18 Dockerfile.

Indeed, it looks like I forgot Ubuntu 18.04 :-/ Fixing it now https://github.com/brainvisa/casa-distro/pull/97

Hboni commented 4 years ago

Maybe this option fixes the initial issue, and so my modifications are not useful anymore, with this ubuntu18 casa-distro :stuck_out_tongue:

ylep commented 4 years ago

Maybe. You can try it by just setting the LANG environment variable in your container (export LANG=C.UTF-8). Still, it is good to have code that does not depend on a particular environment to run correctly :-)
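
One way to try this without touching the current shell is to start a fresh interpreter with a chosen LANG and ask it for its preferred encoding. This is only a sketch; the function name `preferred_encoding_with_lang` is made up for the example. Note that Python 3.7 and later may coerce the C locale to a UTF-8 one at startup (PEP 538), so `LANG=C` can already report UTF-8 there, unlike the Python 3.6 shown in the traceback.

```python
import os
import subprocess
import sys


def preferred_encoding_with_lang(lang):
    """Run a fresh interpreter with the given LANG; return its preferred encoding."""
    env = dict(os.environ)
    # LC_ALL and LC_CTYPE take precedence over LANG, and PYTHONUTF8 enables
    # Python's UTF-8 mode, so drop them to observe the effect of LANG alone.
    for name in ('LC_ALL', 'LC_CTYPE', 'PYTHONUTF8'):
        env.pop(name, None)
    env['LANG'] = lang
    code = 'import locale; print(locale.getpreferredencoding(False))'
    output = subprocess.check_output([sys.executable, '-c', code], env=env)
    return output.decode('ascii').strip()


if __name__ == '__main__':
    for lang in ('C', 'C.UTF-8'):
        print('{0}: {1}'.format(lang, preferred_encoding_with_lang(lang)))
```

The exact values printed depend on the Python version and on which locales are installed in the container.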

denisri commented 4 years ago

I agree that we should not depend on the LANG variable, which is a user setting, whereas we require UTF-8 for internal needs of brainvisa. Using io.open() seems OK to me.

Hboni commented 4 years ago

Changing LANG to C.UTF-8 does solve the initial problem. I pushed some modifications that use io.open and fix documentation generation.