ecmwf / fckit

A Fortran toolkit for interoperating Fortran with C/C++
https://confluence.ecmwf.int/display/fckit
Apache License 2.0

Unicode errors from fckit-fypp.py #14

Closed DJDavies2 closed 3 years ago

DJDavies2 commented 3 years ago

fckit-fypp.py was failing for me on some files with tracebacks like this:

Traceback (most recent call last):
  File "/home/h01/frwd/cylc-run/UnicodeFailure/share/lfric-bundle/fckit/tools/fckit-fypp.py", line 3004, in <module>
    run_fypp()
  File "/home/h01/frwd/cylc-run/UnicodeFailure/share/lfric-bundle/fckit/tools/fckit-fypp.py", line 2838, in run_fypp
    tool.process_file(infile, outfile)
  File "/home/h01/frwd/cylc-run/UnicodeFailure/share/lfric-bundle/fckit/tools/fckit-fypp.py", line 2514, in process_file
    output = self._preprocessor.process_file(infile)
  File "/home/h01/frwd/cylc-run/UnicodeFailure/share/lfric-bundle/fckit/tools/fckit-fypp.py", line 2393, in process_file
    self._parser.parsefile(fname)
  File "/home/h01/frwd/cylc-run/UnicodeFailure/share/lfric-bundle/fckit/tools/fckit-fypp.py", line 253, in parsefile
    self._includefile(None, inpfp, fobj, os.path.dirname(fobj))
  File "/home/h01/frwd/cylc-run/UnicodeFailure/share/lfric-bundle/fckit/tools/fckit-fypp.py", line 267, in _includefile
    self._parse_txt(span, fname, fobj.read())
  File "/home/h01/frwd/cylc-run/UnicodeFailure/share/lfric-bundle/fckit/tools/fckit-fypp.py", line 580, in _parse_txt
    self._parse(txt)
  File "/home/h01/frwd/cylc-run/UnicodeFailure/share/lfric-bundle/fckit/tools/fckit-fypp.py", line 616, in _parse
    self._process_control_dir(content, span)
  File "/home/h01/frwd/cylc-run/UnicodeFailure/share/lfric-bundle/fckit/tools/fckit-fypp.py", line 679, in _process_control_dir
    self._process_include(param, span)
  File "/home/h01/frwd/cylc-run/UnicodeFailure/share/lfric-bundle/fckit/tools/fckit-fypp.py", line 829, in _process_include
    self._includefile(span, inpfp, fpath, os.path.dirname(fpath))
  File "/home/h01/frwd/cylc-run/UnicodeFailure/share/lfric-bundle/fckit/tools/fckit-fypp.py", line 267, in _includefile
    self._parse_txt(span, fname, fobj.read())
  File "/net/project/ukmo/scitools/opt_scitools/environments/default/2019_02_27/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 242: ordinal not in range(128)
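For illustration, the failure in the last frame can be reproduced in isolation. Byte 0xc2 is the lead byte of a two-byte UTF-8 sequence (the degree sign, for example, is encoded as 0xc2 0xb0), and the ASCII codec rejects any byte above 0x7f. This is only a sketch: the sample text is invented and is not taken from the file that actually failed.

```python
# Hypothetical sample: a Fortran comment containing a degree sign (U+00B0),
# which UTF-8 encodes as the two bytes 0xC2 0xB0.
raw = "! angle in \u00b0\n".encode("utf-8")


def try_decode(data, codec):
    """Return the decoded text, or the UnicodeDecodeError if decoding fails."""
    try:
        return data.decode(codec)
    except UnicodeDecodeError as err:
        return err


ascii_result = try_decode(raw, "ascii")  # fails at the 0xC2 byte
utf8_result = try_decode(raw, "utf-8")   # decodes cleanly

print(repr(ascii_result))
print(repr(utf8_result))
```

The same bytes decode without error once the codec is UTF-8, which is why the choice of encoding at the point where the file is read decides whether the traceback appears.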

This can be worked around by setting these environment variables:

LANG=en_US.utf-8 LC_ALL=en_US.utf-8
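A quick way to see why these variables matter: when `open()` is called in text mode without an `encoding` argument, Python falls back to the locale's preferred encoding. The snippet below is a small diagnostic sketch, not part of fckit or fypp. It prints what the current environment resolves to, so it can be run with and without the variables set.

```python
import locale
import sys

# The encoding open() uses in text mode when no encoding= argument is given.
preferred = locale.getpreferredencoding(False)

print("Python version:     ", sys.version.split()[0])
print("preferred encoding: ", preferred)
print("filesystem encoding:", sys.getfilesystemencoding())
```

If the preferred encoding is reported as an ASCII variant, reading a file that contains non-ASCII bytes without an explicit encoding will fail in the way shown in the traceback.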

However, it would be nice if this worked without having to set environment variables. One possibility might be to add 'encoding="utf-8"' to the open arguments on lines 2862 and 2881. What do you think?
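A minimal sketch of that suggestion, assuming the calls in question are ordinary text-mode `open()` calls (the actual code on those lines is not shown here). The helper name `read_text_utf8` and the temporary file are hypothetical and exist only to make the example self-contained.

```python
import os
import tempfile


def read_text_utf8(path):
    """Read a file as UTF-8, independent of LANG / LC_ALL (hypothetical helper)."""
    with open(path, "r", encoding="utf-8") as fobj:
        return fobj.read()


# Hypothetical input: a file containing a non-ASCII character (degree sign).
with tempfile.NamedTemporaryFile("wb", suffix=".fypp", delete=False) as tmp:
    tmp.write("! angle in \u00b0\n".encode("utf-8"))
    tmp_path = tmp.name

try:
    text = read_text_utf8(tmp_path)
finally:
    os.remove(tmp_path)

print(repr(text))
```

Passing the encoding explicitly makes the result independent of the locale, at the cost of assuming that every input file really is UTF-8.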

wdeconinck commented 3 years ago

Hi @DJDavies2, this script is a symbolic link to a contributed copy of fypp from 2019. fypp development has progressed a little since then, and by the looks of it, it should now default to utf-8 encoding. I have updated the contributed fypp to version 3.0 (latest release, dated January 2020) in the develop branch. Please check whether that solves it, as I have no test for it. You're also welcome to create a PR with a test for it. If it does not work for you, we can try to patch it locally here, but really this should be taken up with the fypp GitHub project: https://github.com/aradi/fypp

DJDavies2 commented 3 years ago

Thanks. I think the relevant fypp change is https://github.com/aradi/fypp/commit/b6a9e34f52b9888571b1c8f7d4837049755a4f7b#diff-35017d4d193ac76c75e30c2900f091640e0e5cc6ed8fa7dc68f2cb08187b0be3, which seems to contain a number of encoding changes. I tested the updated fckit develop branch and it worked for me, so I will close this.

wdeconinck commented 3 years ago

Thanks for checking. I had to add another commit, as there was a problem installing the fckit-fypp.py script.