de-parse of CASE block containing non-ascii char fails under Python 2

stfc / fparser

This project maintains and develops a Fortran parser called fparser2 written purely in Python which supports Fortran 2003 and some Fortran 2008. A legacy parser fparser1 is also available but is not supported. The parsers were originally part of the f2py project by Pearu Peterson.

https://fparser.readthedocs.io

de-parse of CASE block containing non-ascii char fails under Python 2 #226

Closed arporter closed 4 years ago

arporter commented 4 years ago

Again, the non-ASCII char is in a string:

     CASE(  30  )       !==  fixed 3D shape  ==!
        IF(lwp) WRITE(numout,*) '   ==>>>   eddy viscosity = F( latitude, longitude, depth )'
        IF(lwp) WRITE(numout,*) '           maximum reachable coefficient (at the Equator) = ', zah_max, cl_Units, '  for e1=1°)'

arporter commented 4 years ago

This case fails at line 5571 of Fortran2003.py:

def tofortran(self, tab='', isfix=None):
    tmp = []
    start = self.content[0]
    end = self.content[-1]
    tmp.append(start.tofortran(tab=tab, isfix=isfix))
    for item in self.content[1:-1]:
        if isinstance(item, Case_Stmt):
            tmp.append(item.tofortran(tab=tab, isfix=isfix))
        else:
            tmp.append(item.tofortran(tab=tab + '  ', isfix=isfix))
    tmp.append(end.tofortran(tab=tab, isfix=isfix))
    return '\n'.join(tmp)

arporter commented 4 years ago

This is similar to the problem fixed in #217

arporter commented 4 years ago

BlockBase.tofortran() has the same problem where it does a "\n".join(tmp). (Line 653 of utils.py.)

arporter commented 4 years ago

Strangely, I can only reproduce these errors in a pytest test when I use the FortranFileReader with ignore_comments=False. If I set it to True then all is fine.

arporter commented 4 years ago

With comments, the list of strings being joined contains some (the comments) that are unicode while the code itself is just str. Joining that mix makes Python 2 implicitly decode each str with the ASCII codec, which fails on the non-ASCII character. Without comments, the list being joined does not contain any unicode strings and thus (presumably) no conversion is attempted.
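
The mixed-type join described in this comment can be emulated under Python 3, where bytes and text never mix implicitly. This is only a sketch and not fparser code: `py2_style_join` is a hypothetical helper that mimics Python 2's behaviour of decoding every byte string as ASCII as soon as one unicode item is present in the list, and the sample lines are shortened from the report above.

```python
def py2_style_join(sep, items):
    """Emulate Python 2's str.join on a mix of byte strings and unicode.

    Hypothetical helper for illustration only; not part of fparser.
    """
    if not any(isinstance(item, str) for item in items):
        # Only byte strings: Python 2 joins the bytes, no decoding happens.
        return sep.encode("ascii").join(items)
    # At least one unicode item: Python 2 implicitly decodes each byte
    # string with the ASCII codec, which rejects any byte above 0x7F.
    decoded = [
        item if isinstance(item, str) else item.decode("ascii")
        for item in items
    ]
    return sep.join(decoded)


# Stand-in for a Python 2 `str` holding source code with a non-ASCII char.
code_line = "WRITE(numout,*) 'for e1=1°)'".encode("utf-8")
# Stand-in for a Python 2 `unicode` comment produced by the reader.
comment = u"!==  fixed 3D shape  ==!"

# Without comments: only byte strings, so the join succeeds.
no_comments = py2_style_join("\n", [code_line, code_line])

# With a unicode comment in the list: the non-ASCII byte triggers the failure.
try:
    py2_style_join("\n", [comment, code_line])
    failed = False
except UnicodeDecodeError:
    failed = True
```

The same list of code lines joins cleanly until a single unicode comment is added, which matches the dependence on ignore_comments seen in the pytest runs.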

arporter commented 4 years ago

The original problem is exercised if there is a comment *inside* a select case block. The problem in BlockBase is exercised if there is a comment just in the program body. I've added two tests to illustrate these.

arporter commented 4 years ago

I started down the road of making sure everything was unicode but that got rapidly out of hand and, theoretically, Python 2 end-of-life is only a month away. I've therefore gone for the minimal fix of ensuring that all strings are consistently encoded before we do a join on them (when producing Fortran).
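
A minimal sketch of what "consistently encoded before a join" could look like, written for Python 3 so that it runs today. This is not the code merged for this issue: `to_text` and `join_lines` are hypothetical names, and UTF-8 is an assumed encoding for the byte strings.

```python
def to_text(value, encoding="utf-8"):
    """Return `value` as text, decoding byte strings with `encoding`.

    Hypothetical helper; the UTF-8 default is an assumption.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def join_lines(parts, sep="\n"):
    """Join `parts` after converting every item to the same text type."""
    return sep.join(to_text(part) for part in parts)


# One text item (a comment) and one byte-string item (code with non-ASCII).
parts = [u"! a comment", "WRITE(*,*) 'e1=1°'".encode("utf-8")]
result = join_lines(parts)
```

Normalising at the point of the join keeps the change local to the two `tofortran` methods mentioned above, rather than converting every string in the parser to unicode.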

rupertford commented 4 years ago

PR #227 has been merged to master. Closing this issue.