stfc / fparser

This project maintains and develops a Fortran parser called fparser2 written purely in Python which supports Fortran 2003 and some Fortran 2008. A legacy parser fparser1 is also available but is not supported. The parsers were originally part of the f2py project by Pearu Peterson.
https://fparser.readthedocs.io

Parse tree incorrect when literal includes exponent #301

Closed arporter closed 2 years ago

arporter commented 2 years ago

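When an expression involves a real literal with an exponent, we get the following parse tree (fragment), in which the subtraction is nested beneath the multiplication, i.e. the statement is parsed as (zw - 3.14807E-10)*zw6: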

zw = zw - 3.14807E-10*zw6

      child type =  <class 'fparser.two.Fortran2003.Add_Operand'>
        child type =  <class 'fparser.two.Fortran2003.Level_2_Expr'>
          child type =  <class 'fparser.two.Fortran2003.Name'>
          child type =  <class 'str'> '-'
          child type =  <class 'fparser.two.Fortran2003.Real_Literal_Constant'>
            child type =  <class 'str'> '3.14807E-10'
            child type =  <class 'NoneType'>
        child type =  <class 'str'> '*'
        child type =  <class 'fparser.two.Fortran2003.Name'>

If the exponent is removed then the parse tree has a different (correct) structure, with the multiplication nested beneath the subtraction:

zw = zw - 3.14807*zw6

      child type =  <class 'fparser.two.Fortran2003.Level_2_Expr'>
        child type =  <class 'fparser.two.Fortran2003.Name'>
        child type =  <class 'str'> '-'
        child type =  <class 'fparser.two.Fortran2003.Add_Operand'>
          child type =  <class 'fparser.two.Fortran2003.Real_Literal_Constant'>
            child type =  <class 'str'> '3.14807'
            child type =  <class 'NoneType'>
          child type =  <class 'str'> '*'
          child type =  <class 'fparser.two.Fortran2003.Name'>
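
To make the difference between the two shapes concrete, here is a minimal sketch that mirrors them as plain nested tuples and evaluates each one. It does not use fparser; the `evaluate` helper, the tuple layout and the values assigned to `zw` and `zw6` are all assumptions made purely for illustration.

```python
def evaluate(node, env):
    """Evaluate a nested (left, operator, right) tuple, a variable name or a number."""
    if isinstance(node, tuple):
        left, op, right = node
        lhs, rhs = evaluate(left, env), evaluate(right, env)
        return lhs - rhs if op == "-" else lhs * rhs
    if isinstance(node, str):
        return env[node]  # look up a variable name
    return node  # numeric literal


# Shape reported with the exponent: the subtraction sits inside the
# multiplication, i.e. (zw - 3.14807E-10) * zw6.
with_exponent = (("zw", "-", 3.14807e-10), "*", "zw6")

# Shape reported without the exponent, applied here to the same literal:
# zw - (3.14807E-10 * zw6).
without_exponent = ("zw", "-", (3.14807e-10, "*", "zw6"))

env = {"zw": 2.0, "zw6": 4.0}  # arbitrary illustrative values

print(evaluate(with_exponent, env))     # close to 8.0
print(evaluate(without_exponent, env))  # close to 2.0
```

The two shapes evaluate to different values, which is why the nesting matters to anything that walks the tree and relies on operator precedence.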
arporter commented 2 years ago

I suspect this is related to #288, which was fixed back in January.

arporter commented 2 years ago

"\b([.]\d\d|\d\d[.]\d|\d\d)[ed][+-]?\d*"gm

arporter commented 2 years ago

Better regex is: [^\w](\d*[.])?\d+[ed][+-]?\d+(_\w*)? although this will include the preceding non-'word' char in the match.
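
The behaviour described above can be checked with Python's standard `re` module. The sketch below applies the quoted pattern to the statement from this issue and compares it with a variant that uses a negative lookbehind, `(?<!\w)`, so that the preceding character is tested but not consumed. The lookbehind variant, the `first_match` helper and the use of `re.IGNORECASE` are assumptions for illustration and are not taken from the fparser source.

```python
import re

# Pattern quoted in the comment above. Case-insensitive matching is an
# assumption, so that upper-case 'E' and 'D' exponent letters are accepted.
ORIGINAL = re.compile(r"[^\w](\d*[.])?\d+[ed][+-]?\d+(_\w*)?", re.IGNORECASE)

# Hypothetical variant: a zero-width negative lookbehind replaces the
# consuming [^\w], so the preceding character is not part of the match.
LOOKBEHIND = re.compile(r"(?<!\w)(\d*[.])?\d+[ed][+-]?\d+(_\w*)?", re.IGNORECASE)


def first_match(pattern, text):
    """Return the text of the first match, or None if nothing matches."""
    found = pattern.search(text)
    return found.group(0) if found else None


line = "zw = zw - 3.14807E-10*zw6"
print(repr(first_match(ORIGINAL, line)))    # ' 3.14807E-10' (leading space included)
print(repr(first_match(LOOKBEHIND, line)))  # '3.14807E-10'

# Digits and 'e' inside an identifier are not mistaken for a literal.
print(first_match(LOOKBEHIND, "a6e2 = zw6"))  # None
```

One difference to be aware of: the lookbehind form also matches a literal at the very start of the string, whereas `[^\w]` requires a preceding character to exist.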