stfc / fparser

This project maintains and develops a Fortran parser called fparser2 written purely in Python which supports Fortran 2003 and some Fortran 2008. A legacy parser fparser1 is also available but is not supported. The parsers were originally part of the f2py project by Pearu Peterson.
https://fparser.readthedocs.io

Parsing of comment-only files #375

Closed by reuterbal 1 year ago

reuterbal commented 1 year ago

Hi,

fparser2 is currently unable to parse comment-only files. I appreciate that it's debatable whether it should. However, I have one admittedly very specific use case where I may end up splitting source files into pieces before parsing them individually (each piece being either a valid and complete Fortran program unit or only comments). It's easy enough to work around this, of course, but I was wondering whether you consider this something that should work in general?

from pathlib import Path

from fparser.common.readfortran import FortranFileReader
from fparser.two.parser import ParserFactory

# A source file that contains nothing but comments
fcode = """
! Some file
! that has only comments
! for whatever reason
""".strip()
Path('/tmp/myfile.F90').write_text(fcode)

f2008_parser = ParserFactory().create(std="f2008")

# Keep comments in the parse tree instead of discarding them
reader = FortranFileReader("/tmp/myfile.F90", ignore_comments=False)
parse_tree = f2008_parser(reader)  # raises FortranSyntaxError, see below
print(parse_tree)
---------------------------------------------------------------------------
NoMatchError                              Traceback (most recent call last)
File ~/loki/nabr-lightweight-sourcefiles/loki_env/lib/python3.8/site-packages/fparser/two/Fortran2003.py:237, in Program.__new__(cls, string)
    236 try:
--> 237     return Base.__new__(cls, string)
    238 except NoMatchError:
    239     # At the moment there is no useful information provided by
    240     # NoMatchError so we pass on an empty string.

File ~/loki/nabr-lightweight-sourcefiles/loki_env/lib/python3.8/site-packages/fparser/two/utils.py:428, in Base.__new__(cls, string, parent_cls)
    427     errmsg = f"{cls.__name__}: '{string}'"
--> 428 raise NoMatchError(errmsg)

NoMatchError: at line 3
>>>! for whatever reason

During handling of the above exception, another exception occurred:

FortranSyntaxError                        Traceback (most recent call last)
Untitled-1.ipynb Cell 3 in <cell line: 2>()
      1 reader = FortranFileReader("/tmp/myfile.F90", ignore_comments=False)
----> 2 parse_tree = f2008_parser(reader)
      3 print(parse_tree)

File ~/loki/nabr-lightweight-sourcefiles/loki_env/lib/python3.8/site-packages/fparser/two/Fortran2003.py:241, in Program.__new__(cls, string)
    237     return Base.__new__(cls, string)
    238 except NoMatchError:
    239     # At the moment there is no useful information provided by
    240     # NoMatchError so we pass on an empty string.
--> 241     raise FortranSyntaxError(string, "")
    242 except InternalSyntaxError as excinfo:
    243     # InternalSyntaxError is used when a syntax error has been
    244     # found in a rule that does not have access to the reader
    245     # object. This is then re-raised here as a
    246     # FortranSyntaxError, adding the reader object (which
    247     # provides line number information).
    248     raise FortranSyntaxError(string, excinfo)

FortranSyntaxError: at line 3
>>>! for whatever reason

Just to demonstrate that this parses fine once I add some Fortran afterwards:

fcode += """
subroutine main
end
""".rstrip()
Path('/tmp/myotherfile.F90').write_text(fcode)
reader = FortranFileReader("/tmp/myotherfile.F90", ignore_comments=False)
parse_tree = f2008_parser(reader)
print(parse_tree)

! Some file
! that has only comments
! for whatever reason
SUBROUTINE main
END

FWIW: Out of curiosity I had a look at what compilers do with this. gfortran, ifort, and ifx accept the file without problems; nvfortran is the only one that at least prints a warning stating that the file is empty. All of them generate .o files, though.
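For illustration, here is a minimal sketch of the kind of workaround hinted at above: check whether a piece of source holds only comments before handing it to the parser. The helper name `is_comment_only` is hypothetical and not part of fparser. It assumes free-form source, so fixed-form comment markers (`C` or `*` in column 1) are not recognised, and preprocessor lines starting with `#` count as code.

```python
def is_comment_only(source: str) -> bool:
    """Return True if the source holds only blank lines and '!' comments.

    Hypothetical helper, not part of fparser. Assumes free-form Fortran.
    """
    for line in source.splitlines():
        stripped = line.strip()
        # Any non-blank line that does not start with '!' is treated as code.
        if stripped and not stripped.startswith("!"):
            return False
    return True


comment_only = "! Some file\n! that has only comments\n! for whatever reason"
with_code = comment_only + "\nsubroutine main\nend"

print(is_comment_only(comment_only))
print(is_comment_only(with_code))
```

A caller could skip the parser for any piece where this returns True and keep the raw comment text instead. Note that an empty string also counts as comment-only here, and that directive-style comments such as `!$omp` are treated as ordinary comments.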

rupertford commented 1 year ago

I think it is fair to say that we should be able to parse this successfully.

rupertford commented 1 year ago

Created branch https://github.com/stfc/fparser/tree/375_comment_only_files

reuterbal commented 1 year ago

Fantastic, thanks a lot!