SymbolTable errors when parsing snippets with the ParserFactory().create() parser multiple times

stfc / fparser

This project maintains and develops a Fortran parser called fparser2 written purely in Python which supports Fortran 2003 and some Fortran 2008. A legacy parser fparser1 is also available but is not supported. The parsers were originally part of the f2py project by Pearu Peterson.

https://fparser.readthedocs.io


SymbolTable errors when parsing snippets with the ParserFactory().create() parser multiple times #339

Closed sergisiso closed 2 years ago

sergisiso commented 2 years ago

When updating fparser to master in PSyclone I get many fparser.two.symbol_table.SymbolTableError errors.

Looking into it, parsing any single snippet once never fails. For example, this works:

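from psyclone.psyir.frontend.fortran import FortranReader

# Generate PSyIR from Fortran code via the fparser2 AST
fortran_reader = FortranReader()  # assumed setup; the same reader is reused
code = (
    "program test\n"
    "  real :: a\n"
    "  a = 0.0\n"
    "end program test")
psyir = fortran_reader.psyir_from_source(code)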

But after parsing the same snippet again, or another snippet containing a variable named a, it fails. For example, a second psyir = fortran_reader.psyir_from_source(code) fails with: fparser.two.symbol_table.SymbolTableError: Symbol table already contains a symbol for a variable with name 'a'

The previous test uses internally:

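from fparser.common.readfortran import FortranStringReader
from fparser.two.parser import ParserFactory
# parser is initialized only once and then reused for every snippet
parser = ParserFactory().create(std="f2008")
string_reader = FortranStringReader(source_code)
parser(string_reader)  # the parser takes the reader, not the raw string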

I assume the symbol table is not cleaned properly each time the parser is called, so symbols accumulate across calls.

arporter commented 2 years ago

fparser tests have a clear_symbol_tables_fixture which is automatically run for every test and wipes any existing state. I think all that is required is to add such functionality to the parser-related fixtures in the PSyclone test suite. I'll open an issue to do this.

sergisiso commented 2 years ago

But is this the expected usage? In non-test code, do we have to clean up the symbol table ourselves, e.g. when parsing a module to get the datatypes of imported symbols, or when parsing a kernel?

We could clean up in PSyIR psyir_from_source or let the ParserFactory() return an object that first cleans up and then calls the Fortran2003.Program.

Sometimes we directly call a node class from PSyclone to parse a single expression, e.g. Fortran2003.Execution_Part(string_reader). This should or shouldn't be cleaned up depending on the context. What happens with those?

arporter commented 2 years ago

What should happen is that there is a symbol table created for each top-level program unit/module. So, when parsing a whole code base everything is fine because there can be no duplication. You'll get problems though if we repeatedly parse some code that contains the same program unit name (e.g. 'test') as I'm sure we do in lots of places in the test suite.
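The behaviour described above can be illustrated with a small toy model. This is only a sketch and not fparser's implementation: ToySymbolTables, TABLES and toy_parse are hypothetical names invented here, and the only assumption carried over from the thread is that there is one table per top-level program unit, held in state that outlives a single parse.

```python
class SymbolTableError(Exception):
    """Raised when a name is declared twice in the same scope."""


class ToySymbolTables:
    """Hypothetical stand-in for a process-wide symbol-table registry."""

    def __init__(self):
        self._tables = {}  # program-unit name -> set of declared names

    def declare(self, unit, name):
        # The table for a unit is created on first use and reused afterwards.
        table = self._tables.setdefault(unit.lower(), set())
        if name.lower() in table:
            raise SymbolTableError(
                "Symbol table already contains a symbol for a variable "
                f"with name '{name}'")
        table.add(name.lower())

    def clear(self):
        """Drop every table, as a clean-up step between parses would."""
        self._tables.clear()


TABLES = ToySymbolTables()  # module-level state shared by every parse


def toy_parse(unit, declared_names):
    """Pretend to parse a program unit that declares the given names."""
    for name in declared_names:
        TABLES.declare(unit, name)


toy_parse("test", ["a"])        # first parse succeeds
try:
    toy_parse("test", ["a"])    # same unit name: the stale table is reused
    second_parse_failed = False
except SymbolTableError as err:
    second_parse_failed = True
    message = str(err)

toy_parse("other", ["a"])       # a different unit name gets a fresh table
```

In this model the second parse of 'test' fails only because the registry was never cleared; calling TABLES.clear() between parses would make it succeed.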

arporter commented 2 years ago

Oh, and if it's a single expression then there's no symbol table and things should be fine.

arporter commented 2 years ago

We could clean up in PSyIR psyir_from_source or let the ParserFactory() return an object that first cleans up and then calls the Fortran2003.Program.

I think we need to do both: it makes sense that ParserFactory().create() should clean up the global state. The FortranReader constructor only ever calls this routine once and then saves the resulting parser. However, we could extend it so that it does clean up the state of the existing parser.
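The idea of returning an object that cleans up first and then parses could look roughly like the sketch below. It is a minimal illustration under stated assumptions and not the fix adopted in fparser or PSyclone: CleaningParser, toy_program and _seen are hypothetical names, with toy_program standing in for Fortran2003.Program and _seen for the shared symbol-table state.

```python
class CleaningParser:
    """Hypothetical wrapper: reset shared state, then delegate to the parser."""

    def __init__(self, parse, reset):
        self._parse = parse  # stand-in for Fortran2003.Program
        self._reset = reset  # stand-in for whatever clears the symbol tables

    def __call__(self, reader):
        self._reset()        # wipe anything left over from an earlier parse
        return self._parse(reader)


_seen = set()  # toy global state playing the role of the symbol tables


def toy_program(source):
    """Stand-in parser that rejects any name it has already seen."""
    names = source.split()
    for name in names:
        if name in _seen:
            raise ValueError(f"duplicate symbol '{name}'")
        _seen.add(name)
    return sorted(names)


parser = CleaningParser(toy_program, _seen.clear)
first = parser("a b")
second = parser("a b")  # succeeds because the state was reset first
```

One consequence of this design is that every call through the wrapper discards earlier state, so it suits whole-program parses; a caller that parses a fragment within an existing context would need a way to skip the reset.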