stfc / PSyclone

Domain-specific compiler and code transformation system for Finite Difference/Volume/Element Earth-system models in Fortran
BSD 3-Clause "New" or "Revised" License
103 stars 25 forks source link

Encapsulate fparser dependencies inside the Fortran frontend #1188

Open sergisiso opened 3 years ago

sergisiso commented 3 years ago

The following (or similar) snippet is repeated many times in the code and tests:

processor = Fparser2Reader()
reader = FortranStringReader(code)
parser = ParserFactory().create(std="f2003")
parse_tree = parser(reader)
subroutine = processor.generate_psyir(parse_tree)

I was wondering if it was worth it putting it inside the fparser2 frontend and in fact call it FortranReader. Then the entry point would provide str->PSyIR. The advantages would be:

What I am unsure about this change is that we also have a lot of incomplete code parsing, where instead of using the ParserFactory we use directly an fparser node match (e.g. Execution_Part.match(reader) or Fortran2003.Call_Stmt(reader)). @rupertford @arporter is there a more generic way to do this "partial code" parsing? I guess no because what it is depends on its context?

Still we could come up with an interface that only the matcher has to be provided as a parameter so less fparser dependencies are leaked from the frontend to the code?

hiker commented 3 years ago

That would be great indeed! Note that we already have .../tests/utilities.py, which has functions like get_ast, create_schedule, get_invoke. They are just by far not used everywhere where they could be used.

Could we perhaps just use 'full Fortran' subroutines and parse them instead of just checking the sub-expressions? E.g. atm we have:

    reader = FortranStringReader("x=1")
    fparser2assignment = Execution_Part.match(reader)[0][0]

    fake_parent = Schedule()
    processor = Fparser2Reader()
    processor.process_nodes(fake_parent, [fparser2assignment])
    # Check a new node was generated and connected to parent
    assert len(fake_parent.children) == 1
    new_node = fake_parent.children[0]
    assert isinstance(new_node, Assignment)
    assert len(new_node.children) == 2

This could become something like this:

    reader = FortranStringReader('''program test_prog
                                 real :: x
                                 x = 1''')
    ast = parser(reader)
    psy = PSyFactory(API).create(ast)
    schedule = psy.invokes.get("test_prog").schedule

    # Simple scalar assignment:  a = b
    scalar_assignment = schedule.children[0]
    assert isinstance(scalar_assignment, Assignment)

And all that code would be much shorter with your proposed encapsulation.Admittedly, if we break something, it would be more work to find out the details (though you could assume that the list of tests that fail would give us a good indication).

rupertford commented 3 years ago

I like that suggestion a lot. I've had a quick grep through the code and it looks to me like the only real place where we use parts of the tree, rather than the whole tree, are in the tests, so we could possibly clean up the main PSyclone code and address the tests separately.

sergisiso commented 3 years ago

Creating the fparser object ParserFactory().create(std="f2003") is a expensive operation. Once one is created it will be good to keep it alive (in a class attribute?) and reuse by the multiple FortranReader() that need the same 'std'.

Note this is already done in the tests by the session-scoped f2008_parser fixture.