Encapsulate fparser dependencies inside the Fortran frontend

sergisiso commented 3 years ago

The following (or similar) snippet is repeated many times in the code and tests:

processor = Fparser2Reader()
reader = FortranStringReader(code)
parser = ParserFactory().create(std="f2003")
parse_tree = parser(reader)
subroutine = processor.generate_psyir(parse_tree)

I was wondering if it was worth it putting it inside the fparser2 frontend and in fact call it FortranReader. Then the entry point would provide str->PSyIR. The advantages would be:

Avoid replication, shorter code
Don't spread fparser dependencies all over the code and have them encapsulated in the frontend. (this is not yet completely possible because fparser is used for other things f2pygen, metadata parsing... But is an step in this direction)
The frontend user shouldn't care about the details of how the reader is implemented so FortranReader seems a better name, and is consistent with the backend names.

What I am unsure about this change is that we also have a lot of incomplete code parsing, where instead of using the ParserFactory we use directly an fparser node match (e.g. Execution_Part.match(reader) or Fortran2003.Call_Stmt(reader)). @rupertford @arporter is there a more generic way to do this "partial code" parsing? I guess no because what it is depends on its context?

Still we could come up with an interface that only the matcher has to be provided as a parameter so less fparser dependencies are leaked from the frontend to the code?

hiker commented 3 years ago

That would be great indeed! Note that we already have .../tests/utilities.py, which has functions like get_ast, create_schedule, get_invoke. They are just by far not used everywhere where they could be used.

Could we perhaps just use 'full Fortran' subroutines and parse them instead of just checking the sub-expressions? E.g. atm we have:

    reader = FortranStringReader("x=1")
    fparser2assignment = Execution_Part.match(reader)[0][0]

    fake_parent = Schedule()
    processor = Fparser2Reader()
    processor.process_nodes(fake_parent, [fparser2assignment])
    # Check a new node was generated and connected to parent
    assert len(fake_parent.children) == 1
    new_node = fake_parent.children[0]
    assert isinstance(new_node, Assignment)
    assert len(new_node.children) == 2

This could become something like this:

    reader = FortranStringReader('''program test_prog
                                 real :: x
                                 x = 1''')
    ast = parser(reader)
    psy = PSyFactory(API).create(ast)
    schedule = psy.invokes.get("test_prog").schedule

    # Simple scalar assignment:  a = b
    scalar_assignment = schedule.children[0]
    assert isinstance(scalar_assignment, Assignment)

And all that code would be much shorter with your proposed encapsulation.Admittedly, if we break something, it would be more work to find out the details (though you could assume that the list of tests that fail would give us a good indication).

rupertford commented 3 years ago

I like that suggestion a lot. I've had a quick grep through the code and it looks to me like the only real place where we use parts of the tree, rather than the whole tree, are in the tests, so we could possibly clean up the main PSyclone code and address the tests separately.

sergisiso commented 3 years ago

Creating the fparser object ParserFactory().create(std="f2003") is a expensive operation. Once one is created it will be good to keep it alive (in a class attribute?) and reuse by the multiple FortranReader() that need the same 'std'.

Note this is already done in the tests by the session-scoped f2008_parser fixture.

stfc / PSyclone

Encapsulate fparser dependencies inside the Fortran frontend #1188