stfc / PSyclone

Domain-specific compiler and code transformation system for Finite Difference/Volume/Element Earth-system models in Fortran
BSD 3-Clause "New" or "Revised" License
104 stars 28 forks source link

[PSyIR] Support for Fortran files without modules? #2201

Open arporter opened 1 year ago

arporter commented 1 year ago

In processing Socrates, @LonelyCat124 has been encountering source files that just contain program units (subroutines) without a module. At the minute, PSyclone does not create/populate a SymbolTable for a FileContainer - this is explicitly rejected by the Fortran backend.

We need to consider how we want to support 'bare' program units. It seems sensible that we would populate a SymbolTable associated with the FileContainer but I can't remember why we don't do that. Perhaps @sergisiso, @rupertford and @hiker have some thoughts?

rupertford commented 1 year ago

I'm not sure I understand. A FileContainer can contain multiple program units. In Fortran these could be a program, a module, a subroutine, a function (and some data stuff we don't support). So a subroutine would be a Routine within a FileContainer. If we only have one program unit then it is also valid to not have a FileContainer at all. I'm pretty sure there are lots of examples in our tests that only use subroutines, functions or programs and have no modules.

arporter commented 1 year ago

That's all quite true. However, we don't put Symbols for the various routines into a SymbolTable and therefore when one routine calls another one, it can't find a Symbol for it. (@LonelyCat124 hit this while trying to workaround a statement function by introducing a new FUNCTION into the source file instead.)

rupertford commented 1 year ago

Ah, I see. Yes, it makes sense for Routine and Container symbols to be added to parent Container symbol tables including a FileContainer.

LonelyCat124 commented 1 year ago

A simple standalone fail case again:

def test_fail_case(fortran_reader):
    code = '''
    real function my_func(a1, a2)
    real a1, a2
    my_func = a1 * a2
    end function
    subroutine a(b, c, d)
    real b, c, d

    b = my_func(c, d)

    contains
    end subroutine'''
    psyir = fortran_reader.psyir_from_source(code)
LonelyCat124 commented 5 months ago

This may also be required to fix #2156 fully, but I'm not sure on the source code that needs that change. The subroutine prefixes belong to the routineSymbol and if we don't store that anywhere we lose that information.

LonelyCat124 commented 5 months ago

I tried to see if I could have a quite look at how to add a SymbolTable to a FileContainer but I think this will be more complex than just that.

I think fparser2 relies on there being a module present to create the RoutineSymbol - since _process_routine_symbols is only called from generate_container (which I cannot find a single call to other than tests? Any idea what this is for @arporter?) and from _module_handler, which we only reach when finding a Module.

I guess the bit I can't really find is where the FileContainer is created for non-programs - for _program_handler it clearly creates its own FileContainer, but doesn't process any contained routine symbols - I assume since the base assumption is there will be a module.

One option would be in _subroutine_handler to add:

        if(isinstance(parent, FileContainer)):
            _process_routine_symbols(node, parent.symbol_table, {})

and remove the check from the frontend. This at least handles the case for #2156, but I'm not sure if it will break anything else

Does this seem like a reasonable option or is there something elsewhere that this could break?

Edit: Full diff - can ignore

diff --git a/src/psyclone/psyir/backend/fortran.py b/src/psyclone/psyir/backend/fortran.py
index b1b5056a5..34174a622 100644
--- a/src/psyclone/psyir/backend/fortran.py
+++ b/src/psyclone/psyir/backend/fortran.py
@@ -1053,11 +1053,11 @@ class FortranWriter(LanguageWriter):
             with is_program set to True.

         '''
-        if node.symbol_table.symbols:
-            raise VisitorError(
-                f"In the Fortran backend, a file container should not have "
-                f"any symbols associated with it, but found "
-                f"{len(node.symbol_table.symbols)}.")
+        #if node.symbol_table.symbols:
+        #    raise VisitorError(
+        #        f"In the Fortran backend, a file container should not have "
+        #        f"any symbols associated with it, but found "
+        #        f"{len(node.symbol_table.symbols)}.")

         program_nodes = len([child for child in node.children if
                              isinstance(child, Routine) and child.is_program])
diff --git a/src/psyclone/psyir/frontend/fparser2.py b/src/psyclone/psyir/frontend/fparser2.py
index eb7074546..b2ad24ba8 100644
--- a/src/psyclone/psyir/frontend/fparser2.py
+++ b/src/psyclone/psyir/frontend/fparser2.py
@@ -5281,6 +5281,8 @@ class Fparser2Reader():
                 f"ENTRY statements but found '{entry_stmts[0]}'")

         name = node.children[0].children[1].string
+        if(isinstance(parent, FileContainer)):
+            _process_routine_symbols(node, parent.symbol_table, {})
         routine = Routine(name, parent=parent)
         routine._ast = node
LonelyCat124 commented 3 months ago

2575 fixes this behaviour for Subroutines, but leaves programs and (more importantly) Modules as symbol-less.

This raised a question more generally on how we should handle modules, as I don't think now there is a symbol + interface setup to handle this kind of behaviour as a file:

module my_mod
  integer :: x
end module my_mod

subroutine a
use my_mod, only: x
   x = 1
end subroutine a

PSyclone will likely parse and output this file, however i think my_mod will just be an unresolved module and the type of x will be unknown, despite all the information being available in the FileContainer.

To handle this we would need some sort of symbol for Modules in a FileContainer (as I don't think any of the current ContainerSymbol strategies handles this case) and when importing modules we should first check the subroutine's symboltable (which should recurse upwards to the FileContainer) for the relevant module/container symbol alongside this. It could be needed for inlining or something else like this.

I don't think we've come across an immediate use-case for this, but I expect its something we might run into at some point during NG-ARCH