Prototype new API for NEMO

arporter commented 7 years ago

Discussions with NEMO developers have highlighted the fact that they really do not want to change their source code. In this issue we will investigate the feasibility (or otherwise) of adapting PSyclone to process 'raw' Fortran conforming to the NEMO coding standards.

arporter commented 7 years ago

I've created branch nemo_trial_api for this work.

arporter commented 7 years ago

Without the PSyKAl separation of concerns we have no PSy layer to generate. However, this also means that we can't apply transformations to the PSy layer and that is currently central to what PSyclone does. We will need to generate a Schedule for the single layer that we now have.

arporter commented 7 years ago

In PSyKAl, it is the PSy layer that contains loops over the mesh and all code related to parallelism. Therefore, it may be better to think of the NEMO code as a hand-written PSy layer from which we will construct a representation for manipulation by PSyclone.

arporter commented 7 years ago

I've added the tra_adv.F90 kernel from github.com/arporter/NEMO-DSL under examples/nemo/eg1. My parse_nemo() routine identifies 'kernels' (2D or 3D loop nests) and replaces the associated objects in the AST with a new NEMOKernel object. This incorporates both the loop structure and the Fortran statements contained within the loop.
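As a rough illustration of the idea (this is not the actual parse_nemo() code), the sketch below finds perfectly nested 2D or 3D loop nests in a toy tree and swaps each one for a single kernel object. The `Stmt`, `Do` and `Kernel` classes and both functions are hypothetical stand-ins invented for this example; they are not fparser2 or PSyclone classes.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Stmt:
    """A non-loop Fortran statement, kept as plain text (toy class)."""
    text: str


@dataclass
class Do:
    """A DO loop: its loop variable and the nodes in its body (toy class)."""
    var: str
    body: List[Union["Do", Stmt]] = field(default_factory=list)


@dataclass
class Kernel:
    """Stand-in for a kernel object: loop variables plus innermost body."""
    loop_vars: List[str]
    body: List[Stmt]


def match_kernel(node):
    """Return a Kernel if `node` is a perfect 2D or 3D loop nest, else None."""
    if not isinstance(node, Do):
        return None
    loop_vars = [node.var]
    # Descend while the current loop contains exactly one child, itself a loop.
    while len(node.body) == 1 and isinstance(node.body[0], Do):
        node = node.body[0]
        loop_vars.append(node.var)
    # The innermost body must hold plain statements only.
    if any(isinstance(child, Do) for child in node.body):
        return None
    if len(loop_vars) not in (2, 3):
        return None
    return Kernel(loop_vars, list(node.body))


def replace_kernels(nodes):
    """Return a new node list with every matching loop nest replaced."""
    result = []
    for node in nodes:
        kern = match_kernel(node)
        if kern is not None:
            result.append(kern)
        elif isinstance(node, Do):
            # Not a kernel itself, so look for kernels inside it.
            result.append(Do(node.var, replace_kernels(node.body)))
        else:
            result.append(node)
    return result
```

Loops that do not form a perfect nest (for example a loop over tracers containing several nests and some scalar assignments) are kept as loops and searched recursively.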

arporter commented 7 years ago

Worked on NEMOKernel so that it now stores the loop bounds and the AST objects associated with the body of the innermost loop. Improved the tofortran method so that it now outputs correct Fortran for both the DO loops and the loop body.

arporter commented 7 years ago

At this point I can take the AST produced by fparser2 and replace the loop nests with my own kernel objects. However, I don't yet do any processing of the contents of the loop nest.

In order to produce correct OpenMP (for instance), I need to know the 'intents' of the various variables used within the loop.

arporter commented 7 years ago

Code now uses Habakkuk to identify those variables which are shared and those which are private (in an OpenMP context).

To progress this any further we need to actually generate additional Fortran in the form of OpenMP directives around each kernel.
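To make the shared/private distinction and the directive generation concrete, here is a minimal sketch under a deliberately crude assumption: loop indices and scalars assigned inside the loop are private, and every other name is shared. This is not how Habakkuk works and it does not use Habakkuk's API; `classify` and `add_omp_parallel_do` are hypothetical helpers, and intrinsic function names would wrongly be reported as shared.

```python
import re

# A Fortran name that is not part of a numeric literal (0.5e0, 1.0_wp)
# or a dotted operator (.and.).
NAME = re.compile(r"(?<![\w.])[A-Za-z_]\w*")


def classify(loop_vars, statements):
    """Split the names used in simple assignments into (shared, private).

    Assumption: each statement is a plain 'lhs = rhs' assignment.
    """
    private = {var.lower() for var in loop_vars}
    seen = set()
    for stmt in statements:
        lhs, rhs = stmt.lower().split("=", 1)
        lhs_names = NAME.findall(lhs)
        seen.update(lhs_names)
        seen.update(NAME.findall(rhs))
        if "(" not in lhs:
            # A scalar written inside the loop is treated as a temporary.
            private.add(lhs_names[0])
    return sorted(seen - private), sorted(private)


def add_omp_parallel_do(loop_lines, private):
    """Wrap the Fortran lines of one loop nest in an OpenMP parallel do."""
    clause = ", ".join(sorted(private))
    directive = "!$omp parallel do default(shared), private(" + clause + ")"
    return [directive] + list(loop_lines) + ["!$omp end parallel do"]
```

A real implementation would also need to recognise reductions and scalars that are read before they are written, neither of which this sketch attempts.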

We also need to add support for the use of array syntax which is very common in NEMO. e.g.:

zwx(:,:,1) = 0.0e0

or,

DO jk = 2, jpk-1
   zwx(:,:,jk) = tmask(:,:,jk) * ( mydomain(:,:,jk-1) - mydomain(:,:,jk) )
END DO

This could be done as a first pass over the code/AST, with any such examples replaced by explicit DO loops.
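A minimal, text-based sketch of such a first pass follows. It assumes that every ':' denotes the full range of its dimension, that the first three dimensions map to the loop variables ji, jj, jk with upper bounds jpi, jpj, jpk, and that all array sections in the statement are conformable. The `DIMS` table and `expand_array_syntax` are hypothetical; a real pass would work on the AST rather than on strings, and explicit ranges such as 2:jpk are not handled here.

```python
import re

# Assumed mapping from array dimension to (loop variable, upper bound).
DIMS = [("ji", "jpi"), ("jj", "jpj"), ("jk", "jpk")]

# An innermost parenthesised list, e.g. the "(:,:,jk-1)" of an array reference.
ARGS = re.compile(r"\(([^()]*)\)")


def expand_array_syntax(stmt):
    """Rewrite one array-syntax assignment as explicit DO loops (as lines)."""
    used = set()

    def replace_colons(match):
        items = [item.strip() for item in match.group(1).split(",")]
        if ":" not in items:
            return match.group(0)  # leave ordinary parentheses untouched
        for dim, item in enumerate(items):
            if item == ":":
                if dim >= len(DIMS):
                    raise ValueError("only the first three dimensions are mapped")
                items[dim] = DIMS[dim][0]
                used.add(dim)
        return "(" + ",".join(items) + ")"

    body = ARGS.sub(replace_colons, stmt.strip())
    lines = []
    depth = 0
    # Highest dimension outermost, so the first index varies fastest.
    for dim in sorted(used, reverse=True):
        var, upper = DIMS[dim]
        lines.append("   " * depth + "DO " + var + " = 1, " + upper)
        depth += 1
    lines.append("   " * depth + body)
    while depth > 0:
        depth -= 1
        lines.append("   " * depth + "END DO")
    return lines
```

For `zwx(:,:,1) = 0.0e0` this yields a jj loop enclosing a ji loop around `zwx(ji,jj,1) = 0.0e0`. Applied to the body of the jk loop above, it produces the two inner loops that turn the construct into a full 3D nest.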

arporter commented 7 years ago

In order to transform code in the way that PSyclone does, we need a higher-level AST. In particular, the one produced by fparser2 has no information on child->parent relationships (i.e. it's not possible to go back up the tree, only down). Ideally we need some structure that is consistent with the existing PSyclone Transformations support.
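One way to picture the missing piece is a thin wrapper tree that records a parent for every node as it is built. The sketch below uses nested `(label, children)` tuples as a stand-in for a downward-only parse tree; the `Node` class and `build` function are hypothetical and do not reflect the fparser2 or PSyclone APIs.

```python
class Node:
    """A tree node that knows both its children and its parent."""

    def __init__(self, payload, parent=None):
        self.payload = payload
        self.parent = parent
        self.children = []

    def ancestors(self):
        """Yield the enclosing nodes, nearest first, up to the root."""
        node = self.parent
        while node is not None:
            yield node
            node = node.parent


def build(raw, parent=None):
    """Wrap a downward-only tree of (label, [children]) tuples in Nodes."""
    label, raw_children = raw
    node = Node(label, parent)
    node.children = [build(child, node) for child in raw_children]
    return node
```

With parent links in place, a transformation that is handed a kernel can walk upwards to find, say, the loop that encloses it.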

rupertford commented 7 years ago

I separated PSyclone and fparser1 from each other using fgenerator. The main reason for this was that PSyclone primarily needed to generate code and fparser1 did not support this. However, fgenerator also includes rudimentary support for code modification (as that is all we needed), and I have always thought of it as both a code generator and a code modifier.

I think the fgenerator approach is quite a powerful solution and always thought that we would also end up using fparser2 via fgenerator for both code generation and code manipulation.

Essentially fgenerator provides (or at least should provide) the higher-level AST you are talking about. It also isolates you from a particular implementation. Having said that, I think there would be a reasonable amount of design and implementation work to get a single fgenerator API working for both existing and new code (at least I'm pretty sure that it is difficult for fparser1, but perhaps it would be easier with fparser2).

So, in short, I am suggesting we use your requirements in this issue to inform how we might extend fparser to support what you need.

arporter commented 7 years ago

Code now produces a (crude) high-level view of the tracer kernel:

NEMOSchedule[]
    CodeBlock[<class 'fparser.Fortran2003.Call_Stmt'>]
    NEMOKern3D[]
    CodeBlock[<class 'fparser.Fortran2003.Assignment_Stmt'>]
    NEMOKern2D[]
    Loop[type='levels',field_space='None',it_space='None']
        CodeBlock[<class 'fparser.Fortran2003.Assignment_Stmt'>]
    CodeBlock[<class 'fparser.Fortran2003.Call_Stmt'>]
    Loop[type='tracers',field_space='None',it_space='None']
        NEMOKern3D[]
        CodeBlock[<class 'fparser.Fortran2003.Assignment_Stmt'>]
        NEMOKern3D[]
        CodeBlock[<class 'fparser.Fortran2003.Assignment_Stmt'>]
        NEMOKern3D[]
        NEMOKern3D[]
        NEMOKern3D[]
        NEMOKern3D[]
        CodeBlock[<class 'fparser.Fortran2003.Assignment_Stmt'>]
        Loop[type='levels',field_space='None',it_space='None']
            CodeBlock[<class 'fparser.Fortran2003.Assignment_Stmt'>]
        CodeBlock[<class 'fparser.Fortran2003.Assignment_Stmt'>]
        NEMOKern3D[]
        NEMOKern3D[]
        CodeBlock[<class 'fparser.Fortran2003.Assignment_Stmt'>]
        NEMOKern3D[]
        CodeBlock[<class 'fparser.Fortran2003.Assignment_Stmt'>]
        NEMOKern3D[]
    CodeBlock[<class 'fparser.Fortran2003.Call_Stmt'>]
    Loop[type='levels',field_space='None',it_space='None']
        Loop[type='lat',field_space='None',it_space='None']
            Loop[type='lon',field_space='None',it_space='None']
                CodeBlock[<class 'fparser.Fortran2003.Write_Stmt'>]
    CodeBlock[<class 'fparser.Fortran2003.Close_Stmt'>]
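For readers unfamiliar with this kind of dump, an indented view like the one above can be produced by a simple recursive walk. The sketch below is only an illustration: the `Node` class and its `view` method are hypothetical and are not the PSyclone implementation.

```python
class Node:
    """Toy schedule node: a label plus an ordered list of children."""

    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []

    def view(self, indent=0):
        """Return the subtree as a list of lines, four spaces per level."""
        lines = ["    " * indent + self.label]
        for child in self.children:
            lines.extend(child.view(indent + 1))
        return lines


schedule = Node("NEMOSchedule[]", [
    Node("NEMOKern3D[]"),
    Node("Loop[type='levels']", [Node("CodeBlock[Assignment_Stmt]")]),
])
print("\n".join(schedule.view()))
```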

arporter commented 7 years ago

Began work on generating Kernels from implicit loops (that use Fortran array syntax). This now gives:

NEMOSchedule[]
    CodeBlock[<class 'fparser.Fortran2003.Call_Stmt'>]
    NEMOKern[3D]
    CodeBlock[<class 'fparser.Fortran2003.Call_Stmt'>]
    NEMOKern[2D]
    CodeBlock[<class 'fparser.Fortran2003.Call_Stmt'>]
    Loop[type='levels',field_space='None',it_space='None']
        CodeBlock[<class 'fparser.Fortran2003.Assignment_Stmt'>]
    CodeBlock[<class 'fparser.Fortran2003.Call_Stmt'>]
    Loop[type='tracers',field_space='None',it_space='None']
        NEMOKern[3D]
        NEMOKern[Implicit]
        NEMOKern[Implicit]
        NEMOKern[3D]
        NEMOKern[Implicit]
        NEMOKern[Implicit]
        NEMOKern[3D]
        NEMOKern[3D]
        NEMOKern[3D]
        NEMOKern[3D]
        NEMOKern[Implicit]
        NEMOKern[Implicit]
        Loop[type='levels',field_space='None',it_space='None']
            NEMOKern[Implicit]
        NEMOKern[Implicit]
        NEMOKern[3D]
        NEMOKern[3D]
        NEMOKern[Implicit]
        CodeBlock[<class 'fparser.Fortran2003.Assignment_Stmt'>]
        NEMOKern[3D]
        CodeBlock[<class 'fparser.Fortran2003.Assignment_Stmt'>]
        NEMOKern[3D]
        CodeBlock[<class 'fparser.Fortran2003.Assignment_Stmt'>]
    CodeBlock[<class 'fparser.Fortran2003.Call_Stmt'>]
    Loop[type='levels',field_space='None',it_space='None']
        Loop[type='lat',field_space='None',it_space='None']
            Loop[type='lon',field_space='None',it_space='None']
                CodeBlock[<class 'fparser.Fortran2003.Write_Stmt'>]
    CodeBlock[<class 'fparser.Fortran2003.Call_Stmt'>]

This reveals one issue: we have an implicit i-j loop within an explicit loop over levels. This currently results in us identifying a kernel within the loop over levels, when the loop and its implicit body should really be treated as a single 3D kernel.
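One possible fix is a post-processing pass that promotes such a construct to a 3D kernel. The sketch below works on a toy tree and assumes that an 'Implicit' kernel always covers the two horizontal dimensions and that it is the only child of the levels loop. The `Node` class and `promote_implicit` are hypothetical, and levels loops with more than one child are left alone because fusing them would need a dependence check.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Toy schedule node, e.g. Node('Loop', 'levels') or Node('NEMOKern', '3D')."""
    kind: str
    tag: str = ""
    children: List["Node"] = field(default_factory=list)


def promote_implicit(node):
    """Return a copy of the tree in which a levels loop whose only child
    is an implicit (assumed 2D) kernel becomes a single 3D kernel."""
    children = [promote_implicit(child) for child in node.children]
    if (node.kind == "Loop" and node.tag == "levels"
            and len(children) == 1
            and children[0].kind == "NEMOKern"
            and children[0].tag == "Implicit"):
        return Node("NEMOKern", "3D")
    return Node(node.kind, node.tag, children)
```

The pass builds a new tree rather than modifying the input, so the original schedule remains available for comparison.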