stfc / PSyclone

Domain-specific compiler and code transformation system for Finite Difference/Volume/Element Earth-system models in Fortran
BSD 3-Clause "New" or "Revised" License
107 stars 28 forks source link

Lowering abstract concepts to language-level PSyIR #851

Open sergisiso opened 4 years ago

sergisiso commented 4 years ago

Back-end visitors are good to encapsulate all the knowledge about a target language (Fortran, C, openCL) in a single location when the PSyIR construct has a relatively direct maping to the language construct and are easy to switch between them to output code in different languages.

However, some PSyclone higher level abstractions (e.g. HaloExchange, Kernels, .. ) often do not have a direct translation in target languages, instead they are an aggregation of basic constructs (a call to a method, some arguments, maybe introducing some symbols). This means that the knowledge about the aggregated basic nodes is common between languages (this is not a problem for the Visitor, it could be done in the base Visitor) and that they may introduce other nodes in-place or symbols in the symbol table (this is problematic for the Visitor because it is expected to be a traversal without modifying structures).

In these cases it would be more appropriate to have two steps. One that transforms in-place each construct that does not have a direct translation to lower (language level) nodes and then the visitor step to output the target language.

This is what PSyclone already did/does with gen_code() which lower to f2pygen and f2pygen which then does a direct translation to Fortran. The difference is that the lowering is to a subset of the IR (not to a different IR) which allows to reason and apply transformations in an unified way and that the new lower IR have multiple target languages not just Fortran.

What is missing is this lower_psyir() method in each PSyIR node that have a base implementation

for child in children:
    child.lower_psyir()

and is re-implemented by the nodes that don't appear in the Visitors (the implementation is very similar to the existing gen_code() but creates PSyIR nodes instead of f2pygen nodes.

Then the base visitor can have this step

def __call__(self, node):
    node.lower_psyir() #Transforms in-place, alternativelly we could do a deep-copy here
    return self._visit(node)

so that any expression like FortranWriter(node) does the two steps.

I think what I said was already discussed and more or less agreed with @rupertford and @arporter I hope this is also what you had in mind. What I changed my opinion a bit is in the communication. At the moment I think the higher-level or language-level IR concepts is something which can be confusing and we should keep it as an internal implementation detail. We should just expose the "PSyIR" concept to the domains as "language level constructs + parallel constructs + other psykal things" and it fits the name. Domains don't need to know about the back-ends, they just need to know that for every Node they extend, they will have to provide a lower_psyir() to translate it to more basic concepts.

arporter commented 4 years ago

Just to make sure I've grasped this correctly:

The 'lowering' happens at code-generation time as part of the application of the visitor? What happens to the resulting lower-level PSyIR after this step? I'm just wondering what happens to our IR of the PSy layer if we generate code (e.g. just to see what it looks like) but then want to continue manipulating the higher-level view...

sergisiso commented 4 years ago

Yes the lowering would happen (at least) during code generation time after the psyclone script with the transformations have been fully executed.

This doesn't mean that it can't also be executed during the script if one knows exactly what s/he is doing:

trans1.apply(node)
trans2.apply(node)
node.lower_psyir()
trans3.apply(node)

If we want to do none destructive code generations we can do

def __call__(self, node):
        newnode = node.deepcopy()
    newnode.lower_psyir()
    return self._visit(newnode)

or use the previous implementation and copy before the visitor: FortranWriter(node.deepcopy())

We can discuss options but note that the deepcopy may be tricky to implement because there are many references other than the children list (e.g. through the symbol table). This is why I am proposing the in-place "destructive" lowering as a simpler implementation at least for now. I did brought up this issue during the cocktail day :)

rupertford commented 4 years ago

At the moment I think the higher-level or language-level IR concepts is something which can be confusing and we should keep it as an internal implementation detail.

So, from a branding point of view you think we should just talk about the PSyIR? I think am happy with that. Certainly from the scientists perspective. However I don't think it is an internal implementation detail from a transformation perspective, i.e. the HPC experts. We need to support the HPC expert in using the different levels (imho).

There is one issue I have with the proposed branding and the current implementation. If we are calling both domain-specific and language level constructs PSyIR then the implementation of that is fragmented in the repo. We have a nice encapsulated structure for the "language level" PSyIR i.e. psyclone/psyir/* and could lift that into its own repo if we wanted. However that is not true for the "domain-specific" parts which I thought we planned to sit in psyclone/domain/<my_domain/*. Would it make sense to push the domain-specific PSyIR stuff (nodes and transformations) into the psyir directory as well?

rupertford commented 4 years ago

Yes the lowering would happen (at least) during code generation time after the psyclone script with the transformations have been fully executed.

This doesn't mean that it can't also be executed during the script if one knows exactly what s/he is doing:

This relates to my previous comment. I don't see how the current transformations approach fits in with treating the PSyIR as a single level by default. We have transformations that only work at the high level and transformations that only work at the lower level. How do we manage this?

Perhaps it would make more sense to have two explicit transformation phases?

rupertford commented 4 years ago

Actually, I don't think what I am saying is at odds with what you have suggested. It would just also need a schedule.lower_psyir() or equivalent if we were giving full control to the HPC expert. However, a controlled implementation for the scripting interface would be to have two script entry points, not just def trans(psy), of which either entry point could be optional.

rupertford commented 4 years ago

Perhaps lower_psyir() might be better called domain_to_core() and we can talk about core PSyIR and domain-specific PSyIR? Or perhaps generic instead of core?

sergisiso commented 4 years ago

There is one issue I have with the proposed branding and the current implementation. If we are calling both domain-specific and language level constructs PSyIR then the implementation of that is fragmented in the repo. We have a nice encapsulated structure for the "language level" PSyIR i.e. psyclone/psyir/ and could lift that into its own repo if we wanted. However that is not true for the "domain-specific" parts which I thought we planned to sit in psyclone/domain/<my_domain/. Would it make sense to push the domain-specific PSyIR stuff (nodes and transformations) into the psyir directory as well?

I was not arguing against the domain-specific separation, maybe we agree but we have a semantic misunderstanding about the higher-lower IR levels, I see we have 3 levels of things:

I agree concepts and transformations in the first level should be encapsulated in psyclone/domain//

I was arguing about not separating semantically the second and third layer even though the second needs the lowering and the third needs the backend visitors, but for the domain and all psyclone/psyir/* functionality there should be no difference between 2nd and 3rd levels.

It would just also need a schedule.lower_psyir() or equivalent if we were giving full control to the HPC expert.

I agree, wouldn't it be enough with the base node implementation that I proposed in the first comment.

Perhaps lower_psyir() might be better called domain_to_core() and we can talk about core PSyIR and domain-specific PSyIR? Or perhaps generic instead of core?

I am happy to change the name but I am not sure about domain_to_core because the second layer which is not domain also needs this method implemented.

rupertford commented 4 years ago

Sounds like we need at least one design telco on this!

I may be agreeing with you but I don't think so.

I have assumed that we would lump together your first and second levels, mainly because they all need the same translation to generic PSyIR. I was thinking of the following conceptual structure ...

psyclone/domain/common/ psyclone/domain//

Going through your level2 examples ...

a halo exchange may have domain-specific halo types so requires a domain-specific node, although I agree the concept is generic. For example DynHaloExchange has x-or-y, cross, and region stencils. I think in gocean we have a separate notation to define haloexchanges e.g. 010101010 for a cross, as we need to capture more complex patterns.

a kernel is part of the PSyKAl separation of concerns which I would argue is a domain-specific concept (albeit a generic one).

global sums are independent of API so I agree that they would naturally sit in your level2 and nowhere else. Although, having said that, how would we know which API to translate to. For example global sums in the LFRic API would be implemented differently to global sums in the gocean API, so we would need an api-specific transformation when mapping from high to low level.

sergisiso commented 4 years ago

Sounds like we need at least one design telco on this!

We do. I am not planning to implement this immediately, I just wanted to start writing it down because we just had multiple informal discussions about it.