stfc / PSyclone

Domain-specific compiler and code transformation system for Finite Difference/Volume/Element Earth-system models in Fortran
BSD 3-Clause "New" or "Revised" License
107 stars 28 forks source link

PSyclone backend reaches recursion limit visiting a 500 depth nested if-else block. #2691

Closed sergisiso closed 1 month ago

sergisiso commented 2 months ago

This comes from a cannonicalised SELECT CASE with 500 cases in UKCA ukca_aero_ctl.F90 . The frontend, and exploring the tree works fine, but the backend (even when disabling the copy, is failing with RecursionError: maximum recursion depth exceeded )

sergisiso commented 2 months ago

@MarkUoLeeds A simple solution for this is to increase the python limits in the psyclone script:

import sys

def trans(psyir):
    if psyir.name == "ukca_aero_ctl.F90":
        sys.setrecursionlimit(2000)

   # Rest of the script

Another solution is to simplify the SELECT CASE statement in the original code, I noticed that most of the case bodies are the same, I think the same construct can be written as: case(expr1, expr2, expr3) body instead of:

case(expr1) body
case(expr2) body
case(expr3) body
...

which won't created the deeply nested ifs.

MarkUoLeeds commented 2 months ago

Hello Sergi I am strongly considering revising ukca_aero_ctl.F90 as it is the main entry to the aerosol function. it is where I transformed the whole atmosphere on one MPI task into "chunks" that are then parallelised by OpenMP. I'd rather by-pass it for LFRic_atm as there will be only one chunk for the GPU to gobble. My approach in 2014 made sense for CPUs with low numbers of threads (<64) as the number of chunks is rarely greater than 4 (depending on MPI decomposition. It can be forced to one column per chunk so would result in (number of longitude columns)x(number of latitude columns).

Throughout this subroutine there is a lot of (now for lfric) superfluous re-ordering of elements in the arrays from planes parallel to the surface into columns of vertical atmosphere.

I do not agree that the body of the case is always the same. Maybe we can chat on a video call about that.

Thank you for investigating this aspect of PSyclone. I am sure UKCA glomap will continue to provide you with challenging legacy code.

Mark

Dr. Mark Richardson Technical Head of CEMAC ( www cemac ac uk ) Room 9.145b Earth and Environment Building School of Earth and Environment University of Leeds LS2 9JT


From: Sergi Siso @.> Sent: 20 August 2024 15:01 To: stfc/PSyclone @.> Cc: Mark Richardson @.>; Mention @.> Subject: Re: [stfc/PSyclone] PSyclone backend reaches recursion limit visiting a 500 depth nested if-else block. (Issue #2691)

CAUTION: External Message. Use caution opening links and attachments.

@MarkUoLeedshttps://github.com/MarkUoLeeds A simple solution for this is to increase the python limits in the psyclone script:

import sys

def trans(psyir): if psyir.name == "ukca_aero_ctl.F90": sys.setrecursionlimit(2000)

Rest of the script

Another solution is to simplify the SELECT CASE statement in the original code, I noticed that most of the case bodies are the same, I think the same construct can be written as: case(expr1, expr2, expr3) body instead of:

case(expr1) body case(expr2) body case(expr3) body ...

which won't created the deeply nested ifs.

— Reply to this email directly, view it on GitHubhttps://github.com/stfc/PSyclone/issues/2691#issuecomment-2298938221, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AGHT4VSGR4ZCKPLCZ6XXXZTZSNDZ7AVCNFSM6AAAAABMZWTVOCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEOJYHEZTQMRSGE. You are receiving this because you were mentioned.Message ID: @.***>

sergisiso commented 2 months ago

@MarkUoLeeds You are probably right about the SELECT CASE structure, I only gave a quick look at the code to see where psyclone was having trouble, and regardless we aim to make psyclone agnostic to the chosen input syntax. I am happy to have a video call if it's helpful (does this week work for you? I am on annual leave next week).

A few more notes regarding the possible fix for this issue: