Open arporter opened 3 years ago
Hi @deardenchris, I've made the change to examples/nemo/scripts/kernels_trans.py on the 1227_assigns_in_kernels_region branch. Would you mind trying it on (at least) lib_mpp.f90 to see whether it does a better job?
Hi @arporter, yes, sure. I'll give this a try and report back here when done.
Hi @arporter, I've given this a try on lib_mpp.f90 but I get the following error:
Transforming invoke mynode:
Failed to transform nodes: {0} [<psyclone.psyir.nodes.assignment.Assignment object at 0x10c73ed30>]
Error was: 'Transformation Error: A kernels transformation must enclose at least one loop or array range but none were found.'
Failed to transform nodes: {0} [<psyclone.psyir.nodes.assignment.Assignment object at 0x10c74b080>]
Error was: 'Transformation Error: A kernels transformation must enclose at least one loop or array range but none were found.'
Failed to transform nodes: {0} [<psyclone.psyir.nodes.assignment.Assignment object at 0x10c74b4e0>]
Error was: 'Transformation Error: A kernels transformation must enclose at least one loop or array range but none were found.'
Failed to transform nodes: {0} [<psyclone.psyir.nodes.assignment.Assignment object at 0x10c74b668>]
Error was: 'Transformation Error: A kernels transformation must enclose at least one loop or array range but none were found.'
Generation Error: Generator: script file './kernels_trans.py' raised the following exception during execution ...
{
  File "./kernels_trans.py", line 693, in trans
    add_kernels(sched.children)
  File "./kernels_trans.py", line 498, in add_kernels
    success = try_kernels_trans(node_list)
  File "./kernels_trans.py", line 613, in try_kernels_trans
    invokesched = nodes[0].ancestor(NemoInvokeSchedule)
IndexError: list index out of range
}
Please check your script
Ah. I shall re-educate the Transformation too!
Hi Chris, if you update your copy of this branch then hopefully it will work for you. There is a chance that it may not because, in practice, this change has quite big implications for how far down the tree we attempt to walk. I've already had to extend the script to ignore the 'mynode' routine as putting OpenACC in there didn't make sense.
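For readers following along: the traceback in this thread ends with nodes[0] being indexed on an empty list, and the comment above mentions skipping the 'mynode' routine. A minimal sketch of both guards is given here. The names IGNORED_ROUTINES, should_process and first_node_or_none are hypothetical, chosen for illustration only, and are not taken from kernels_trans.py or from PSyclone.

```python
# Illustrative sketch only: none of these names come from kernels_trans.py.

# Hypothetical set of routines in which adding OpenACC makes no sense.
IGNORED_ROUTINES = {"mynode"}


def should_process(routine_name):
    """Return False for routines that the script should leave untouched.

    Fortran names are case-insensitive, so compare in lower case.
    """
    return routine_name.lower() not in IGNORED_ROUTINES


def first_node_or_none(nodes):
    """Return the first node of a list, or None if the list is empty.

    Guards against the 'IndexError: list index out of range' that
    nodes[0] raises when no nodes were collected.
    """
    return nodes[0] if nodes else None
```

A caller would check the result of first_node_or_none against None and return early, rather than looking up an ancestor on a node that does not exist.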
Currently the NEMO kernels transformation script only encloses nodes within a KERNELS region if one or more of them represents a loop. However, when using managed memory, we need every assignment to happen on the GPU (to avoid page faults) and therefore this restriction needs to be relaxed.
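The difference between the current rule and the relaxed one can be sketched as follows. This is a toy model under stated assumptions, not the script's implementation: Node is a stand-in for a PSyIR node, and the managed_memory flag and the should_enclose function are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Stand-in for a tree node; 'kind' is e.g. "loop" or "assignment"."""
    kind: str
    children: list = field(default_factory=list)


def walk(node):
    """Yield the node itself and then all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


def contains_kind(nodes, kind):
    """True if any node in the list, or any descendant, has the given kind."""
    return any(n.kind == kind for top in nodes for n in walk(top))


def should_enclose(nodes, managed_memory=False):
    """Decide whether a list of sibling nodes goes inside a KERNELS region.

    Current rule: at least one loop must be present.
    Relaxed rule (hypothetical managed_memory flag): assignments on their
    own also qualify, so that they execute on the GPU.
    """
    if not nodes:
        return False
    if contains_kind(nodes, "loop"):
        return True
    return managed_memory and contains_kind(nodes, "assignment")
```

With managed_memory left at False, a list holding only assignments is rejected, which matches the behaviour described above; with it set to True, the same list is accepted.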