stfc / PSycloneBench

Various benchmarks used to inform PSyclone optimisations
BSD 3-Clause "New" or "Revised" License

(towards #75) Adds original tracer advection benchmark #76

Closed: arporter closed this 2 years ago

arporter commented 2 years ago

OpenMP version runs on my desktop but doesn't show any performance benefit. However, OMP is not in the ESIWACE2 deliverable so I'm going to park that for now.

arporter commented 2 years ago

I need to add a checksum output to ease verification.
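For illustration only, a checksum of this kind might look like the sketch below. The benchmark in this PR is Fortran; this is a Python analogue, and the function names, the sum-of-absolute-values choice and the tolerance are all assumptions rather than what the PR implements.

```python
import math
from typing import Sequence

Field3D = Sequence[Sequence[Sequence[float]]]


def checksum(field: Field3D) -> float:
    """Return the sum of absolute values over a 3D field.

    Hypothetical choice of checksum: using abs() means positive and
    negative errors cannot cancel, and math.fsum limits round-off.
    """
    return math.fsum(abs(v) for plane in field for row in plane for v in row)


def fields_match(a: Field3D, b: Field3D, rel_tol: float = 1e-12) -> bool:
    """Compare two fields via their checksums (assumed tolerance)."""
    return math.isclose(checksum(a), checksum(b), rel_tol=rel_tol)


if __name__ == "__main__":
    reference = [[[1.0, -2.0], [3.0, -4.0]]]
    print("checksum =", checksum(reference))
```

A single number like this is cheap to print at the end of a run and to compare between the CPU, OpenMP and GPU builds, although matching checksums do not prove the fields are identical point by point.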

arporter commented 2 years ago

I now have a version, with the compute moved into a subroutine, working on GPU. However, I can see that we get managed-memory traffic at the start of each compute region:

[image not reproduced]

I think this must be because a lot of the work arrays are declared as automatic arrays and thus are re-allocated on the GPU each time the subroutine is called.

arporter commented 2 years ago

Made the automatic arrays into module-scoped allocatables that are allocated just once:

[image not reproduced]

arporter commented 2 years ago

Presumably @rupertford, this solution won't work for SIR because I now have an allocate in the compute routine itself? I could move it out to an init method for the module?

rupertford commented 2 years ago

> Presumably @rupertford, this solution won't work for SIR because I now have an allocate in the compute routine itself? I could move it out to an init method for the module?

I'm not actually sure. It may be OK as you can specify data as being local in the SIR which presumably means scoped within the code generated by SIR. But I've not looked at what gets generated.

arporter commented 2 years ago

It would be good if the CI installed PSyclone and then also built those targets that use it but that's something for another PR.

arporter commented 2 years ago

This is ready for a first review now. Probably one for @rupertford.

arporter commented 2 years ago

Ready for another look from @rupertford now.

arporter commented 2 years ago

OK, this should be ready to go now.