Open samcom12 opened 1 year ago
@samcom12 good idea. I have been thinking about this as well. I think the first step is to convert the inner timestepping procedures to Cython/C and then invoke OpenMP in the various expensive procedures. Hopefully we could then move on to offloading to the GPU.
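To make the "C loop plus OpenMP" step concrete, here is a minimal sketch of what a threaded per-cell update could look like. The function name `update_conserved` and its arguments are hypothetical and only for illustration; this is not ANUGA's actual kernel. It assumes each iteration touches only its own cell, so the iterations are independent.

```c
#include <stddef.h>

/* Hypothetical per-cell update, NOT ANUGA's actual kernel.
 * Each iteration reads and writes only index k, so the loop
 * iterations are independent and safe to run in parallel. */
void update_conserved(double *stage, const double *flux_sum,
                      const double *area, double dt, long n)
{
    long k;

    /* Without OpenMP enabled at compile time this pragma is ignored
     * and the loop simply runs serially. */
    #pragma omp parallel for schedule(static)
    for (k = 0; k < n; k++) {
        stage[k] += dt * flux_sum[k] / area[k];
    }
}
```

With GCC or Clang this would be built with `-fopenmp`. Loops that accumulate into shared cells (for example, edge fluxes summed into neighbouring triangles) would need a reduction or a restructured loop, which this sketch does not cover.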
Thanks @stoiver .
I just tried profiling a sample rectangular test case for offloading benefits. The results are promising, showing almost a 4X speedup, which could increase further with the real datasets we are using.
You can refer to the attached report: offload_model.tar.gz
@samcom12, the results look promising. It is good that the offload advisor picks up just two procedures to look at, _compute_fluxes_central and _extrapolate_second_order_edge_sw. I think it also suggests that we should look at simply using standard OpenMP on those two procedures. Does Intel have an advisor for studying the likely improvement from using OpenMP?
Hello @stoiver
There is a tool which converts serial C, C++, and Fortran code to OpenMP-based code.
And with Intel VTune we can do a threading analysis to see how effectively threads are being used in the code.
Hi All,
With recent developments like OpenMP offload support or OpenACC pragmas, can we port ANUGA to use GPUs and accelerate simulations?
Cheers, Samir
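For reference, a rough sketch of what an OpenMP offload pragma looks like on a simple cell loop. The function `compute_depth_offload` and its arguments are hypothetical and not taken from ANUGA; the loop just computes water depth as stage minus bed elevation, clamped at zero, as a stand-in for a real kernel.

```c
#include <stddef.h>

/* Hypothetical offloaded loop, NOT an ANUGA routine.
 * w = stage, z = bed elevation, h = resulting depth (clamped at 0). */
void compute_depth_offload(double *h, const double *w,
                           const double *z, long n)
{
    long k;

    /* map(to: ...) copies the inputs to the device and map(from: ...)
     * copies the result back. If no device is available, or OpenMP is
     * not enabled, the loop runs on the host instead. */
    #pragma omp target teams distribute parallel for \
        map(to: w[0:n], z[0:n]) map(from: h[0:n])
    for (k = 0; k < n; k++) {
        double d = w[k] - z[k];
        h[k] = d > 0.0 ? d : 0.0;
    }
}
```

The flags needed to actually target a GPU are compiler-specific. In a real port the arrays would probably be kept resident on the device across timesteps rather than mapped on every call, since the transfers can easily outweigh the compute.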