yambo-code / yambo

This is the official GPL repository of the yambo code
http://www.yambo-code.eu/
GNU General Public License v2.0

OpenACC GPU porting #149

Open sangallidavide opened 2 weeks ago

sangallidavide commented 2 weeks ago

I am opening a second issue to discuss the OpenACC GPU porting. The other one is here: https://github.com/yambo-code/yambo/issues/79

Sometimes the code calls a "memory free" before doing any allocation. This is done via the macro YAMBO_FREE, where the deallocate/devxlib_unmap calls are protected by an "if allocated" check.

Example here: https://github.com/yambo-code/yambo/blob/tech-gpu/src/pol_function/X_irredux.F#L250

This call specifically leads to the following line: https://github.com/yambo-code/yambo/blob/tech-gpu/src/Ymodules/mod_collision_el.F#L102

!DEV_ACC exit data delete(ggw)

With gfortran and OpenACC this leads to an error. From what I understand, there is no data to be deleted if ggw was not allocated, although the logic here is not entirely clear to me.

What is the opposite of !DEV_ACC exit data delete(ggw)? Is it !DEV_ACC enter data create(ggw)? Why is this not handled via devxlib? Probably because ggw is a derived type?
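To make the "free before allocate" case concrete, here is a minimal sketch in plain OpenACC, without devxlib. It assumes a simple module array rather than the actual yambo type, and the names buf, buf_alloc and buf_free are hypothetical. The only library call is acc_is_present, the standard OpenACC runtime routine from the openacc module. Whether a guard like this avoids the gfortran error in yambo is not verified here.

```fortran
! Sketch only: "buf", "buf_alloc" and "buf_free" are hypothetical names,
! not yambo or devxlib routines. Needs the C preprocessor (.F90 suffix or -cpp).
module guarded_free_demo
#ifdef _OPENACC
  use openacc, only: acc_is_present
#endif
  implicit none
  real, allocatable :: buf(:)
contains

  subroutine buf_alloc(n)
    integer, intent(in) :: n
    allocate(buf(n))
    ! Counterpart of the "exit data delete" issued in buf_free.
    !$acc enter data create(buf)
  end subroutine buf_alloc

  subroutine buf_free()
    ! Safe to call even if buf_alloc was never called.
    if (.not. allocated(buf)) return
#ifdef _OPENACC
    ! Delete the device copy only if the runtime actually holds one.
    if (acc_is_present(buf)) then
      !$acc exit data delete(buf)
    end if
#endif
    deallocate(buf)
  end subroutine buf_free

end module guarded_free_demo

program main
  use guarded_free_demo
  implicit none
  call buf_free()      ! free before any allocation: nothing to do
  call buf_alloc(8)
  call buf_free()      ! device copy removed first, then the host array
  call buf_free()      ! a repeated free is also a no-op
  print *, 'allocated after free: ', allocated(buf)
end program main
```

Without OpenACC enabled the directives are treated as comments and the acc_is_present branch is compiled out, so the same source also builds as a host-only program.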

bellenlau commented 1 week ago

Don't know if this can help, but YAMBO_FREE also gives an error on Leonardo with nvhpc/23.11 (not with older compilers): it complains about deallocating on the host some data that is still present on the GPU. No idea yet why this arises only with the latest compiler versions. Which error do you observe with gfortran?

The opposite of exit data is enter data, which seems to be used for ggw. Regarding ggw, if it is a derived datatype, copying/deleting it with enter data/exit data copies/deletes any statically allocated components of the derived datatype, but not the dynamically allocated ones, if any; these can be copied/deleted with an additional explicit data clause (e.g. enter data copyin(X%array)). The runtime should then automatically perform an "attach" operation between the dynamically allocated component and the datatype.
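As an illustration of the parent-then-member pattern described above, a minimal sketch follows. The type coll_t and its components are hypothetical stand-ins, not the actual yambo type, and the example assumes a compiler that accepts derived types with allocatable components in OpenACC data clauses.

```fortran
! Sketch only: "coll_t", "n" and "w" are hypothetical, standing in for a
! derived type with one static and one dynamically allocated component.
program derived_type_demo
  implicit none

  type coll_t
    integer           :: n = 0   ! static component, moved together with the parent
    real, allocatable :: w(:)    ! dynamic component, needs its own data clause
  end type coll_t

  type(coll_t) :: c

  c%n = 4
  allocate(c%w(c%n))
  c%w = 1.0

  ! Parent first, then the component: the runtime is expected to attach
  ! the device copy of c%w to the device copy of c.
  !$acc enter data copyin(c)
  !$acc enter data create(c%w)

  ! ... device work on c%w would go here ...

  ! Reverse order on the way out: component first, then the parent.
  !$acc exit data delete(c%w)
  !$acc exit data delete(c)

  print *, 'host copy intact: ', all(c%w == 1.0)
  deallocate(c%w)
end program derived_type_demo
```

The host deallocate comes only after both exit data directives, which is the ordering the nvhpc complaint about data "still present on the GPU" seems to be about.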

Regarding YAMBO_FREE_GPU, which is called before YAMBO_FREE: shouldn't the code check whether the data is mapped with devxlib before unmapping it with devxlib, instead of checking whether the data is allocated (on the host, for OpenACC)? At least, if I understood correctly that devxlib_mapped checks whether the data is mapped on the GPU.

diff --git a/include/headers/common/y_memory.h b/include/headers/common/y_memory.h
index 160b36c97..c01619d9b 100644
--- a/include/headers/common/y_memory.h
+++ b/include/headers/common/y_memory.h
@@ -197,7 +197,7 @@
 #define YAMBO_FREE_GPU(x) \
   if (.not.allocated(x)) &NEWLINE& call MEM_free(QUOTES x QUOTES,int(-1,KIND=IPL))NEWLINE \
   if (     allocated(x)) &NEWLINE& call MEM_free(QUOTES x QUOTES,size(x,KIND=IPL))NEWLINE \
-  if (     allocated(x)) &NEWLINE& call devxlib_unmap(x,MEM_err)
+  if (devxlib_mapped(x)) &NEWLINE& call devxlib_unmap(x,MEM_err)

 #else

andrea-ferretti commented 1 week ago

Hi Laura,

Thanks for raising this point. Apparently the issue is somewhat recurrent...

Basically, for some reason the devxlib_mapped function fails, FREE_GPU does not deallocate (since it thinks the GPU memory is not allocated), and the final deallocate complains.

sangallidavide commented 1 week ago

The opposite of exit data is enter data, which seems to be used for ggw.

It is used if elemental collision alloc is called before elemental collision free.

However, this does not always happen. See my comment here:

Sometimes the code calls a "memory free" before doing any allocation. Example here: https://github.com/yambo-code/yambo/blob/tech-gpu/src/pol_function/X_irredux.F#L250