Clean up the implementation of single source for CUDA and C++

valassi commented 3 years ago

As discussed with @oliviermattelaer at the meeting and afterwards, there are some details which may be improved.

I chose to implement the single source for cuda/c++ as being a .cc file which is then symlinked as .cu. Examples:

In https://github.com/madgraph5/madgraph4gpu/blob/33bac035c2931e878fe982797e5c38711e552118/epoch1/cuda/ee_mumu/src/grambo.cu : grambo.cu is a symlink to rambo.cc
In https://github.com/madgraph5/madgraph4gpu/blob/33bac035c2931e878fe982797e5c38711e552118/epoch1/cuda/ee_mumu/SubProcesses/P1_Sigma_sm_epem_mupmum/gCPPProcess.cu : gCPPProcess.cu is a symlink to CPPProcess.cc
In https://github.com/madgraph5/madgraph4gpu/blob/33bac035c2931e878fe982797e5c38711e552118/epoch1/cuda/ee_mumu/SubProcesses/P1_Sigma_sm_epem_mupmum/gcheck.cu : gcheck.cu is a symlink to check.cc

The first point to cross check is whether a symlink is really necessary. I have a vague recollection that doing "nvcc check.cc" compiles the file as c++ (#ifdef CUDA is false) while "nvcc gcheck.cu" comoiles it as cuda (#ifdef CUDA is true), but I may be wrong.

About this point, note also teh slightly different approach used by ALICE, https://indico.cern.ch/event/962110/contributions/4047367, where a common.cpp file is then included by wrappers of different flavors.

The second point is that I also chose to use different namespaces for the same function/symbols, depending on whether they are used in C++ or CUDA. The gpu symbol starts with g, and so does the gpu file, so for instance rambo.cc but grambo.cu with a leading g. This is because in my initial tests using the same namespace would lead to duplicated symbols. But probably it was just an error in the Makefiles, where some C++ files were linked to cuda executables even if this was not needed.

Anyway, personaly I think neither of these issues is urgent (except if it complicates Olivier's porting upstream to code-generating code, and probably it does). Clearly this is ugly and needs cleanup, but in the initial development phase keeping things physically separated for cuda and c++ was an effective hack.

valassi commented 3 years ago

One third point I forgot to mention: Olivier, you asked via skype why the cpp file has some device specifiers.

In my original implementation the device was protected with an #ifdef CUDA, which was certainly much more explicit, but required an additional #ifdef.

Then @hageboeck suggested to have fewer #ifdefs, by adding an empty "#define device" for cpp (ie if #ifedf CUDA is false. This is https://github.com/madgraph5/madgraph4gpu/pull/44.

Thinking back about it, I see that this is confusing (it confused you, Olivier). We can go for three options:

Keep as is, with an empty device on cpp
Go back to the explicit #ifdef protecting __device only on cuda
Again, learn from ALICE and define another symbol (in their case DECL) which is empty on c++ and equals device on cuda. Actually ALICE have this DECL include also the return type (void in the example above).

I would say, something to be discussed and cleaned up eventually. But for the moment let's keep something that works without overdesigning. It is quite likely that any design we put here may need to be changed later when we add more and more pieces...

valassi commented 3 years ago

Renamed as 'clean up the single source implementation'.

Just listing here a few pending points (very low priority)

better choices of file names and namespaces as mentioned above
get rid of .cu to .cc symlinks, use the nvcc option to treat cc as cu instead? as discussed in issue #139

madgraph5 / madgraph4gpu

Clean up the implementation of single source for CUDA and C++ #54