cms-sw / cmssw

CMS Offline Software
http://cms-sw.github.io/
Apache License 2.0

Issues related to GPU plugins organisation #29298

Open fwyzard opened 4 years ago

fwyzard commented 4 years ago

Organisation of plugins and libraries

From Matti:

I would also add the plugin organization to the list: should we have separate shared objects for the CPU-only and CUDA EDProducers etc? For now it works, but I think later we'd want to have them separately (although at that point I'd hope scram to help us in some way).

I'm fine with this, but how do we want to implement it, concretely ? Assuming we want to have the original CPU-only code in its own plugin, and a separate plugin for each backend:

If we split the plugin across multiple directories, how do we handle the header files ? Do we move them to the .../include directory even if they are not really part of the "public interface" of the library ? Do we keep them in the original directory, and #include them from the code for the various backends ?

Is it enough to have separate plugins or do we also need separate libraries ?

Do we need to put only the backend code (e.g. device functions and kernels) in the separate plugins, or also the API calls (e.g. cudaMemcpy, kernel launches) ?

Based on our experience I think we may need to:

cmsbuild commented 4 years ago

A new Issue was created by @fwyzard Andrea Bocci.

@Dr15Jones, @smuzaffar, @silviodonato, @makortel, @davidlange6, @fabiocos can you please review it and eventually sign/assign? Thanks.


fwyzard commented 4 years ago

assign heterogeneous

fwyzard commented 4 years ago

assign core

cmsbuild commented 4 years ago

New categories assigned: heterogeneous,core

@Dr15Jones,@smuzaffar,@makortel,@makortel,@fwyzard you have been requested to review this Pull request/Issue and eventually sign? Thanks

smuzaffar commented 4 years ago

I would suggest keeping the plugins in the same package/plugins directory. If you have to manage sources for different backends, then create package/plugins/<backend>.

I do not understand the issue with header files. The header files which belong to plugins are not public headers, and they should reside along with the plugin in the plugins directory. If there are header files which are public, then they should end up in the interface directory.

fwyzard commented 4 years ago

Does scram actually support a hierarchy like System/Package/plugins/cuda/ ?

fwyzard commented 4 years ago

What about libraries, can we have System/Package/src built into a library, and System/Package/src/cuda built into a separate library ?

fwyzard commented 4 years ago

The comment about header files refers to a situation where we have System/Package/plugins/common.h used by both System/Package/plugins/plugin.cc and System/Package/plugins/cuda/plugin.cc

With this approach it is reasonably clear that the files under .../plugins/cuda/ can use the header files from .../plugins/.

If we prefer to use a different package altogether, e.g. CUDASystem/Package, it's less clear that the files under CUDASystem/Package/plugins/ can use the include files from System/Package/plugins/ .

smuzaffar commented 4 years ago

Does scram actually support a hierarchy like System/Package/plugins/cuda/

Yes, by having something like the following in System/Package/plugins/BuildFile.xml:

<library name="MyPluginCuda" file="cuda/*.cc">
  <use name="cuda"/>
</library>

fwyzard commented 4 years ago

Finally, the long-term strategy is probably not to have

System/Package/plugins/
System/Package/plugins/cuda/
System/Package/plugins/some/
System/Package/plugins/other/
...

but more simply

System/Package/plugins/
System/Package/plugins/heterogeneous/

and then build the files under plugins/heterogeneous for multiple back ends.

makortel commented 4 years ago

(my impression is that we pretty much agree)

I was thinking we have all source code under package/plugins, and split the binary code to multiple plugins in the BuildFile.xml. The subdirectories for backends could help to write the plugin declarations in BuildFile.xml. The long-term goal would be to have a single source that gets compiled to multiple backend binaries automagically by scram.

makortel commented 4 years ago

My original question was more whether we do the split of CPU-only and CUDA-aware code to different plugin libraries now, or later when bundling all together becomes a problem.

smuzaffar commented 4 years ago

What about libraries, can we have System/Package/src built into a library, and System/Package/src/cuda built into a separate library

Libraries are not that simple: how should SCRAM name those libraries? How would other packages declare a dependency on one or the other?

For public libraries I would suggest separate packages, but if a shared library is only used by dedicated plugins then why not just create a private library from the plugin directory itself? E.g.

<library name="MyLib" file="lib*.cc">
  <flags EDM_PLUGIN="0"/>
</library>
<library name="MyLibCuda" file="lib*.cc">
  <flags EDM_PLUGIN="0"/>
  <use name="cuda"/>
</library>
<library name="MyPlugin" file="plugin*.cc">
  <!-- link against the private MyLib library -->
  <lib name="MyLib"/>
</library>

fwyzard commented 4 years ago

The long-term goal would be to have a single source that gets compiled to multiple backend binaries automagically by scram.

Yes, this is why I was considering something like System/Package/plugins and HeterogeneousSystem/Package/plugins

where the first one is the (mostly untouched) legacy, CPU-only code, and the second one is a portable code base that we build for multiple architectures.

fwyzard commented 4 years ago

Libraries are not that simple: how should SCRAM name those libraries? How would other packages declare a dependency on one or the other?

I don't know, that's why I am asking :-)

For public libraries I would suggest separate packages but if a shared library is only used by dedicated plugins then why not just create a private library from the plugin directory itself?

Libraries are supposed to be used by other libraries, and by plugins in the same or other packages. So, the only viable alternative is separate packages.

However, what about the case where we have a single code base, e.g. under System/Package/src/ and want to build multiple libraries out of it, with different compilers and/or compiler settings ?

fwyzard commented 4 years ago

My original question was more whether we do the split of CPU-only and CUDA-aware code to different plugin libraries now, or later when bundling all together becomes a problem.

I would lean towards splitting it already now, but let's see.

smuzaffar commented 4 years ago

However, what about the case where we have a single code base, e.g. under System/Package/src/ and want to build multiple libraries out of it, with different compilers and/or compiler settings ?

This is not possible with the current build system. We would again have the same issue, i.e. of all the generated libraries (built with either different flags or different compilers) only one would be available for linking.

makortel commented 4 years ago

The long-term goal would be to have a single source that gets compiled to multiple backend binaries automagically by scram.

Yes, this is why I was considering something like System/Package/plugins and HeterogeneousSystem/Package/plugins

where the first one is the (mostly untouched) legacy, CPU-only code, and the second one is a portable code base that we build for multiple architectures.

I was thinking more of something like the following for the System/Package/plugins/BuildFile.xml

<library file="*.cc" name="SystemPackagePlugins">
  <flags EDM_PLUGIN="1"/>
</library>
<library file="heterogeneous/*.cc" name="SystemPackagePluginsHeterogeneous">
  <flags EDM_PLUGIN="1"/>
  <!-- some special flag to tell scram that these sources need to be compiled for multiple backends; it should also e.g. append a backend-specific postfix to the library name -->
</library>

makortel commented 4 years ago

However, what about the case where we have a single code base, e.g. under System/Package/src/ and want to build multiple libraries out of it, with different compilers and/or compiler settings ?

This is not possible with the current build system. We would again have the same issue, i.e. of all the generated libraries (built with either different flags or different compilers) only one would be available for linking.

@smuzaffar Wouldn't we have the same problem if we compiled the library e.g. for multiple vectorization levels? Or do you mean by "not possible with current build system" exactly that the build system would need to be extended for such behavior?

smuzaffar commented 4 years ago

For special vectorization flags (controlled via the top-level config/BuildFile.xml), scram is going to support that for both plugins and libraries. In that case scram is going to generate

lib/arch/libNAME.so
lib/arch/avx2/libNAME.so
lib/arch/avx512/libNAME.so

and we load either one of the special vector libraries or fall back to the default one. But supporting random flags will not work.

SCRAM treats cuda as special and generates

lib/arch/libNAME.so
lib/arch/cuda/libNAME.so

and if cuda is available then it prefers lib/arch/cuda over lib/arch.

smuzaffar commented 4 years ago

By the way, the above will only work as long as the interface of the library does not change, i.e. the symbols within all flavors of a library are the same.
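The lookup order described in the two comments above (try the special flavor directory first, then fall back to the default one) can be sketched as follows. This is only an illustration of the search order, not SCRAM's or the framework's actual loader: the function names `candidate_paths` and `pick_library`, and the `exists` callback, are hypothetical.

```python
from pathlib import PurePosixPath


def candidate_paths(name, arch, flavors):
    """Return the library paths to try: each flavor directory in order, then the default."""
    base = PurePosixPath("lib") / arch
    filename = f"lib{name}.so"
    paths = [base / flavor / filename for flavor in flavors]
    paths.append(base / filename)  # the default build is always the last resort
    return [str(p) for p in paths]


def pick_library(name, arch, flavors, exists):
    """Return the first candidate for which exists(path) is true, or None."""
    for path in candidate_paths(name, arch, flavors):
        if exists(path):
            return path
    return None


# Hypothetical install area containing a default and a cuda flavor of the same library.
available = {"lib/arch/libNAME.so", "lib/arch/cuda/libNAME.so"}
with_cuda = pick_library("NAME", "arch", ["cuda"], available.__contains__)
without_cuda = pick_library("NAME", "arch", [], available.__contains__)
print(with_cuda)     # lib/arch/cuda/libNAME.so
print(without_cuda)  # lib/arch/libNAME.so
```

Note that this scheme returns exactly one path per library name, so a process only ever sees one flavor of a given library.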

fwyzard commented 4 years ago

Mhm, no, that cannot work for this use case - we want to be able to use multiple versions of the same library at the same time, in the same job.

smuzaffar commented 4 years ago

@makortel, why not

<library file="*.cc" name="SystemPackagePlugins">
  <flags EDM_PLUGIN="1"/>
</library>
<library file="heterogeneous/*.cc" name="SystemPackagePluginsBackend1">
  <flags EDM_PLUGIN="1"/>
</library>
<library file="heterogeneous/*.cc" name="SystemPackagePluginsBackend2">
  <flags EDM_PLUGIN="1"/>
</library>
....

But how is the plugin manager going to decide which plugin to load? I guess all these files are going to provide the same plugins. I think the plugin manager will not be happy with that.

makortel commented 4 years ago

SCRAM treats cuda as special and generates

lib/arch/libNAME.so
lib/arch/cuda/libNAME.so

and if cuda is available then it prefers lib/arch/cuda over lib/arch.

I believe with CUDA (and in general for the "heterogeneous backends") we'd want to be able to load both lib/arch/libNAME.so and lib/arch/cuda/libNAME.so, e.g. to be able to run both CPU and GPU flavors in the same process, or run on multiple GPU platforms in the same process (ok, the latter may be a unicorn).

makortel commented 4 years ago

why not

<library file="*.cc" name="SystemPackagePlugins">
  <flags EDM_PLUGIN="1"/>
</library>
<library file="heterogeneous/*.cc" name="SystemPackagePluginsBackend1">
  <flags EDM_PLUGIN="1"/>
</library>
<library file="heterogeneous/*.cc" name="SystemPackagePluginsBackend2">
  <flags EDM_PLUGIN="1"/>
</library>
....

If we add a new backend (say a new GPU platform, or GPU platforms become dependent on the CPU architecture) we'd have to go through all such BuildFiles and update them with a similar pattern. That is doable of course, but the cost should be weighed against the cost of extending the build system to do that automagically.

But how is the plugin manager going to decide which plugin to load? I guess all these files are going to provide the same plugins. I think the plugin manager will not be happy with that.

My current idea is for each of them to provide different plugins.

fwyzard commented 4 years ago

My current idea is for each of them to provide different plugins.

Agreed, they would provide something like PluginCPUSerial, PluginCPUParallel, PluginCUDA, PluginOtherBackend, ...
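The naming concern can be illustrated with a toy registry: if two libraries register the same plugin name there is a clash, whereas backend-specific names coexist in one process. This is a minimal sketch under stated assumptions, not the real CMSSW plugin manager; `PluginRegistry`, `backend_plugin_name` and the library file names are all hypothetical.

```python
class PluginRegistry:
    """Toy registry mapping a plugin name to the library that provides it."""

    def __init__(self):
        self._providers = {}

    def register(self, plugin, library):
        # Refuse a second provider for the same plugin name (assumed behavior).
        if plugin in self._providers:
            raise ValueError(
                f"plugin {plugin!r} is already provided by {self._providers[plugin]!r}"
            )
        self._providers[plugin] = library

    def library_for(self, plugin):
        return self._providers[plugin]


def backend_plugin_name(base, backend):
    """Build a backend-specific plugin name, e.g. 'Plugin' + 'CUDA' -> 'PluginCUDA'."""
    return f"{base}{backend}"


registry = PluginRegistry()
for backend in ["CPUSerial", "CPUParallel", "CUDA"]:
    # Hypothetical library naming: one shared object per backend.
    registry.register(backend_plugin_name("Plugin", backend), f"libPlugins{backend}.so")

print(registry.library_for("PluginCUDA"))  # libPluginsCUDA.so
```

With distinct names, the choice of backend becomes a matter of which plugin name the job configuration asks for, rather than which shared object happens to be found first.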

makortel commented 4 years ago

From the framework perspective it appears (https://github.com/cms-sw/cmssw/issues/28576#issuecomment-605681672) that avoiding loading the plugins defining the EDModules would be difficult, so for the time being I don't see a strong reason to separate CUDA code from CPU-only code into separate plugins.

(there may be other motivations for the split though, e.g. for the "portability tool" it could be easier to just build the object files and the plugin shared object per backend, instead of building the object files per backend and then linking all of them into a single plugin shared object)

makortel commented 4 years ago

why not

<library file="*.cc" name="SystemPackagePlugins">
  <flags EDM_PLUGIN="1"/>
</library>
<library file="heterogeneous/*.cc" name="SystemPackagePluginsBackend1">
  <flags EDM_PLUGIN="1"/>
</library>
<library file="heterogeneous/*.cc" name="SystemPackagePluginsBackend2">
  <flags EDM_PLUGIN="1"/>
</library>
....

If we add a new backend (say a new GPU platform, or GPU platforms become dependent on the CPU architecture) we'd have to go through all such BuildFiles and update them with a similar pattern. That is doable of course, but the cost should be weighed against the cost of extending the build system to do that automagically.

Let me add that despite what I wrote above (for the foreseen long run), I do think the "manual way" is a viable way to get started.
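To give a feeling for the maintenance cost of the "manual way", the repeated stanzas could be produced by a small helper rather than edited by hand in every BuildFile. The sketch below only generates text in the pattern quoted above; `library_stanza`, `build_file` and the backend labels are hypothetical, and whether scram would compile the same sources differently for each stanza is not settled in this thread.

```python
import xml.etree.ElementTree as ET


def library_stanza(name, file_glob, uses=()):
    """Return one <library> stanza declaring an EDM plugin library."""
    lines = [
        f'<library file="{file_glob}" name="{name}">',
        '  <flags EDM_PLUGIN="1"/>',
    ]
    lines += [f'  <use name="{tool}"/>' for tool in uses]
    lines.append("</library>")
    return "\n".join(lines)


def build_file(package, backends):
    """Return the CPU-only stanza followed by one stanza per backend.

    `backends` maps a backend label to the external tools it needs.
    """
    stanzas = [library_stanza(f"{package}Plugins", "*.cc")]
    for backend, tools in backends.items():
        stanzas.append(
            library_stanza(f"{package}Plugins{backend}", "heterogeneous/*.cc", tools)
        )
    return "\n".join(stanzas)


# Adding a backend means adding one dictionary entry (labels are illustrative).
text = build_file("SystemPackage", {"Backend1": [], "Cuda": ["cuda"]})
# Sanity check: the generated fragment is well-formed once wrapped in a root element.
parsed = ET.fromstring(f"<root>{text}</root>")
print(text)
```

The same loop would have to run over every package that has a heterogeneous/ directory, which is the cost to weigh against extending the build system.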