Open fwyzard opened 4 years ago
A new Issue was created by @fwyzard Andrea Bocci.
@Dr15Jones, @smuzaffar, @silviodonato, @makortel, @davidlange6, @fabiocos can you please review it and eventually sign/assign? Thanks.
cms-bot commands are listed here
assign heterogeneous
assign core
New categories assigned: heterogeneous,core
@Dr15Jones,@smuzaffar,@makortel,@makortel,@fwyzard you have been requested to review this Pull request/Issue and eventually sign? Thanks
I would suggest keeping the plugins in the same package/plugins directory. If you have to manage sources for different backends, then create package/plugins/<backend>.
I do not understand the issue with header files. The header files which belong to plugins are not public headers, and they should reside along with the plugin in the plugins directory. If there are header files which are public, then they should end up in the interface directory.
Does scram actually support a hierarchy like System/Package/plugins/cuda/ ?
What about libraries, can we have System/Package/src built into a library, and System/Package/src/cuda built into a separate library ?
The comment about header files refers to a situation where we have
System/Package/plugins/common.h
used by both
System/Package/plugins/plugin.cc
and
System/Package/plugins/cuda/plugin.cc
With this approach it is reasonably clear that the files under .../plugins/cuda/ can use the header files from .../plugins/.
If we prefer to use a different package altogether, e.g. CUDASystem/Package, it's less clear that the files under CUDASystem/Package/plugins/ can use the include files from System/Package/plugins/.
Does scram actually support a hierarchy like System/Package/plugins/cuda/ ?
Yes, by having something like the following in System/Package/plugins/BuildFile.xml:
<library name="MyPluginCuda" file="cuda/*.cc">
<use name="cuda"/>
</library>
Finally, the long term strategy is probably not to have
System/Package/plugins/
System/Package/plugins/cuda/
System/Package/plugins/some/
System/Package/plugins/other/
...
but more simply
System/Package/plugins/
System/Package/plugins/heterogeneous/
and then build the files under plugins/heterogeneous for multiple back ends.
(my impression is that we pretty much agree)
I was thinking we have all source code under package/plugins, and split the binary code to multiple plugins in the BuildFile.xml. The subdirectories for backends could help to write the plugin declarations in BuildFile.xml. The long-term goal would be to have a single source that gets compiled to multiple backend binaries automagically by scram.
My original question was more whether we do the split of CPU-only and CUDA-aware code to different plugin libraries now, or later when bundling all together becomes a problem.
What about libraries, can we have System/Package/src built into a library, and System/Package/src/cuda built into a separate library
Libraries are not that simple: how should SCRAM name those libraries? How would other packages declare a dependency on one or the other?
For public libraries I would suggest separate packages, but if a shared library is only used by dedicated plugins then why not just create a private library from the plugin directory itself? E.g.
<library name="MyLib" file="lib*.cc">
<flag EDM_PLUGIN="0"/>
</library>
<library name="MyLibCuda" file="lib*.cc">
<flag EDM_PLUGIN="0"/>
<use name="cuda"/>
</library>
<library name="MyPlugin" file="plugin*.cc">
<!-- link against the MyLib private library -->
<lib name="MyLib"/>
</library>
The long-term goal would be to have a single source that gets compiled to multiple backend binaries automagically by scram.
Yes, this is why I was considering something like
System/Package/plugins
and
HeterogeneousSystem/Package/plugins
where the first one is the - mostly untouched - legacy, CPU-only code, and the second one is a portable code base that we build for multiple architectures.
Libraries are not that simple, how should SCRAM name those libraries. How other package declare a dependency on one or other?
I don't know, that's why I am asking :-)
For public libraries I would suggest separate packages but if a shared library is only used by dedicated plugins then why not just create a private library from the plugin directory itself?
Libraries are supposed to be used by other libraries, and by plugins in the same or other packages. So, the only viable alternative is separate packages.
However, what about the case where we have a single code base, e.g. under System/Package/src/, and want to build multiple libraries out of it, with different compilers and/or compiler settings ?
My original question was more whether we do the split of CPU-only and CUDA-aware code to different plugin libraries now, or later when bundling all together becomes a problem.
Right now I would lean towards splitting it already now, but let's see.
However, what about the case where we have a single code base, e.g. under System/Package/src/ and want to build multiple libraries out of it, with different compilers and/or compiler settings ?
This is not possible with the current build system. We would again have the same issue, i.e. of all the generated libraries (with either different flags or compilers) only one will be available for linking.
The long-term goal would be to have a single source that gets compiled to multiple backend binaries automagically by scram.
Yes, this is why I was considering something like
System/Package/plugins
and
HeterogeneousSystem/Package/plugins
where the first one is the - mostly untouched - legacy, CPU-only code, and the second one is a portable code base that we build for multiple architectures.
I was thinking more something like the following for the System/Package/plugins/BuildFile.xml
<library file="*.cc" name="SystemPackagePlugins">
<flags EDM_PLUGIN="1"/>
</library>
<library file="heterogeneous/*.cc" name="SystemPackagePluginsHeterogeneous">
<flags EDM_PLUGIN="1"/>
<!-- some special flag to tell scram that these sources need to be compiled for multiple backends; it should also e.g. append a backend-specific postfix to the library name -->
</library>
However, what about the case where we have a single code base, e.g. under System/Package/src/ and want to build multiple libraries out of it, with different compilers and/or compiler settings ?
This is not possible with the current build system. We would again have the same issue, i.e. of all the generated libraries (with either different flags or compilers) only one will be available for linking.
@smuzaffar Don't we have the same problem if we'd compile the library e.g. for multiple vectorization levels? Or do you mean with "not possible with current build system" exactly that the build system would need to be extended for such behavior?
For special vectorization flags (controlled via the top-level config/BuildFile.xml), scram is going to support that for both plugins and libraries. In that case scram is going to generate
lib/arch/libNAME.so
lib/arch/avx2/libNAME.so
lib/arch/avx512/libNAME.so
and we load either one of the special vector libraries or fall back to the default one. But supporting arbitrary flags will not work.
SCRAM treats cuda as special and generates
lib/arch/libNAME.so
lib/arch/cuda/libName.so
and if cuda is available then prefer lib/arch/cuda over lib/arch.
By the way, the above will only work as long as the interface of the library does not change, i.e. the symbols in all flavors of a library are the same.
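The lookup order described above (prefer a specialised flavour, fall back to the default library) can be sketched as follows. This is only an illustration of the preference logic, not SCRAM or CMSSW code: the function name resolve_library, its parameters, and the idea of passing the preferred subdirectories as an ordered list are assumptions made for the example.

```python
from pathlib import Path
from typing import Iterable, Optional


def resolve_library(lib_root: Path, name: str, preferred: Iterable[str]) -> Optional[Path]:
    """Return the first existing flavour of lib<name>.so, else the default one.

    `preferred` lists subdirectories of `lib_root` (e.g. "avx512", "avx2",
    "cuda") in order of preference. Hypothetical helper: it only mimics the
    lookup order described in the thread.
    """
    filename = f"lib{name}.so"
    # Try each specialised flavour in order of preference.
    for variant in preferred:
        candidate = lib_root / variant / filename
        if candidate.is_file():
            return candidate
    # Fall back to the default library directly under lib_root.
    default = lib_root / filename
    return default if default.is_file() else None
```

With only the default and avx2 flavours on disk, resolve_library(Path("lib/arch"), "NAME", ["avx512", "avx2"]) would return lib/arch/avx2/libNAME.so. Exactly one flavour is selected per library name, which is why all flavours must export the same symbols.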
Mhm, no, that cannot work for this use case - we want to be able to use multiple versions of the same library at the same time, in the same job.
@makortel , why not
<library file="*.cc" name="SystemPackagePlugins">
<flags EDM_PLUGIN="1"/>
</library>
<library file="heterogeneous/*.cc" name="SystemPackagePluginsBackend1">
<flags EDM_PLUGIN="1"/>
</library>
<library file="heterogeneous/*.cc" name="SystemPackagePluginsBackend2">
<flags EDM_PLUGIN="1"/>
</library>
....
But how is the plugin manager going to decide which plugin to load? I guess all these files are going to provide the same plugins. I think the plugin manager will not be happy with that.
SCRAM treats cuda as special and generates
lib/arch/libNAME.so
lib/arch/cuda/libName.so
and if cuda is available then prefer lib/arch/cuda over lib/arch.
I believe with CUDA (and in general for the "heterogeneous backends") we'd want to be able to load both lib/arch/libNAME.so and lib/arch/cuda/libName.so, e.g. to be able to run both CPU and GPU flavors in the same process, or run on multiple GPU platforms in the same process (ok, the latter may be a unicorn).
why not
<library file="*.cc" name="SystemPackagePlugins">
<flags EDM_PLUGIN="1"/>
</library>
<library file="heterogeneous/*.cc" name="SystemPackagePluginsBackend1">
<flags EDM_PLUGIN="1"/>
</library>
<library file="heterogeneous/*.cc" name="SystemPackagePluginsBackend2">
<flags EDM_PLUGIN="1"/>
</library>
....
If we add a new backend (say a new GPU platform, or GPU platforms become dependent on the CPU architecture) we'd have to go through all such BuildFiles and update them with a similar pattern. That is doable of course, but the cost should be weighed against the cost of extending the build system to do that automagically.
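To make the maintenance cost of the manual pattern concrete, here is a minimal sketch that generates one <library> stanza per backend. The helper backend_library_stanzas is hypothetical and not part of scram; the stanza layout simply mirrors the BuildFile fragment quoted above, and any backend-specific compiler flags are left out because the thread does not specify them.

```python
from xml.sax.saxutils import quoteattr


def backend_library_stanzas(base_name, backends, file_glob="heterogeneous/*.cc"):
    """Return one <library> stanza per backend, all built from the same sources.

    Hypothetical helper for illustration only: adding a backend means
    regenerating (or hand-editing) every BuildFile that follows this pattern.
    """
    stanzas = []
    for backend in backends:
        # quoteattr() escapes the value and adds the surrounding quotes.
        name = quoteattr(f"{base_name}{backend}")
        stanzas.append(
            f"<library file={quoteattr(file_glob)} name={name}>\n"
            '  <flags EDM_PLUGIN="1"/>\n'
            "</library>"
        )
    return "\n".join(stanzas)
```

Calling backend_library_stanzas("SystemPackagePlugins", ["Backend1", "Backend2"]) produces the two heterogeneous stanzas from the proposal; a third backend requires touching every package that uses the pattern, which is the cost being weighed here.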
But how is the plugin manager going to decide which plugin to load? I guess all these files are going to provide the same plugins. I think the plugin manager will not be happy with that.
My current idea is for each of them to provide different plugins.
My current idea is for each of them to provide different plugins.
Agreed, they would provide something like PluginCPUSerial, PluginCPUParallel, PluginCUDA, PluginOtherBackend, ...
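A toy model of the naming concern, assuming the plugin manager requires each plugin name to be provided by a single library. The class PluginRegistry and the helper backend_plugin_name are hypothetical and are not the CMSSW plugin manager; they only show why backend-suffixed plugin names avoid the clash raised earlier.

```python
class PluginRegistry:
    """Toy registry: each plugin name may be provided by one library only."""

    def __init__(self):
        self._providers = {}

    def register(self, plugin_name, library):
        existing = self._providers.get(plugin_name)
        if existing is not None and existing != library:
            # Two libraries offering the same plugin name is ambiguous.
            raise ValueError(
                f"{plugin_name} is already provided by {existing}, "
                f"cannot also load it from {library}"
            )
        self._providers[plugin_name] = library

    def library_for(self, plugin_name):
        return self._providers[plugin_name]


def backend_plugin_name(base, backend):
    """Build a backend-specific plugin name, e.g. Plugin + CUDA -> PluginCUDA."""
    return f"{base}{backend}"
```

Registering the same name "Plugin" from two backend libraries fails in this model, while PluginCPUSerial and PluginCUDA can coexist in the same process, each resolved to its own library.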
From the framework perspective it appears (https://github.com/cms-sw/cmssw/issues/28576#issuecomment-605681672) that avoiding loading the plugins defining the EDModules would be difficult, so for the time being I don't see a strong reason to separate CUDA code from CPU-only code into separate plugins.
(There may be other motivations for the split though, e.g. for the "portability tool" it could be easier to just build the object files and the plugin shared object per backend, instead of building the object files per backend and then linking all of them into a single plugin shared object.)
why not
<library file="*.cc" name="SystemPackagePlugins">
<flags EDM_PLUGIN="1"/>
</library>
<library file="heterogeneous/*.cc" name="SystemPackagePluginsBackend1">
<flags EDM_PLUGIN="1"/>
</library>
<library file="heterogeneous/*.cc" name="SystemPackagePluginsBackend2">
<flags EDM_PLUGIN="1"/>
</library>
....
If we add a new backend (say a new GPU platform, or GPU platforms become dependent on the CPU architecture) we'd have to go through all such BuildFiles and update them with a similar pattern. That is doable of course, but the cost should be weighed against the cost of extending the build system to do that automagically.
Let me add that despite what I wrote above (for the foreseen long run), I do think the "manual way" is a viable way to get started.
Organisation of plugins and libraries
From Matti:
I'm fine with this, but how do we want to implement it, concretely ? Assuming we want to have the original CPU-only code in its own plugin, and a separate plugin for each backend:
.../plugins/ directory, and use the BuildFile to implement separate plugins ?
plugins directory, and use a separate directory for each backend ?
plugins directory, and use a single directory to generate the binaries for all backends from a single set of source files ?
If we split the plugin across multiple directories, how do we handle the header files ? Do we move them to the .../include directory even if they are not really part of the "public interface" of the library ? Do we keep them in the original directory, and #include them from the code for the various backends ?
Is it enough to have separate plugins or do we also need separate libraries ?
Do we need to put only the backend code (e.g. device functions and kernels) in the separate plugins, or also the API calls (e.g. cudaMemcpy, kernel launches) ?
Based on our experience I think we may need to: