A new Issue was created by @fwyzard Andrea Bocci.
@davidlange6, @Dr15Jones, @smuzaffar, @fabiocos can you please review it and eventually sign/assign? Thanks.
cms-bot commands are listed here
@fwyzard, what is $$($(1)_objdir)/$(1)_cudadlink_nv.$(OBJEXT), and do we need it in $(1)_cudaobjs? I guess these .a files are not needed at runtime, so it may be better to have them in a separate directory, e.g. cuda_device/SCRAM_ARCH?
Sorry, that was a mistake, we don't need to generate that file. I have just updated the description.
A separate directory sounds fine to me. I would use a more generic name, in case we need static libraries for other stuff in the future.
@smuzaffar by the way, when linking a library or plugin, does SCRAM consider only the direct dependencies, or also the indirect (transitive) ones ?
For example, assuming that libC.so depends on libB.so, which in turn depends on libA.so: when linking libC.so, does SCRAM pass the flags -lB -lA, or only -lB?
For the CUDA device link step, we do need the first approach, i.e. passing all direct and indirect dependencies.
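To illustrate the "direct and indirect" behaviour asked about above, here is a minimal Python sketch that expands a dependency map into the full list of -l flags. It is not how SCRAM is implemented; the function name and the dependency map are hypothetical, and the library names A, B and C simply mirror the example in the question.

```python
from typing import Dict, List


def transitive_link_flags(target: str, deps: Dict[str, List[str]]) -> List[str]:
    """Return -l flags for every direct and indirect dependency of target.

    Hypothetical helper for illustration only. Libraries are ordered so
    that each one appears before the libraries it depends on.
    """
    seen = set()
    postorder: List[str] = []

    def visit(name: str) -> None:
        # Mark on entry so shared or cyclic dependencies are visited once.
        if name in seen:
            return
        seen.add(name)
        for dep in deps.get(name, []):
            visit(dep)
        postorder.append(name)

    visit(target)
    # Reverse post-order is a topological order; drop the target itself.
    ordered = [name for name in reversed(postorder) if name != target]
    return ["-l" + name for name in ordered]


# C depends on B, and B depends on A (hypothetical dependency map).
example_deps = {"C": ["B"], "B": ["A"], "A": []}
print(" ".join(transitive_link_flags("C", example_deps)))  # -lB -lA
```

Passing only the direct dependencies would yield just -lB here, which is the second behaviour described in the question.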
SCRAM passes all of them, i.e. both direct and indirect. By the way, when you say CUDA device link step, which step are you talking about? Is it the creation of $(1)_cudadlink, or the final link step for a shared lib/plugin which has .cu files too?
I mean the creation of the $(1)_cudadlink.o file - that should be the only place where we need the _nv.a libraries.
@fwyzard, I see that you still have $$($(1)_objdir)/$(1)_cudadlink_nv.$(OBJEXT) for $(1)_cudaobjs. Is that correct? If yes, how do we generate this cudadlink_nv.$(OBJEXT)?
Sorry, I was sure I had removed it everywhere - fixed now.
By the way, I was planning to build a new "patatrack" release later today or tomorrow morning - shall I wait for an update to the build rules and test them at the same time ?
I need more testing, so please go ahead with the patatrack release.
OK, thanks for letting me know.
@fwyzard, can you please check cmssw-config tag V05-07-30 (tip of master branch) along with your changes in https://github.com/cms-sw/cmsdist/pull/4168.
You need to do the following:
replace dev-area/config/SCRAM with cmssw-config/SCRAM
replace dev-area/config/BuildFile.xml with cmssw-config/CMSSW_BuildFile.xml
update dev-area/config/toolbox/slc6_amd64_gcc630/tools/selected/cuda.xml to get the changes of cms-sw/cmsdist#4168
@smuzaffar thanks - my simple test case at https://github.com/fwyzard/cmssw/tree/add_CUDA_linking_samples works fine following your instructions.
It does not use CUDA kernels inside CMSSW plugins - I will test that in a couple of weeks' time.
By the way, I have updated https://github.com/cms-sw/cmsdist/pull/4168 to pick up your changes.
@smuzaffar I have run into a problem with the new build rules:
>> Cuda Device Link tmp/slc7_amd64_gcc700/src/RecoPixelVertexing/PixelTrackFitting/test/testEigenGPUNoFit_t/testEigenGPUNoFit_t_cudadlink.o
/data/user/fwyzard/patatrack/build/slc7_amd64_gcc700.patatrack/slc7_amd64_gcc700/external/cuda/9.2.88-gnimlf/bin/nvcc \
-dlink \
-L/data/user/fwyzard/patatrack/build/slc7_amd64_gcc700.patatrack/tmp/BUILDROOT/84efe5fbe8f5b399087b45a7523ae6ab/opt/cmssw/slc7_amd64_gcc700/cms/cmssw/CMSSW_10_2_0_pre6_Patatrack/lib/slc7_amd64_gcc700 \
-L/data/user/fwyzard/patatrack/build/slc7_amd64_gcc700.patatrack/tmp/BUILDROOT/84efe5fbe8f5b399087b45a7523ae6ab/opt/cmssw/slc7_amd64_gcc700/cms/cmssw/CMSSW_10_2_0_pre6_Patatrack/external/slc7_amd64_gcc700/lib \
-L/data/user/fwyzard/patatrack/build/slc7_amd64_gcc700.patatrack/slc7_amd64_gcc700/external/cuda/9.2.88-gnimlf/lib64/stubs \
-lcudadevrt \
-gencode arch=compute_35,code=sm_35 \
-gencode arch=compute_50,code=sm_50 \
-gencode arch=compute_61,code=sm_61 \
-O3 \
-std=c++14 \
--expt-relaxed-constexpr \
--expt-extended-lambda \
-L/data/user/fwyzard/patatrack/build/slc7_amd64_gcc700.patatrack/tmp/BUILDROOT/84efe5fbe8f5b399087b45a7523ae6ab/opt/cmssw/slc7_amd64_gcc700/cms/cmssw/CMSSW_10_2_0_pre6_Patatrack/static/slc7_amd64_gcc700 \
-lRecoPixelVertexingPixelTrackFitting_nv \
--compiler-options '-O2 \
... \
-g \
-std=c++14 \
-fPIC ' \
tmp/slc7_amd64_gcc700/src/RecoPixelVertexing/PixelTrackFitting/test/testEigenGPUNoFit_t/testEigenGPUNoFit.cu.o \
-o tmp/slc7_amd64_gcc700/src/RecoPixelVertexing/PixelTrackFitting/test/testEigenGPUNoFit_t/testEigenGPUNoFit_t_cudadlink.o
nvlink fatal : unexpected object after cudadevrt (target: sm_35)
gmake: *** [tmp/slc7_amd64_gcc700/src/RecoPixelVertexing/PixelTrackFitting/test/testEigenGPUNoFit_t/testEigenGPUNoFit_t_cudadlink.o] Error 1
Looks like the CUDA LDFLAGS (-L... -L... -lcudadevrt) have to go after the .o files.
Or rather, I think we should have
.../testEigenGPUNoFit_t/testEigenGPUNoFit.cu.o
-lRecoPixelVertexingPixelTrackFitting_nv
-lcudadevrt
Indeed, it seems to work if -lRecoPixelVertexingPixelTrackFitting_nv comes before -lcudadevrt.
E.g. if we swap the order of $(call AdjustFlags,$1,,CUDA_LDFLAGS CUDA_FLAGS) and $(DLINK_LIBDIR) $(call Tool_DependencyDLINK,$1) within define link_cuda_objs.
I've tested it locally and it seems to work; you can see the changes in cms-sw/cmssw-config#66 / cms-sw/cmsdist#4179.
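The ordering discussed above (object file first, then the _nv libraries, then -lcudadevrt) can be sketched in Python as a function that only assembles the argument list for the device link. This is an illustration under assumptions, not the actual SCRAM rule: the function name, parameter names and example paths are hypothetical, and only flags that already appear in the log above are used.

```python
from typing import List, Sequence


def device_link_command(
    nvcc: str,
    objects: Sequence[str],
    nv_libdirs: Sequence[str],
    nv_libs: Sequence[str],
    cuda_ldflags: Sequence[str],
    output: str,
) -> List[str]:
    """Assemble an 'nvcc -dlink' command line (hypothetical helper).

    The device object files come first, then the package _nv libraries,
    and the CUDA LDFLAGS (including -lcudadevrt) come last.
    """
    cmd = [nvcc, "-dlink"]
    cmd += list(objects)
    cmd += ["-L" + d for d in nv_libdirs]
    cmd += ["-l" + lib for lib in nv_libs]
    cmd += list(cuda_ldflags)  # e.g. -L<cuda lib dir> -lcudadevrt
    cmd += ["-o", output]
    return cmd


# Hypothetical paths; the library name is taken from the log above.
example_cmd = device_link_command(
    nvcc="nvcc",
    objects=["tmp/test/testEigenGPUNoFit.cu.o"],
    nv_libdirs=["static/slc7_amd64_gcc700"],
    nv_libs=["RecoPixelVertexingPixelTrackFitting_nv"],
    cuda_ldflags=["-lcudadevrt"],
    output="tmp/test/testEigenGPUNoFit_t_cudadlink.o",
)
print(" ".join(example_cmd))
```

Keeping the CUDA LDFLAGS as the final group is what avoids the "unexpected object after cudadevrt" error reported by nvlink in the log.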
Thanks @fwyzard for the pull request. It is merged now and will be part of today's 11h00 IB.
@fwyzard, does the new build rule work as expected? Can you close this issue?
I just got back from a few days away, I'll double check and let you know.
The private release built with these changes works fine, we can close this issue.
A first step should be to fix the call to nvcc for linking device code. The current rule should be changed so that we explicitly pass -dlink, and also pass the full set of flags to the host compiler. This should go together with the changes to the cuda toolfile at https://github.com/cms-sw/cmsdist/pull/4168 .
Then, to avoid some confusion between the existing $(1)_cudaobj variable and the $(1)_cudaobjs variable that I propose to add, I suggest renaming the former to $(1)_cudadlink (short for device link).
Finally, to support linking CUDA device code across different libraries, for each package with CUDA source files we need to:
define a corresponding list of device object files and a static library (I picked the suffix _nv.o because of the ELF sections they are going to have, but feel free to use anything else);
copy the ELF sections __nv_relfatbin, __nv_module_id and .nvFatBinSegment to the device object files;
link these device object files into a static library;
during the device link step, take into account the full dependencies from the BuildFiles, but use these _nv.a libraries instead of the standard .so libraries.
One question: do we want to keep these _nv.a files in the same place as the .so libraries, or in a separate directory? They are only used for linking, not at run time.
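The per-package steps listed above can be sketched in Python as two functions that only build the command lines, without running them. The original make rules are not shown in this thread, so everything here is an assumption: the function names are hypothetical, the use of GNU objcopy with --only-section and of "ar crs" is one plausible way to do the copy and archive steps, and the _nv.o naming follows the suffix proposed above.

```python
from typing import List, Sequence

# ELF sections named in the proposal above.
NV_SECTIONS = ["__nv_relfatbin", "__nv_module_id", ".nvFatBinSegment"]


def nv_object_name(cuda_object: str) -> str:
    """Map foo.cu.o to foo.cu_nv.o (hypothetical naming scheme)."""
    if not cuda_object.endswith(".o"):
        raise ValueError("expected an object file name ending in .o")
    return cuda_object[: -len(".o")] + "_nv.o"


def objcopy_command(cuda_object: str) -> List[str]:
    """Build an objcopy call keeping only the CUDA device sections.

    Assumes GNU objcopy and its --only-section option.
    """
    cmd = ["objcopy"]
    cmd += ["--only-section=" + section for section in NV_SECTIONS]
    cmd += [cuda_object, nv_object_name(cuda_object)]
    return cmd


def archive_command(package: str, nv_objects: Sequence[str]) -> List[str]:
    """Build an ar call that packs the _nv.o files into lib<package>_nv.a."""
    return ["ar", "crs", "lib" + package + "_nv.a"] + list(nv_objects)


# Hypothetical object file; the package name is taken from the log above.
obj = "testEigenGPUNoFit.cu.o"
print(" ".join(objcopy_command(obj)))
print(" ".join(archive_command("RecoPixelVertexingPixelTrackFitting", [nv_object_name(obj)])))
```

The resulting archive name matches the -lRecoPixelVertexingPixelTrackFitting_nv flag seen in the device link command earlier in the thread.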