Closed martinschwinzerl closed 3 years ago
Update: buffer handling and external address assignment implemented. Tested using the python bindings for CudaTrackJob, working as expected.
Update: performance regression test performed
In the plot, the tracking time per particle and per turn is plotted against the number of tracked particles. A roughly horizontal curve is expected; lower y-values are better. The green crosses refer to a test run in February against the then-current main branch. The black curve represents the current HEAD of this branch. Within the margin of error of such a cursory evaluation, the new HEAD seems to perform on par with or even slightly better than the baseline results from February 2020.
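To make the plotted quantity concrete, here is a minimal Python sketch of the normalization. It is not code from SixTrackLib: the function names, the cost model, and all numbers are hypothetical and chosen only for illustration. The synthetic model assumes a fixed overhead plus a cost proportional to particles times turns, which is one plausible reason why such a curve flattens once the overhead is amortized over many particles.

```python
def time_per_particle_turn(wall_time_s, num_particles, num_turns):
    """Normalize a measured wall-clock time by the amount of tracking work."""
    if num_particles <= 0 or num_turns <= 0:
        raise ValueError("num_particles and num_turns must be positive")
    return wall_time_s / (num_particles * num_turns)


def synthetic_wall_time(num_particles, num_turns, overhead_s=1e-3, cost_s=1e-8):
    """Hypothetical cost model (assumption): fixed overhead + linear work term."""
    return overhead_s + cost_s * num_particles * num_turns


if __name__ == "__main__":
    num_turns = 100
    # Synthetic sweep over the number of particles; no real measurements here.
    for n in (10, 1_000, 100_000, 10_000_000):
        t = synthetic_wall_time(n, num_turns)
        print(n, time_per_particle_turn(t, n, num_turns))
```

With this model the normalized time drops for small particle counts and approaches the per-particle, per-turn cost for large ones, so a flat curve indicates that the tracking time scales linearly with the workload.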
The additional degrees of freedom for the `Settings.cmake` file introduced in PR #123 allow some further analysis, namely comparing different combinations of enabled/disabled beam-elements. Note that the cyan curve is the only one that has the new `NS(TriCub)` element enabled.
Further analysis is warranted, but at first glance, it seems that this PR does not introduce noticeable performance regressions if `NS(TriCub)` is disabled (as was expected).
Merged PR #130 and PR #110 into this PR -> will remove the other two to simplify final testing and coordination with pysixtrack
Main Features:
- `NS(SpaceChargeQGaussianProfile)` and `NS(SpaceChargeInterpolatedProfile)` beam-elements
- `NS(LineDensityProfileData)` element to store a 1D array of values and derivatives; these values can be evaluated using either linear or cubic spline interpolation
- `NS(SpaceChargeInterpolatedProfile)` elements can share a single `NS(LineDensityProfileData)` instance → cf. `examples/python/spacecharge/line_density_profile.py` for an example demonstrating the usage
- The `NS(SpaceChargeInterpolatedProfile)` / `NS(LineDensityProfileData)` implementation uses the same API as the `NS(TriCub)` element from #123. Consequently, this PR should be a superset of #123 and will replace it
- [...] `SIXTRL_NOEXCEPT` and `SIXTRL_RESTRICT`, hopefully improving the performance, especially under C++

Required Testing: In order to test this release, please create a new `Settings.cmake` from `Settings.cmake.default`, as the changes introduced for PR #123 are also required here and mandate a new settings format.

Any feedback concerning the performance and correctness of this PR would be highly appreciated. In particular, it would be extremely important to verify that
- `NS(TriCub)` and `NS(TriCubData)` continue to work as expected. Any changes not already present in the current iteration of PR #123 should be incorporated in this pull request
- The `NS(SpaceCharge*)` elements have been ported directly from the pysixtrack implementation (cf. SixTrack/pysixtrack#50). It is very likely that they still contain bugs, since we do not currently have any tests verifying the correctness of the space-charge implementations
- [...] `NS(TriCub)` elements should remain roughly where it is with respect to the current main branch.
- [...] `NS(SpaceCharge*)` elements is not set in stone - any feedback would be highly appreciated

Regressions and Bugs: All tests seem to pass, at least on my development machine. However, there is an issue with the OpenCL implementation provided by a recent version of the `Intel OneAPI` (cf. issue #131), which also affects the currently released version and which should be investigated.

TODOs before merging: In addition to testing and benchmarking (cf. above, thank you again), the remaining tasks are to
- [...] `NS(SpaceChargeInterpolatedProfile)` does work also using a Cuda TrackJob

In Closing, Additional Remarks, and Minor Changes Not Highlighted So Far: This is a massive change, touching and potentially breaking a lot of different parts of the implementation. I am sorry for the sheer volume, but I was not very successful in landing these changes in a more digestible fashion. In addition to the changes outlined above (and to what is presented / described in #123), the following additional changes are part of this PR:
- [...] (`NS(Multipole)`, `NS(RFMultipole)`, `NS(BeamBeam4D)`, `NS(BeamBeam6D)`) now use a 64-bit integer to store the address rather than having the pointer directly as a data member. This ensures that storing elements into a `CBuffer()` on architectures with 32-bit pointers produces a binary representation that is also usable on 64-bit systems. It also gives additional flexibility concerning the memory region of the [...]
- [...] `NS(Multipole)` and `NS(RFMultipole)` elements to be consistent with their counterparts in the python bindings of SixTrackLib and pysixtrack
- [...] `sixtracklib/common/internal/math_constants.h`, `sixtracklib/common/internal/physics_constants.h`, which provide a templated C++ and a default C implementation for commonly used constants. This is a stepping stone to a type-traits driven version of these files used in a more recent development version of SixTrackLib, allowing the use of types other than `double` for floating-point valued quantities
- [...] `SIXTRL_UNUSED()` macro to properly handle unused / optional method parameters, avoiding warnings and casts using `(void)` in the method body
- [...] `double` via C++ templates
- [...] `googletest` (cf. issue #124). This PR continues with the attempts to fix all remaining tests which are affected

NOTES:
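The idea of storing an address as a 64-bit integer rather than as a raw pointer member, as described for the beam-elements in the list above, can be illustrated with a small Python `ctypes` sketch. This is not the SixTrackLib layout: the structure `ElemWithAddress`, its fields, and the helper functions are hypothetical and only demonstrate why a fixed-width address field gives the same binary size regardless of the pointer width of the platform.

```python
import ctypes


class ElemWithAddress(ctypes.Structure):
    # Hypothetical element (assumption, not the SixTrackLib definition):
    # the address is kept in a fixed-width 64-bit integer, so the struct
    # occupies 16 bytes on both 32-bit and 64-bit platforms.
    _fields_ = [("order", ctypes.c_int64), ("data_addr", ctypes.c_uint64)]


def store_address(elem, buffer):
    """Store the address of a ctypes buffer as a plain 64-bit integer."""
    elem.data_addr = ctypes.addressof(buffer)


def load_values(elem, num_values):
    """Reinterpret the stored integer as a pointer to doubles and read them."""
    ptr = ctypes.cast(ctypes.c_void_p(elem.data_addr),
                      ctypes.POINTER(ctypes.c_double))
    return [ptr[i] for i in range(num_values)]


if __name__ == "__main__":
    coefficients = (ctypes.c_double * 3)(1.0, 2.0, 3.0)
    elem = ElemWithAddress(order=2)
    store_address(elem, coefficients)
    print(ctypes.sizeof(ElemWithAddress), load_values(elem, 3))
```

A structure holding a `ctypes.c_void_p` member instead would change size with the pointer width, which is the portability problem the integer field avoids. The stored address itself is only valid within the process that wrote it and has to be reassigned after loading the binary data elsewhere.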