Summary
track_line is now also available for TrackJobCl (incl. tests), including via Python
Collecting from a TrackJob is now more fine-grained and can be configured
All code compiles without warnings
All unit tests pass on the test machine
Python requires a reinstallation of pysixtracklib and pysixtracklib_test
Details:
Implemented and tested the track_line(*) functionality for the OpenCL Track Job as well. This has been confirmed to produce consistent results across all three languages (C99, C++ and Python).
In order to make track_line work with the current ClContext implementation, some functionality concerning the particle set indices had to be replicated. This is not intended to be a permanent solution and should be replaced when the ClContext is restructured and refactored to meet the design outlined in the CUDA architecture
TrackJobs gained some API and internal state to select which buffers (particles, beam-elements, output) are fetched when collect() is issued. Additionally, a logical predicate has been added that allows checking at run-time whether the TrackJob (positively!) requires a call to collect or not. With this API in place, the default is now to collect only the particles and the output buffer (if available); beam_elements can be collected on demand and/or by configuring the collect operation to also cover them.
NOTE: calling collect should always be possible and should not incur any unreasonable run-time costs (e.g. it is a NOP on the TrackJobCpu).
NOTE: This API is not yet exposed to the Python representation of the TrackJob
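The selection logic described above can be illustrated with a minimal, self-contained Python sketch. This is not the actual sixtracklib API (which is not yet exposed to Python); the names CollectFlag, SketchTrackJob, collect_flags and requires_collecting are hypothetical and only model the behaviour: a bit-flag set picks the buffers, the default covers particles and output, and collect() is a cheap NOP for a CPU-style job.

```python
from enum import IntFlag


class CollectFlag(IntFlag):
    """Hypothetical bit flags naming the buffers a collect() may fetch."""
    NONE = 0
    PARTICLES = 1
    BEAM_ELEMENTS = 2
    OUTPUT = 4


class SketchTrackJob:
    """Toy stand-in for a track job; it only records what would be fetched."""

    def __init__(self, has_output=True, device_backed=True):
        self.has_output = has_output
        # device_backed=False models a CPU job whose data never leaves the host
        self.device_backed = device_backed
        # Default as described above: particles and output, not beam elements
        self.collect_flags = CollectFlag.PARTICLES | CollectFlag.OUTPUT

    def requires_collecting(self):
        """Predicate: does this job actually need collect() to be called?"""
        return self.device_backed

    def collect(self):
        """Return the names of the buffers that would be copied back."""
        fetched = []
        if not self.device_backed:
            return fetched  # always callable, but nothing to do on the host
        if CollectFlag.PARTICLES in self.collect_flags:
            fetched.append("particles")
        if CollectFlag.BEAM_ELEMENTS in self.collect_flags:
            fetched.append("beam_elements")
        if CollectFlag.OUTPUT in self.collect_flags and self.has_output:
            fetched.append("output")
        return fetched


if __name__ == "__main__":
    job = SketchTrackJob()
    print(job.collect())
    # Opt in to beam elements as well
    job.collect_flags |= CollectFlag.BEAM_ELEMENTS
    print(job.collect())
```

Using a flag set rather than separate booleans keeps the default and any opt-in combination in a single piece of state, which is one plausible way to read "internal state to select which buffers are fetched".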
Cosmetic changes to the python files (i.e. calling autopep8 again)
Determined that the vast majority of the time spent during the OpenCL unit-tests is contributed by the run-time compilation of kernel programs, especially for AMDPRO orca64 based architectures. Some early benchmarking and performance optimization has been done for the test_track_elem_by_elem_opencl_c99 unit-test