cp2k/dbcsr
DBCSR: Distributed Block Compressed Sparse Row matrix library
https://cp2k.github.io/dbcsr/
GNU General Public License v2.0
ocl: execution hints and code cleanup #615
Closed (hfp closed 2 years ago)
hfp commented 2 years ago
Removed ACC_OPENCL_MALLOC_LIBXSMM, ACC_OPENCL_STREAM_NOALLOC, ACC_OPENCL_EVENT_NOALLOC, and OPENCL_LIBSMM_DEVMATCH_PARAMFILE.
Revised cached device properties (intel_id is now split into intel and uid, with the latter also useful for other vendors).
Moved opencl_libsmm_timer_t into c_dbcsr_acc_opencl_config_t, and enabled queue-profiling only if timer_device.
Made devinfo thread-specific. Moved the devmatch property into the configuration (it is not device-specific).
Introduced ACC_OPENCL_XHINTS, removed ACC_OPENCL_DISABLE, revised ACC_OPENCL_SHARE.
ACC_OPENCL_XHINTS now defaults to automatically determining a viable setting.
Removed function (c_dbcsr_acc_opencl_stream_is_thread_specific).
Introduced build targets "backend" and "libsmm" (Makefile).
Cleaner decision about SVM, e.g., per device/stream.
Fixed issues pointed out by static analysis.
Decide SVM-interop purely at runtime.
Updated device UIDs.
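
As an illustration of the "default to an automatically determined setting" pattern mentioned for ACC_OPENCL_XHINTS, here is a minimal sketch in C. It is not taken from this PR: the function name `resolve_hints` and the hint bits are hypothetical, and treating the setting as a bitmask supplied as a string is an assumption. An explicit, well-formed value wins; anything else falls back to the detected value.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical hint bits, for illustration only (not DBCSR's actual values). */
enum { HINT_A = 1, HINT_B = 2, HINT_C = 4 };

/* Resolve a hint bitmask (hypothetical helper).
 * user_value: optional string with an explicit setting (may be NULL or empty).
 * detected:   non-negative bitmask determined automatically by the caller.
 * Returns the explicit setting if it parses as a non-negative integer,
 * otherwise the automatically detected one. */
int resolve_hints(const char* user_value, int detected) {
  char* end = NULL;
  long value;
  assert(0 <= detected); /* precondition: detected is a valid bitmask */
  if (NULL == user_value || '\0' == *user_value) return detected;
  value = strtol(user_value, &end, 0); /* base 0: accepts decimal, hex, and octal */
  if (end == user_value || '\0' != *end) return detected; /* not a clean number */
  if (value < 0 || value > INT_MAX) return detected;      /* out of range */
  return (int)value;
}
```

Assuming the setting were read from the environment, a caller could write `resolve_hints(getenv("ACC_OPENCL_XHINTS"), detected)`, where `detected` is whatever the backend derived from the device at runtime.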