Closed by kloppstock 4 years ago
Because of other, similar bugs (see Issue #66), a workaround was implemented; it is currently available on the clustering-bug branch.
An uninitialized variable was found in code related to the ClusterKernel (the variable sum in findSumAndMax()). With this fixed, similar bugs no longer occurred, even with the workaround removed.
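To illustrate the kind of bug described here, the following is a minimal sketch, not the project's actual code. The real signature of findSumAndMax() is not shown in this thread, so the `SumAndMax` struct, the function name `findSumAndMaxSketch`, and its parameters are assumptions made for illustration. The point is only that an accumulator declared without an initial value starts with an indeterminate value, so the result can differ from run to run.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result type; the real return type of findSumAndMax() is not shown in the thread.
struct SumAndMax {
    double sum;
    double max;
};

// Hypothetical stand-in for findSumAndMax(): sums all values and tracks the largest one.
SumAndMax findSumAndMaxSketch(const double* values, std::size_t count) {
    assert(count > 0 && "at least one value is needed to define a maximum");

    // Explicit initialization. Writing `double sum;` instead would leave the
    // accumulator indeterminate, which is the class of bug described above.
    double sum = 0.0;
    double max = values[0];

    for (std::size_t i = 0; i < count; ++i) {
        sum += values[i];
        if (values[i] > max) {
            max = values[i];
        }
    }
    return {sum, max};
}
```

Bugs of this kind can stay hidden for a long time, because the uninitialized memory often happens to contain zero; that may be why the symptoms looked like several unrelated clustering bugs.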
Due to a bug affecting the cluster kernel, CUDA 10 or higher is currently required. Additionally, extracting the element layer size from the work group size causes problems in the cluster kernel under both CUDA 9 and CUDA 10. The current fix passes the Accelerator struct as a template parameter, so that the compiler already has this information at compile time.
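The template-parameter approach can be sketched roughly as follows. This is an assumption-laden illustration, not the project's Accelerator struct: the types `GpuAccSketch` and `CpuAccSketch`, the member `elementLayerSize`, its values, and the function `processChunkSketch` are all hypothetical. It only shows how a size carried by a type becomes a compile-time constant instead of a value read from the work group size at run time.

```cpp
#include <cstddef>

// Hypothetical accelerator descriptions; names and values are placeholders.
struct GpuAccSketch {
    static constexpr std::size_t elementLayerSize = 1;
};

struct CpuAccSketch {
    static constexpr std::size_t elementLayerSize = 4;
};

// Hypothetical per-thread routine. The loop bound comes from the template
// parameter, so it is known to the compiler and is not queried at run time.
template <typename TAccelerator>
std::size_t processChunkSketch(const int* in, std::size_t offset, int* out) {
    constexpr std::size_t n = TAccelerator::elementLayerSize;
    for (std::size_t i = 0; i < n; ++i) {
        out[offset + i] = 2 * in[offset + i];  // placeholder work: double each element
    }
    return n;  // number of elements handled by this call
}
```

A call such as `processChunkSketch<CpuAccSketch>(in, 0, out)` handles four elements, while the `GpuAccSketch` instantiation handles one. Because the bound is a constant in each instantiation, the compiler can unroll the loop and no run-time query is involved.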