For-Loops - Githubissues

Hi Mirza, thank you for your interest in CELES! In general, I think there are several places where the code could be improved performance-wise. Vectorizing for loops can certainly be helpful in some cases. MATLAB has a built-in profiler, which you could use to measure how much time each part of the code takes for the typical configurations that you intend to study. That way you can focus your efforts on the most relevant bottlenecks and check whether they are actually associated with unoptimized for loops.
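As a rough illustration of that workflow, here is a minimal sketch that profiles a loop against its vectorized equivalent. The two functions `stand_in_loop` and `stand_in_vectorized` are hypothetical placeholders, not CELES code; in practice you would put your own simulation script between `profile on` and `profile off`. It assumes the code is saved as a script file and that your MATLAB version supports local functions in scripts (R2016b or later).

```matlab
% Sketch only: the two functions below are hypothetical stand-ins, not CELES code.
n = 2e5;
x = rand(n, 1);

profile on                              % start collecting timing data
y_loop = stand_in_loop(x);              % element-by-element for loop
y_vec  = stand_in_vectorized(x);        % same computation, vectorized
profile off                             % stop collecting

stats = profile('info');                % struct; stats.FunctionTable holds per-function data
max_diff = max(abs(y_loop - y_vec));    % sanity check: both versions should agree up to round-off
profile viewer                          % open the interactive report

function y = stand_in_loop(x)
    % Hypothetical loop-based workload
    y = zeros(size(x));
    for k = 1:numel(x)
        y(k) = sin(x(k))^2 + cos(x(k));
    end
end

function y = stand_in_vectorized(x)
    % Vectorized version of the same workload
    y = sin(x).^2 + cos(x);
end
```

The report lists the time spent per function and per line, which should make it clear whether a given loop is worth vectorizing at all for your configuration.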

For instance, when simulating configurations containing thousands of small particles (which is one of the main intended applications for CELES), I think that most of the runtime is spent in the iterative solver routine, performing vector-matrix multiplications, which are handled by the following (parallelized) CUDA kernel: src/scattering/coupling_matrix_multiply_CUDA.cu
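To make the structure of that bottleneck concrete, here is a small sketch of an iterative solve in which every iteration calls a dense, brute-force matrix-vector product. Everything in it is an assumption for illustration: the matrix `W` is a random, well-conditioned stand-in rather than an actual coupling matrix, `apply_W` is a hypothetical name, and GMRES is used simply because it is built into MATLAB, not because it is necessarily the solver you will be running in CELES.

```matlab
% Sketch only: W is a random stand-in, not a physical coupling matrix.
rng(0);                                   % reproducible random numbers
n = 500;                                  % number of unknowns (hypothetical size)
W = eye(n) + 0.1 * randn(n) / sqrt(n);    % dense, well-conditioned test matrix
b = randn(n, 1);                          % right-hand side

% The solver only needs a function that applies the matrix to a vector.
% Here it is a plain dense product costing O(n^2) operations per call;
% in CELES this is the step handled by the CUDA kernel mentioned above.
apply_W = @(v) W * v;

% Unrestarted GMRES, tolerance 1e-8, at most 100 iterations
[x, flag, relres, iter] = gmres(apply_W, b, [], 1e-8, 100);

residual = norm(W * x - b) / norm(b);     % independent check of the solution
```

Since the solver calls `apply_W` once per iteration, the total runtime is roughly the iteration count times the cost of one product, which is why this step tends to dominate for large particle numbers.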

Different schemes to address this time-consuming step can be envisioned, but implementing them requires a different type of effort. I think that Amos had already tested some alternative solutions, such as the fast-multipole method or a rotation-translation-rotation scheme, but I believe that for this "superposition" T-matrix implementation they did not turn out to be more efficient than the current brute-force multiplication in the end.

disordered-photonics / celes

For-Loops #28