Closed by cortner 2 years ago
In the current implementation of RYlmBasis, the batched version is slower than the serial version and doesn't allow us to use @avx. It is completely unclear why.
We should run a general performance study for all the bases implemented here.
This was due to a performance bug in the code. Closing for now until it comes up again.