getkeops / keops

KErnel OPerationS, on CPUs and GPUs, with autodiff and without memory overflows
https://www.kernel-operations.io
MIT License

Feature suggestion: SIMD vectorization on CPU #20

Open ogrisel opened 5 years ago

ogrisel commented 5 years ago

Hi, thanks for the presentation you gave at Inria Parietal today ;)

I just wanted to give a heads-up about https://github.com/QuantStack/xsimd, which might be a useful tool for making kernel computations more efficient on modern CPUs. That could help people who don't have an NVIDIA GPU at hand.

You might also be interested in xtensor, by the same developers, which provides a lazy C++ API for n-dimensional array manipulation:

https://github.com/QuantStack/xtensor

And also xeus / cling, for interactive C++ development in Jupyter notebooks:

https://github.com/QuantStack/xeus-cling (interactive demo)

jeanfeydy commented 4 years ago

Hi @ogrisel,

Thanks a lot for these references, I found them really elegant and exciting... When you live in the CUDA bubble, it is easy to miss the latest C++/SIMD developments. Our short-term priority is now the R interface, but our visit last week gave us some incentive to provide good support for CPU architectures. Now that the KeOps structure is in place, getting close to a state-of-the-art CPU parallelization scheme shouldn't be too hard, and it would allow us to keep up to date with best practices. As we bypass all memory transfers, it is likely that we could outperform kernel-related NumPy codes on CPU: this would be of interest to many users. Of course, we'll keep you updated on our main advances.

Best regards,

Jean
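
For readers following this thread, here is a minimal sketch of the kind of CPU loop being discussed: a Gaussian kernel reduction that never stores the N-by-M kernel matrix, which is the intermediate that a dense NumPy computation would typically build. This is not KeOps code and does not use the xsimd API. The function name `gaussian_conv`, the flat 3D point layout, and the `sigma` parameter are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of KeOps or xsimd.
// Computes a[i] = sum_j exp(-|x_i - y_j|^2 / (2 * sigma^2)) * b[j]
// for 3D points stored as flat arrays (p0.x, p0.y, p0.z, p1.x, ...).
// Only the n outputs are stored; kernel values are used and discarded.
std::vector<float> gaussian_conv(const std::vector<float>& x,
                                 const std::vector<float>& y,
                                 const std::vector<float>& b,
                                 float sigma) {
    constexpr std::size_t D = 3;  // point dimension (assumed)
    assert(x.size() % D == 0 && y.size() % D == 0);
    assert(b.size() == y.size() / D);

    const std::size_t n = x.size() / D;
    const std::size_t m = y.size() / D;
    const float scale = -1.0f / (2.0f * sigma * sigma);

    std::vector<float> a(n, 0.0f);
    for (std::size_t i = 0; i < n; ++i) {
        // Keep the current x_i in local variables for the whole inner loop.
        const float xi0 = x[D * i];
        const float xi1 = x[D * i + 1];
        const float xi2 = x[D * i + 2];
        float acc = 0.0f;
        // Inner reduction over j: each term is independent of the others,
        // so this loop is the natural candidate for SIMD, either through
        // compiler auto-vectorization (which may need extra flags for a
        // floating-point sum) or through explicit SIMD batches.
        for (std::size_t j = 0; j < m; ++j) {
            const float d0 = xi0 - y[D * j];
            const float d1 = xi1 - y[D * j + 1];
            const float d2 = xi2 - y[D * j + 2];
            acc += std::exp(scale * (d0 * d0 + d1 * d1 + d2 * d2)) * b[j];
        }
        a[i] = acc;
    }
    return a;
}
```

Calling `gaussian_conv(x, y, b, 1.0f)` with `x` holding n points and `y`, `b` holding m points and m weights returns n values. Whether a loop like this actually beats a NumPy implementation depends on the compiler, the flags, and the problem size, so it would need to be benchmarked.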