getkeops / keops

KErnel OPerationS, on CPUs and GPUs, with autodiff and without memory overflows
https://www.kernel-operations.io
MIT License

Transparent interface for billion-scale matrices #99

Open jeanfeydy opened 3 years ago

jeanfeydy commented 3 years ago

Hi all,

Following my discussion with @alessandro-rudi from @FalkonML (detailed in #98), it appears that providing a simple interface for hard-drive data and multi-GPU computations would considerably reduce the boilerplate for advanced KeOps users. As of today, Falkon performs very large kernel matrix-vector products (say, N=1G, M=100k, D=50) by combining two ingredients:

What do you think? Best regards, Jean

Giodiro commented 3 years ago

Hi,

I was writing a new issue to collect the couple of patches I made to KeOps when integrating it with Falkon. I had not posted them before because one of them modifies the API, and I'm not sure if it's worthwhile. The motivation behind the two changes was to make it possible to parallelize large matrix-vector products row-wise when multiple GPUs are available (and to enable the tiling manager which you linked to!)

  1. Add an 'output' parameter to Genred so that the user is responsible for allocating the tensor into which the result is stored.
  2. Release the GIL when calling into CUDA. This allows KeOps to be called in parallel from Python.
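
To make the intent of the two changes concrete, here is a minimal sketch of the row-wise pattern they enable. It is not the KeOps API: `gaussian_matvec_block` and `parallel_matvec` are hypothetical names, and a plain NumPy Gaussian kernel stands in for the KeOps reduction (NumPy's matrix products typically release the GIL while running, which is what change 2 would do for the CUDA call). Each worker thread writes its block of rows directly into a slice of an output array owned by the caller, as change 1 proposes.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def gaussian_matvec_block(x_block, y, b, out_block):
    """Hypothetical stand-in for a KeOps reduction on one block of rows.

    Computes exp(-|x_i - y_j|^2) @ b for the rows in x_block and stores
    the result in out_block, which is owned by the caller.
    """
    # Squared distances between the rows of x_block and the rows of y.
    sq_dists = (
        (x_block ** 2).sum(axis=1)[:, None]
        - 2.0 * (x_block @ y.T)
        + (y ** 2).sum(axis=1)[None, :]
    )
    # The product is written straight into the preallocated output slice.
    np.matmul(np.exp(-sq_dists), b, out=out_block)


def parallel_matvec(x, y, b, n_workers=2):
    """Split the rows of x across worker threads (one per GPU in the real setting)."""
    out = np.empty((x.shape[0], b.shape[1]), dtype=x.dtype)
    bounds = np.linspace(0, x.shape[0], n_workers + 1, dtype=int)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [
            pool.submit(gaussian_matvec_block, x[start:stop], y, b, out[start:stop])
            for start, stop in zip(bounds[:-1], bounds[1:])
        ]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    return out
```

In the multi-GPU setting each worker would additionally be pinned to its own device. Because the row blocks are disjoint, the threads never write to the same memory and no locking is needed on the output.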

I have only tested the two features together, so I am not yet sure whether releasing the GIL while still letting KeOps allocate its own output tensor will work as intended (but I can try to find out!).

I'm going to clean up the code a little bit and post a diff for the changes above in this issue! I think it would be great :) Let me know your thoughts about the 'output' parameter, especially since there may be bits of the KeOps code where the change is inconsistent (and I haven't really considered what should happen with the gradient).

Thanks, Giacomo

Giodiro commented 3 years ago

Attached is the diff between my patched version and the master branch at getkeops/keops. There are a few cosmetic changes which unfortunately make the diff less readable, and the code can still be refactored a bit to avoid duplication. The interesting changes are

diff.txt