torch / torch7

http://torch.ch

[Feature request] L_p distances in meshgrid on CUDA #1174

Open octavian-ganea opened 5 years ago

octavian-ganea commented 5 years ago

Suppose I have two huge matrices A and B of dimensions a x d and b x d. Suppose also that a tensor of dimensions a x b x d does not fit in my 16 GB of GPU memory, but an a x b matrix does. If I compute matmul(A, B.t()) on the GPU, everything works nicely, since no intermediate a x b x d tensor has to be stored. But if I want to compute something other than A@B.t(), say the L_p distance between each row of A and each row of B, I need to keep an a x b x d intermediate tensor on the GPU.

It would be nice to have this implemented in CUDA in a manner similar to matrix multiplication. Any pointers on how to do this are welcome.
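
To make the request concrete, here is a minimal pure-Python sketch of the loop structure such a kernel could follow. It is an illustration under assumptions, not an existing torch function: the name pairwise_lp and the nested-list inputs are hypothetical. Each output entry (i, j) is reduced over d into a single running scalar, the same way a matmul reduces over the inner dimension, so only the a x b result is ever stored.

```python
def pairwise_lp(A, B, p=2.0):
    """Return the a x b matrix of L_p distances between rows of A and rows of B.

    Hypothetical reference helper: A is a list of a rows and B a list of b rows,
    each row holding d numbers. No a x b x d intermediate is ever built.
    """
    if A and B and len(A[0]) != len(B[0]):
        raise ValueError("A and B must have the same number of columns (d)")
    out = []
    for row_a in A:                      # index i over rows of A
        out_row = []
        for row_b in B:                  # index j over rows of B
            acc = 0.0                    # running sum over d: the only per-pair state
            for x, y in zip(row_a, row_b):
                acc += abs(x - y) ** p
            out_row.append(acc ** (1.0 / p))
        out.append(out_row)
    return out


# Small usage example: a = 2, b = 3, d = 2.
A = [[0.0, 0.0], [1.0, 1.0]]
B = [[3.0, 4.0], [1.0, 1.0], [0.0, 1.0]]
D2 = pairwise_lp(A, B, p=2.0)   # Euclidean distances
D1 = pairwise_lp(A, B, p=1.0)   # Manhattan distances
```

In a CUDA version, the two outer loops would presumably map to threads or tiles and the inner loop would stay a per-thread reduction, but that mapping is an assumption and not something this sketch implements. For the special case p = 2, the squared distance expands to ||x||^2 + ||y||^2 - 2 x.y, so the existing matmul can be reused with only a x b extra memory. For general p, a practical workaround is to process the rows of A in chunks small enough that a chunk x b x d intermediate fits in memory.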