getkeops / keops

KErnel OPerationS, on CPUs and GPUs, with autodiff and without memory overflows
https://www.kernel-operations.io
MIT License

Question about KMin and ArgKMin reduction in pykeops #310

Open · gdurif opened 1 year ago

gdurif commented 1 year ago

I've noticed the following behaviors regarding the output dimensions of the KMin and ArgKMin reductions, which I am not sure about (and which cause issues with rkeops).

MWE for KMin:

import numpy as np
from pykeops.numpy import Genred
# Simple formula
formula = "V0"
variables = ["V0=Vi(4)"]
# KMin on Vi
op = Genred(formula, variables, reduction_op="KMin", axis=0, opt_arg=3)
# data
arg = np.random.randn(10, 4)
# Perform reduction (adds a dimension)
res = op(arg, backend="CPU")
res.shape
#> (1, 3, 4)
# KMin on Vj (there is no Vj variable in the formula)
op = Genred(formula, variables, reduction_op="KMin", axis=1, opt_arg=3)
# data
arg = np.random.randn(10, 4)
# Perform reduction (should do nothing?)
res = op(arg, backend="CPU")
res.shape
#> (10, 3, 4)

MWE for ArgKMin:

import numpy as np
from pykeops.numpy import Genred
# Simple formula
formula = "V0"
variables = ["V0=Vi(4)"]
# ArgKMin on Vi
op = Genred(formula, variables, reduction_op="ArgKMin", axis=0, opt_arg=3)
# data
arg = np.random.randn(10, 4)
# Perform reduction (adds a dimension)
res = op(arg, backend="CPU")
res.shape
#> (1, 3, 4)
# ArgKMin on Vj (there is no Vj variable in the formula)
op = Genred(formula, variables, reduction_op="ArgKMin", axis=1, opt_arg=3)
# data
arg = np.random.randn(10, 4)
# Perform reduction (should do nothing?)
res = op(arg, backend="CPU")
res.shape
#> (10, 3, 4)
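
For reference, the values themselves can be cross-checked against a plain numpy partial sort (assuming, as the shapes suggest, that KMin acts coordinate-wise and returns the k smallest values in increasing order):

import numpy as np
from pykeops.numpy import Genred
formula = "V0"
variables = ["V0=Vi(4)"]
op = Genred(formula, variables, reduction_op="KMin", axis=0, opt_arg=3)
arg = np.random.randn(10, 4)
res = op(arg, backend="CPU")  # shape (1, 3, 4)
# KMin over i with the identity formula should be a column-wise partial sort
expected = np.sort(arg, axis=0)[:3]   # 3 smallest values in each column
print(np.allclose(res[0], expected))  # expected: True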
joanglaunes commented 1 year ago

Hello @gdurif , This is somewhat normal behavior, even if it is not very intuitive in the particular cases you test. We always assume that the reduction is performed over a symbolic tensor of size $(n_i, n_j, d)$. The output shape for a reduction over Vi is either $(n_j, d)$ for simple reductions like Min or Sum, or $(n_j, k, d)$ if the reduction produces several outputs, as is the case for the KMin and ArgKMin reductions. Likewise, for reductions over Vj, the output shape is $(n_i, d)$ or $(n_i, k, d)$. In your examples the formula contains no Vj variable, so $n_j = 1$, which is why a reduction over Vi returns shape $(1, 3, 4)$.
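
To illustrate the convention in a non-degenerate $(n_i, n_j)$ setting (the formula V0 - V1 and the sizes 5 and 10 below are made up for illustration):

import numpy as np
from pykeops.numpy import Genred
# Symbolic (n_i, n_j, d) tensor: the difference x_i - y_j, with d = 4
formula = "V0 - V1"
variables = ["V0=Vi(4)", "V1=Vj(4)"]
x = np.random.randn(5, 4)   # n_i = 5
y = np.random.randn(10, 4)  # n_j = 10
# Simple reduction over j: one output per i index -> (n_i, d)
op_min = Genred(formula, variables, reduction_op="Min", axis=1)
print(op_min(x, y, backend="CPU").shape)  # expected: (5, 4)
# k-output reduction over j: k values per i index -> (n_i, k, d)
op_kmin = Genred(formula, variables, reduction_op="KMin", axis=1, opt_arg=3)
print(op_kmin(x, y, backend="CPU").shape)  # expected: (5, 3, 4)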

gdurif commented 1 year ago

Hello @joanglaunes

For other reductions (like Sum), the "useless" dimension is not added when it is not necessary, e.g.:

import numpy as np
from pykeops.numpy import Genred
# Simple formula
formula = "V0"
variables = ["V0=Vi(4)"]
# Sum on Vi
op = Genred(formula, variables, reduction_op="Sum", axis=0)
# data
arg = np.random.randn(10, 4)
# Perform reduction (no extra dimension is added here)
res = op(arg, backend="CPU")
res.shape
#> (1, 4)
# Sum on Vj
op = Genred(formula, variables, reduction_op="Sum", axis=1)
# data
arg = np.random.randn(10, 4)
# Perform reduction (should do nothing?)
res = op(arg, backend="CPU")
res.shape
#> (10, 4)
sum(res - arg)
#> array([0., 0., 0., 0.])

The behavior is different depending on the reduction.

joanglaunes commented 1 year ago

In fact, yes, it is added as well; that is why you get shape $(1,4)$ and not just $(4)$. The dimension that is dropped in the case of the Sum reduction is the $k$ dimension, because a sum produces only one output per reduction. In other words, the output always has 2 dimensions for Sum or Min reductions, and always 3 dimensions for KMin or ArgKMin.
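
Concretely, side by side on the same kind of data (a sketch; Min follows the same shape rule as Sum):

import numpy as np
from pykeops.numpy import Genred
formula = "V0"
variables = ["V0=Vi(4)"]
arg = np.random.randn(10, 4)
# Single-output reduction: the k dimension is dropped -> 2-D output
op_min = Genred(formula, variables, reduction_op="Min", axis=0)
print(op_min(arg, backend="CPU").shape)   # expected: (1, 4)
# k-output reduction: the k dimension is kept -> 3-D output
op_kmin = Genred(formula, variables, reduction_op="KMin", axis=0, opt_arg=3)
print(op_kmin(arg, backend="CPU").shape)  # expected: (1, 3, 4)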

gdurif commented 1 year ago

Ok, I see, thanks. I'll see how we can manage this in R.
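
For a wrapper such as rkeops, one option is to normalize both cases to a single layout downstream. A minimal Python sketch (normalize_genred_output is a hypothetical helper, not part of pykeops):

import numpy as np

def normalize_genred_output(res):
    """Hypothetical helper: bring a Genred result to a uniform
    (n, k, d) layout by inserting a singleton k axis when the
    reduction (Sum, Min, ...) produced a single output."""
    if res.ndim == 2:
        # single-output reduction: (n, d) -> (n, 1, d)
        return res[:, None, :]
    # KMin / ArgKMin already return (n, k, d)
    return res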