xtensor-stack / xtensor

C++ tensors with broadcasting and lazy computing
BSD 3-Clause "New" or "Revised" License
3.33k stars 398 forks source link

xt::variance / xt::mean with axis returns different dimension #2005

Open Comeylo opened 4 years ago

Comeylo commented 4 years ago

Hello all,

"xt::variance / xt::mean with axis returns different dimension"

#include <xtensor/xarray.hpp>
#include <xtensor/xview.hpp>

int main(int argc, char **argv)
{

    xt::xtensor<double, 2> channels = {{1,2,3}, {4,5,6}};
    auto mean     = xt::mean(channels, 1);
    auto variance = xt::variance(channels, 1); // or xt::stddev
    std::cerr << mean.dimension() << std::endl;
    std::cerr << variance.dimension() << std::endl;
    return 0;
}

returns :

1
0

Is it the expected behavior ?

xtensor 0.21.4 hc9558a2_0 conda-forge xtensor-blas 0.17.2 h776b511_0 conda-forge xtensor-fftw 0.2.6 hc9558a2_0 conda-forge xtl 0.6.13 hc9558a2_0 conda-forge

Cheers.

Côme

JohanMabille commented 4 years ago

That's a bug ;)

JohanMabille commented 4 years ago

Ok so actually it's not a bug, for the variance the axes must be specified as containers or initializer list:

 auto variance = xt::variance(channels, {1});

Otherwise the integer is interpreted as the degree of freedom and the call is equivalent to computing the variance on the while expression.

I agree that this API is counter-intuitive and error-prone, we need to improve it but that will require a backward incompatible change.

@serge-sans-paille @zhujun98 do you think that having a dedicated type for the degree of freedom would be an acceptable solution? This would give code like the following:

auto variance = xt::variance(channels, 1, DDOF(1));

I think DDOF can be implicitly converted to an integer type, the SFINAE will avoid to get it selected for the axis argument.

Comeylo commented 4 years ago

Actually the syntax is well described in the doc xt::xarray<int> r0 = xt::variance(a, {1}); : and also in this issue : https://github.com/xtensor-stack/xtensor/issues/1684 where there is no report of issues with dimensions.

Is there somewhere in the doc, a place where we can have a description of the reduction functions signature ? Because I can find this : https://xtensor.readthedocs.io/en/latest/api/reducing_functions.html#_CPPv4I000_N3xtl13check_conceptI18is_reducer_optionsI3EVSEEEEN2xt8varianceEDaRR1E3EVS. But it seems that there are some other signatures.

Thank you for the explanation nonetheless, it's clearly not a priority.

JohanMabille commented 4 years ago

It looks like we forgot to update the documentation, all the overloads should be listed. I will also add this example as common pitfall in the doc.

zhujun98 commented 4 years ago

@JohanMabille Sorry for the late response. I did not expect to be '@' here :-)

Having a dedicated type for DDOF is of course a solution.

Another possible solution just came to me, which is removing the following overload

inline auto variance(E&& e, D const& ddof, EVS es = EVS())

since the use case of requiring a DDOF is less frequent. After removing the above overload, given

xt::xtensor<double, 2> channels = {{1,2,3}, {4,5,6}};

, the only case which requires extra typing is

auto variance = xt::variance(channels, {1, 2}, 1);

Of course, some extra efforts are needed to make the following two cases compile

auto variance = xt::variance(channels, 1} // 1 is axis
auto variance = xt::variance(channels, 0, 1}