kokkos / stdBLAS

Reference Implementation for stdBLAS
Other
118 stars 22 forks source link

Ambigious Template Definitions For "symmetric_matrix_rank_k_update" #261

Closed yigiter closed 8 months ago

yigiter commented 8 months ago

The definition of symmetric_matrix_rank_k_update given at line 198 and at line 249 lead to ambiguous definitions when non-explicit constructors are allowed for ExecutionPolicy. This causes compilers (gcc 13) to fail with function calls such as this: symmetric_matrix_rank_k_update(-ONE, transposed(A12), A22, t);

mhoemmen commented 8 months ago

@yigiter Greetings and thanks for your interest in this library!

... when non-explicit constructors are allowed for ExecutionPolicy.

Could you please clarify this phrase? Thanks!

yigiter commented 8 months ago

Hi, Thank you for this great library. I hope this will soon become a part of the standard.

I also don't know what I meant by that phrase. I guess I tried to be sound smart, yet failed. It is better that you ignore it.

However, the point is that the following code ends up with a compilation error:

int main() {
    double WA[9]={1,2,3,2,4,5,3,5,6};
    double WB[6]={1,1,1,2,2,2};
    std::mdspan A(WA, std::layout_left::mapping{std::extents<std::size_t,3,3>{}});
    std::mdspan B(WB, std::layout_left::mapping{std::extents<std::size_t,3,2>{}});
    std::experimental::linalg::symmetric_matrix_rank_k_update(1.0, B,A, std::experimental::linalg::upper_triangle);
    return 0;
}

I believe the error message is sufficiently self descriptive:

main.cpp: In function ‘int main()’:
main.cpp:11:111: error: call of overloaded ‘symmetric_matrix_rank_k_update’ is ambiguous
blas3_matrix_rank_k_update.hpp:198:6: note: candidate: void std::experimental::__p1673_version_0::linalg::symmetric_matrix_rank_k_update [with ScaleFactorType = double;] 
blas3_matrix_rank_k_update.hpp:249:6: note: candidate: void std::experimental::__p1673_version_0::linalg::symmetric_matrix_rank_k_update [with ExecutionPolicy = double;]
mhoemmen commented 8 months ago

Thanks for reporting this! I'm working on a fix now.

FYI, the type of B is incorrect. WB represents a packed matrix, for which the correct layout is layout_blas_packed. layout_left would represent unpacked storage. Fixing this would not address the ambiguous overload error, though.

mhoemmen commented 8 months ago

@yigiter PR https://github.com/kokkos/stdBLAS/pull/263 fixes this and related issues. Thank you for contributing the test!

prlw1 commented 2 months ago

I just tried the example given above https://github.com/kokkos/stdBLAS/issues/261#issuecomment-1873645288, in full:

#define MDSPAN_USE_PAREN_OPERATOR 1

#include <mdspan/mdspan.hpp>
#include <experimental/linalg>
#include <fmt/ranges.h>

using namespace MDSPAN_IMPL_STANDARD_NAMESPACE;
using namespace MDSPAN_IMPL_STANDARD_NAMESPACE :: MDSPAN_IMPL_PROPOSED_NAMESPACE;
using namespace MDSPAN_IMPL_STANDARD_NAMESPACE :: MDSPAN_IMPL_PROPOSED_NAMESPACE :: linalg;

void print(auto const& M)
{
        for (int i = 0; i < M.extent(0); ++i) {
                for (int j = 0; j < M.extent(1); ++j) {
                        fmt::print("{} ", M(i,j));
                }
                fmt::print("\n");
        }
        fmt::print("\n");
}

int main()
{
        double WA[9]={1,2,3,2,4,5,3,5,6};
        double WB[6]={1,1,1,2,2,2};
        mdspan A(WA,3,3);
        mdspan B(WB,3,2);
        fmt::print("============== before ==============\n");
        print(A);
        print(B);
        symmetric_matrix_rank_k_update(1.0, B, A, upper_triangle);
        fmt::print("============== after  ==============\n");
        print(A);
        print(B);
}

which results in the following output:

============== before ==============
1 2 3 
2 4 5 
3 5 6 

1 1 
1 2 
2 2 

============== after  ==============
4 6 3 
6 12 5 
3 5 6 

1 1 
1 2 
2 2 

and I don't understand the answer.

I expected 1.0 * B * transpose(B) + A but that clearly isn't the case.

What is symmetric_matrix_rank_k_update computing?

(The original blas function usually has two scalars, so beta*A would have been the additive term - presumably not done here as one can scale A by beta?)