ddemidov / vexcl

VexCL is a C++ vector expression template library for OpenCL/CUDA/OpenMP
http://vexcl.readthedocs.org
MIT License
701 stars 81 forks source link

std::complex support #233

Open PhilipVinc opened 6 years ago

PhilipVinc commented 6 years ago

Hello,

Judging from the source code of VexCL there is no support for complex numbers. I've got some Stocastic Differential Equation (Monte Carlo) solvers implemented in C++ that I would like to see how it performs on OpenCL.

How hard would it be to add support for complex and eventually complex to VexCL? I could work on it in my spare time and would eventually be available to contribute a PR, but I would need some guidance while delving inside VexCL, if you would be available to give some implementation hints.

ddemidov commented 6 years ago

I'll write about the OpenCL backend. Same should be applicable to the CUDA and OpenMP (JIT) backends. In fact, with both of the latter you could probably just #include <complex> in the kernel header and work with std::complex<T> directly.

OpenCL does not have native support for complex numbers, but you could use cl_double2 to model it. cl_double2 is binary compatible with std::complex<double> and so transfers between the host and the device memories are straightforward. The main difference is that multiplication and division operations are element-wise for OpenCL vector types like cl_double2. So you would have to introduce custom functions to do those. Here is a simple example:

#include <iostream>
#include <complex>
#include <vector>

#include <vexcl/vexcl.hpp>

//---------------------------------------------------------------------------
// Complex multiplication:
//---------------------------------------------------------------------------
VEX_FUNCTION(cl_double2, cmul, (cl_double2, a)(cl_double2, b),
    double2 r = {a.x * b.x - a.y * b.y, a.x * b.y + a.y * b.x};
    return r;
);

//---------------------------------------------------------------------------
// Complex division:
//---------------------------------------------------------------------------
VEX_FUNCTION(cl_double2, cdiv, (cl_double2, a)(cl_double2, b),
    double d = b.x * b.x + b.y * b.y;
    double2 r = {(a.x * b.x + a.y * b.y) / d, (a.y * b.x - a.x * b.y) / d};
    return r;
);

int main() {
    vex::Context ctx(vex::Filter::Env);
    std::cout << ctx << std::endl;

    const int n = 16;

    // Host side input vectors:
    std::vector<std::complex<double>> x(n), y(n);
    for(int i = 0; i < n; ++i) {
        x[i] = {i, n-i};
        y[i] = {n-i, i};
    }

    // Transfer host-side complex numbers into device-side cl_double2 vectors:
    vex::vector<cl_double2> X(ctx, n, reinterpret_cast<cl_double2*>(x.data()));
    vex::vector<cl_double2> Y(ctx, n, reinterpret_cast<cl_double2*>(y.data()));

    // Do device-side multiplication and division:
    vex::vector<cl_double2> T(ctx, n);

    T = cmul(X, Y);
    std::cout << "X * Y = " << T << std::endl;

    T = cdiv(X, Y);
    std::cout << "X / Y = " << T << std::endl;
}

You could also introduce other helper functions, like get_real(c), get_imag(c), or make_complex(x,y). Would this work for your needs?

ddemidov commented 6 years ago

See also #145, #208.

PhilipVinc commented 6 years ago

Thanks a lot. This indeed gives me an idea, but I need support for complex sparsematrix-vector multiplication as well as complex FFT.

From the documentation and the source code in vexcl/fft/kernels.hpp I see that you support complex FFT, but how would I go about supporting sparsematrix-vector multiplication?

ddemidov commented 6 years ago

There is an example of complex spmv implementation in amgcl/backend/vexcl_complex.hpp. This relies on AMD OpenCL 1.2, which supports C++ in kernels, and so is not portable. There is also a portable, pure C implementation of block-valued spmv at amgcl/backend/vexcl_static_matrix.hpp. This specializes vex::sparse::ell<> class template for small static matrices as value type. You could probably use either of the examples as a starting point for complex spmv.

ddemidov commented 6 years ago

The following should be enough for complex SpMV in vexcl:

#include <iostream>
#include <vector>
#include <complex>

#include <vexcl/vexcl.hpp>

// The following enables use of std::complex<T> in vexcl expressions.
// It simply translates std::complex<T> on the host side into opencl vector
// types (float2/double2) on the device side.
//
// However, semantics of operations like multiplication, division, or
// math functions will be wrong. This is the main reason I hesitate doing
// this in the core library.
namespace vex {

template <typename T>
struct is_cl_native< std::complex<T> > : std::true_type {};

template <typename T>
struct type_name_impl< std::complex<T> >
{
    static std::string get() {
        std::ostringstream s;
        s << type_name<T>() << "2";
        return s.str();
    }
};

template <typename T>
struct cl_scalar_of< std::complex<T> > {
    typedef T type;
};

} // namespace vex

// Now we specialize a template from <vexcl/sparse/spmv_ops.hpp> that allows
// vexcl to generate semantically correct smpv kernels for complex values:
namespace vex {
namespace sparse {

template <typename T>
struct spmv_ops_impl<std::complex<T>, std::complex<T>>
{
    static void decl_accum_var(backend::source_generator &src, const std::string &name)
    {
        src.new_line() << type_name<T>() << "2 " << name << " = {0,0};";
    }

    static void append(backend::source_generator &src,
            const std::string &sum, const std::string &val)
    {
        src.new_line() << sum << " += " << val << ";";
    }

    static void append_product(backend::source_generator &src,
            const std::string &sum, const std::string &mat_val, const std::string &vec_val)
    {
        src.new_line() << sum << ".x += "
            << mat_val << ".x * " << vec_val << ".x - "
            << mat_val << ".y * " << vec_val << ".y;";
        src.new_line() << sum << ".y += "
            << mat_val << ".x * " << vec_val << ".y + "
            << mat_val << ".y * " << vec_val << ".x;";
    }
};

} // namespace sparse
} // namespace vex

int main() {
    vex::Context ctx(vex::Filter::Env && vex::Filter::Count(1));
    std::cout << ctx << std::endl;

    // 4x4 diagonal matrix in CSR format:
    std::vector<int> ptr = {0,1,2,3,4};
    std::vector<int> col = {0,1,2,3};
    std::vector<std::complex<double>> val = {
        {1.0, 1.0}, {2.0, 2.0}, {3.0, 3.0}, {4.0, 4.0}};

    // complex vector:
    std::vector<std::complex<double>> x = {
        {1.0, 1.0}, {1.0, 1.0}, {1.0, 1.0}, {1.0, 1.0}};

    // Device-side matrix and vectors:
    vex::sparse::matrix<std::complex<double>> A(ctx, 4, 4, ptr, col, val);
    vex::vector<std::complex<double>> X(ctx, x);
    vex::vector<std::complex<double>> Y(ctx, 4);

    Y = A * X;

    std::cout << Y << std::endl;
}
henryiii commented 6 years ago

Boost compute supports complex, by the way.

ddemidov commented 6 years ago

Boost compute supports complex, by the way.

Well, the support is to the same extent vexcl allows to use double2 as complex values. std::complex<double> is converted to double2 on device, but multiplication and division operations are wrong. For example, the following snippet gives 0:1 while the correct answer is -1:0:

#include <iostream>
#include <complex>
#include <boost/compute.hpp>

namespace compute = boost::compute;

int main() {
    compute::command_queue bcq = compute::system::default_queue();
    using compute::lambda::_1;

    const int n = 16;
    std::complex<double> i{0,1};

    compute::vector<std::complex<double>> x(n, bcq.get_context());
    compute::fill(x.begin(), x.end(), i);
    compute::transform(x.begin(), x.end(), x.begin(), _1 * _1);

    std::complex<double> v = x[0];
    std::cout << real(v) << ":" << imag(v) << std::endl;
}
henryiii commented 6 years ago

Okay, that is not intuitive or ideal, I agree. How about just adding the example you had to the examples? VexCL is a bit short on examples.

ddemidov commented 6 years ago

How about just adding the example you had to the examples?

I'll try to find time to do this soon. Or, if you are up to it, a PR would be welcome :).