headmyshoulder / odeint-v2

odeint - solving ordinary differential equations in c++ v2
http://headmyshoulder.github.com/odeint-v2/
Other
341 stars 101 forks source link

problems with VexCL and controlled steppers #110

Closed ds283 closed 8 years ago

ds283 commented 10 years ago

This may be related to issue #107, although it does not look exactly the same.

I am trying to solve a system of ODEs using VexCL. I encountered the problem described in #107 and am therefore using using the branches 'vexcl-adaptive-steppers' and 'vexcl-mvec-copy' from http://github.com/ddemidov/odeint-v2. I am using the up-to-date master branch of VexCL.

-- My implementation works OK with plain runge_kutta4 or runge_kutta_dopri5 (no adaptive error control) and integrate_times, but fails if I try to use runge_kutta_fehlberg78 because the resulting OpenCL kernel does not build:

ptxas application ptx input, line 2083; error   : Kernel 'vexcl_multivector_kernel' exceeds parameter space limit of 4352 bytes
ptxas fatal   : Ptx assembly aborted due to errors
libc++abi.dylib: terminating with uncaught exception of type cl::Error: clBuildProgram

The kernel it generates is quite long, but I can reproduce it if that would be helpful.

-- When I try to wrap runge_kutta_dopri5 or runge_kutta_fehlberg78 with make_controlled<>, nothing works. The fehlberg78 stepper compiles, but fails to build the kernel on execution in the same way as above. The dopri5 stepper does not even compile, with the error

In file included from /opt/local/include/boost/numeric/odeint.hpp:32:
In file included from /opt/local/include/boost/numeric/odeint/stepper/runge_kutta_dopri5.hpp:34:
/opt/local/include/boost/numeric/odeint/util/same_instance.hpp:40:16: error: no viable conversion from 'const typename boost::proto::detail::enable_binary<deduce_domain, deduce_domain::proto_grammar, boost::mpl::or_<is_extension<const multivector_expression<basic_expr<address_of, list1<const multivector<double, 20> &>, 1> > >, is_extension<const multivector_expression<basic_expr<address_of, list1<const multivector<double, 20> &>, 1> > > >, boost::proto::tag::equal_to, const multivector_expression<basic_expr<address_of, list1<const multivector<double, 20> &>, 1> > &, const multivector_expression<basic_expr<address_of, list1<const multivector<double, 20> &>, 1> > &>::type' (aka 'const multivector_expression<boost::proto::exprns_::basic_expr<boost::proto::tagns_::tag::equal_to, boost::proto::argsns_::list2<vex::multivector_expression<boost::proto::exprns_::basic_expr<boost::proto::tagns_::tag::address_of, boost::proto::argsns_::list1<const vex::multivector<double, 20> &>, 1> >, vex::multivector_expression<boost::proto::exprns_::basic_expr<boost::proto::tagns_::tag::address_of, boost::proto::argsns_::list1<const vex::multivector<double, 20> &>, 1> > >, 2> >') to 'bool'
        return (&x1 == &x2);
               ^~~~~~~~~~~~

I have tried to extract a minimal example which reproduces this compilation failure. For a reason that I have been unable to understand, when I take the ODE component from my code and compile it on its own, I reproduce the same pattern of errors (ie., fehlberg78 compiles but won't build the kernel; dopri5 works without error control, but fails with make_controlled<>), but not precisely the same compiler error. This time, for dopri5, the compiler reports

In file included from /Users/ds283/Documents/Code/vexcl-test/vexcl-test/main.cpp:14:
In file included from /opt/local/include/boost/numeric/odeint.hpp:25:
In file included from /opt/local/include/boost/numeric/odeint/util/ublas_wrapper.hpp:23:
In file included from /opt/local/include/boost/numeric/ublas/vector.hpp:19:
In file included from /opt/local/include/boost/numeric/ublas/storage.hpp:21:
In file included from /opt/local/include/boost/serialization/array.hpp:19:
In file included from /opt/local/include/boost/serialization/nvp.hpp:31:
In file included from /opt/local/include/boost/serialization/level.hpp:28:
/opt/local/include/boost/mpl/eval_if.hpp:60:26: error: no type named 'type' in 'boost::range_const_iterator<std::__1::__wrap_iter<double *> >'
    typedef typename f_::type type;
            ~~~~~~~~~~~~~^~~~

In file included from /Users/ds283/Documents/Code/vexcl-test/vexcl-test/main.cpp:14:
In file included from /opt/local/include/boost/numeric/odeint.hpp:25:
In file included from /opt/local/include/boost/numeric/odeint/util/ublas_wrapper.hpp:33:
In file included from /opt/local/include/boost/numeric/odeint/util/state_wrapper.hpp:26:
In file included from /opt/local/include/boost/numeric/odeint/util/resize.hpp:22:
In file included from /opt/local/include/boost/range.hpp:26:
In file included from /opt/local/include/boost/range/functions.hpp:18:
/opt/local/include/boost/range/begin.hpp:49:18: error: no member named 'begin' in 'std::__1::__wrap_iter<double *>'
        return c.begin();
               ~ ^

In file included from /Users/ds283/Documents/Code/vexcl-test/vexcl-test/main.cpp:14:
In file included from /opt/local/include/boost/numeric/odeint.hpp:25:
In file included from /opt/local/include/boost/numeric/odeint/util/ublas_wrapper.hpp:33:
In file included from /opt/local/include/boost/numeric/odeint/util/state_wrapper.hpp:26:
In file included from /opt/local/include/boost/numeric/odeint/util/resize.hpp:22:
In file included from /opt/local/include/boost/range.hpp:26:
In file included from /opt/local/include/boost/range/functions.hpp:19:
/opt/local/include/boost/range/end.hpp:50:22: error: no member named 'end' in 'std::__1::__wrap_iter<double *>'
            return c.end();
                   ~ ^

In file included from /Users/ds283/Documents/Code/vexcl-test/vexcl-test/main.cpp:14:
In file included from /opt/local/include/boost/numeric/odeint.hpp:68:
In file included from /opt/local/include/boost/numeric/odeint/integrate/integrate_times.hpp:27:
/opt/local/include/boost/numeric/odeint/integrate/detail/integrate_times.hpp:89:29: error: indirection requires pointer operand ('int' invalid)
        Time current_time = *start_time++;
                            ^~~~~~~~~~~~~

I am using Xcode 5 and the default Apple-provided clang.

My example is

#define VEXCL_SHOW_KERNELS
#include <iostream>
#include <vector>
#include <boost/numeric/odeint.hpp>
#include "vexcl/vexcl.hpp"
#include "boost/numeric/odeint/external/vexcl/vexcl.hpp"

typedef vex::multivector<double, 20> state;

class ode {
public:
   ode(std::vector<double>& p, double Mp,
       vex::vector<double>& ks,
       vex::multivector<double, 16>& u2)
      : params(p), M_Planck(Mp), k_list(ks), u2_tensor(u2) { }

   void operator()(const state& x, state& dxdt, double t);

private:
   double                        M_Planck;
   std::vector<double>&          params;
   vex::vector<double>&          k_list;
   vex::multivector<double, 16>& u2_tensor;
};

int main() {
   vex::Context ctx(vex::Filter::Type(CL_DEVICE_TYPE_GPU) && vex::Filter::DoublePrecision);

   std::vector<double> p = {1, 1};
   double              M_Planck = 1;

   vex::multivector<double, 16> u2_tensor(ctx, 5);
   vex::multivector<double, 20> x(ctx, 5);

   std::vector<double> k_list = { 1, 1, 1, 1, 1 };
   vex::vector<double> dev_ks(ctx.queue(), k_list);

   ode rhs(p, M_Planck, dev_ks, u2_tensor);

   std::vector<double> times = { 1, 2, 3, 4, 5 };
   for(int i = 0; i < 20; i++) x(i) = 0;

   using namespace boost::numeric::odeint;

//   runge_kutta_dopri5<state> stepper;
//   integrate_times(stepper, rhs, x, times.begin(), times.end(), 1e-10);

   typedef runge_kutta_dopri5<state, double, state, double, vector_space_algebra, default_operations> stepper;
   integrate_times(make_controlled<stepper>(1e-12, 1e-12), rhs, x, times.begin(), times.end(), 1e-10);
}

void ode::operator()(const state& x, state& dxdt, double t) {
  ...
}

It doesn't matter what I put in the () operator of the functor (although I can reproduce it if needed) - these kernels seem to build correctly, and in the cases where I see a compiler error the contents of this function don't seem to affect it.

ddemidov commented 10 years ago

The error with runge_kutta_fehlberg78 says that OpenCL/CUDA limit on total size of kernel parameters has been exceeded. A limit of 4352 bytes means that a kernel may have no more than 544 double or pointer parameters. I am not sure if this is a universal or hardware-dependent limit (because I remember hardware where this limit used to be 256 bytes). ddemidov/vexcl@7bfcbb50dcbcc64cfbbe16d37f6757a984fdfc92 allows user to ask VexCL to split multivector kernel into components by defining the VEXCL_SPLIT_MULTIEXPRESSIONS macro. Could you please try and see if this solves the issue with runge_kutta_fehlberg78?

ddemidov commented 10 years ago

Regarding the error with runge_kutta_dopri5, I've made a pull request (#111) that provides an implementation of same_instance_impl for VexCL containers. @ds283, could you please check if this solves the problem?

ds283 commented 10 years ago

Thanks very much for your speedy response.

For fehlberg78: Yes, defining VEXCL_SPLIT_MULTIEXPRESSIONS allows the kernel to build.

For dopri5: Yes, if I apply de0da9c together with the existing patches for adaptive steppers then it will compile with runge_kutta_dopri5. There seems to be a tiny typo in de0da9c, however: it introduces a new header 'boost/numeric/odeint/external/vexcl/vexcl_same_insatnce.hpp', but I think this should be 'vexcl_same_instance.hpp'.

ddemidov commented 10 years ago

Thanks, I've fixed the typo in the pull request.