ORNL / cpp-proposals-pub

Collaborating on papers for the ISO C++ committee - public repo
26 stars 27 forks source link

subspan from current prototyping implementation is not compiling #62

Closed keryell closed 6 years ago

keryell commented 6 years ago

I am playing with the implementation from https://github.com/hcedwar/ornl-cpp-proposals-pub/blob/master/P0009/prototype/include/mdspan It works for me with mdspan but I cannot use subspan(). It is not trivial to understand why.

Here is the example from D0009r8 followed by an example from me.

#include <iostream>
#include <mdspan>

using namespace std::experimental::fundamentals_v3;

int main(int argc, char *argv[]) {
  /* Example from D0009r8 */
  // Create a mapping
  typedef extents<3,dynamic_extent,7> Extents3D;
  layout_right::template mapping<Extents3D> map_right(10);

  // Allocate a basic_mdspan
  int* ptr = new int[3*8*10];
  basic_mdspan<int,Extents3D,layout_right> a(ptr,map_right);

  // Initialize the span
  for(int i0=0; i0<a.extent(0); i0++)
    for(int i1=0; i1<a.extent(1); i1++)
      for(int i2=0; i2<a.extent(2); i2++)
        a(i0,i1,i2) = 10000*i0+100*i1+i2;

  // Create Subspan
  auto a_sub = subspan(a,1,std::pair<int,int>(4,6),std::pair<int,int>(1,6));

  // Print values of subspan
  for(int i0=0; i0<a_sub.extent(0); i0++) {
    for(int i1=0; i1<a_sub.extent(1); i1++)
      std::cout << a_sub(i0,i1) << " ";
    std::cout << std::endl;
  }

  /* Output
     10401 10402 10403 10404 10405
     10501 10502 10503 10504 10505
  */

  /* Some other example */
  using space = mdspan<double, 10, 20, 30>;
  double a[10*20*30];
  const space s { a };
  auto t = subspan(s, all, all, all);
  auto u = subspan(s, 1, 2, 3);
}
hcedwar commented 6 years ago

I haven't gotten around to the subspan implementation in this prototype, is currently '#if 0' https://github.com/hcedwar/ornl-cpp-proposals-pub/blob/master/P0009/prototype/include/mdspan#L851

Subspan is an involved (time consuming) bit of meta-programming which I did in Kokkos and an earlier mdspan prototype: https://github.com/kokkos/array_ref/blob/master/reference/test/test_mdspan.cpp#L298

keryell commented 6 years ago

Ah, I have not noticed this #if 0... So if I replace it by #if 1 it should work, right? :-) I have just tried actually, and... no, it still does not work. ;-)

Yes, I understand this requires complex piece of metaprogramming to have all this working. :-( Nowadays I use Boost.Hana to simplify my metaprogramming work. While it would not be suitable for an integration into some official C++ library, perhaps Boost.Hana could simplify the prototyping/bike-sheding part of the proposal?

crtrott commented 6 years ago

We actually got working prototypes lying around and will try to fix the one in the public repo by the end of the week.

keryell commented 6 years ago

Any progress? I started some developments 2 weeks ago expecting this feature would "just work"™ :-) and it would be nice I can use it at some point. :-) Thanks.

crtrott commented 6 years ago

Hi. It should work now. I have some simple gtest based checks in there, if you find bugs just post and we try to fix them ASAP and also include the use case in the tests so it doesn't get reintroduced.

crtrott commented 6 years ago

Sorry this took so long, we are in "fiscal-year-ending-stress" and everything is coming due now ...

crtrott commented 6 years ago

There are still bugs when trying to compile your example. I will try and fix that. One is that subspan right now requires integer slice specifiers (i.e. things which are not pair or alltype, to be ptrdiff_t). The second is that the last statement (subspan(s,1,2,3)) fails to compile. But this here works:

#include <iostream>
#include <mdspan>

using namespace std::experimental::fundamentals_v3;

int main(int argc, char *argv[]) {
  /* Example from D0009r8 */
  // Create a mapping
  typedef extents<3,dynamic_extent,7> Extents3D;
  layout_right::template mapping<Extents3D> map_right(10);

  // Allocate a basic_mdspan
  int* ptr = new int[3*8*10];
  basic_mdspan<int,Extents3D,layout_right> a(ptr,map_right);

  // Initialize the span
  for(int i0=0; i0<a.extent(0); i0++)
    for(int i1=0; i1<a.extent(1); i1++)
      for(int i2=0; i2<a.extent(2); i2++)
        a(i0,i1,i2) = 10000*i0+100*i1+i2;

  // Create Subspan
  auto a_sub = subspan(a,ptrdiff_t(1),std::pair<int,int>(4,6),std::pair<int,int>(1,6));

  // Print values of subspan
  for(int i0=0; i0<a_sub.extent(0); i0++) {
    for(int i1=0; i1<a_sub.extent(1); i1++)
      std::cout << a_sub(i0,i1) << " ";
    std::cout << std::endl;
  }

  /* Output
     10401 10402 10403 10404 10405
     10501 10502 10503 10504 10505
  */

  {
  /* Some other example */
  using space = mdspan<double, 10, 20, 30>;
  double a[10*20*30];
  const space s { a };
  auto t = subspan(s, all, all, all);
  //auto u = subspan(s, ptrdiff_t(1), ptrdiff_t(2), ptrdiff_t(3));
  }
}
crtrott commented 6 years ago

OK the problem with the last subspan statement is that the extents implementation doesn't like rank-0 right now. I will fix that.

crtrott commented 6 years ago

Ok I fixed the rank-0 issues. Only thing left is the issue about only taking ptrdiff_t as integral values.

crtrott commented 6 years ago

Fixed the ptrdiff_t thing too. Now your example should completely work except the fact that in your "other examples" you redefine "a".

crtrott commented 6 years ago

The std::array based constructors for extents and mdspan are now also there. We need to do another careful check, but I think this now does all the things the proposal says it should do and nothing more.

keryell commented 6 years ago

I've tried on my machine an it works on the sample code ! Great work ! :-) So now I am trying with some real code... Stay tuned.

crtrott commented 6 years ago

Just so you know this is a basic implementation tuned for "simplicity" rather than performance. It shouldn't be super terrible if you use it in simple code, but it will use way more inline levels and recursive template instantiation than would be desirable in a big code base. Basically it doesn't have any specialization's for the typical cases. For example the operator of layouts should probably be specialized for the 0-4D cases, instead of using an implementation which is rank agnostic and can thus be used for mdspans of rank 4,000,000,000 (though I guess C++ compilers have a maximum recursion depth for templates which triggers way before that number).

One would probably also want to have a partial specialization for the default accessor mdspan which simply dereferences pointer instead of using acc_.access(p, offset); .

We are planning to add a second implementation which has those optimizations and additionally will be compatible with CUDA and ROCm usage (i.e. there are macros for function markings).

crtrott commented 6 years ago

Another thing on that: what the implementation does is that it mostly adheres to the desired implementation characteristics described in the paper (e.g. static extents are not explicitly stored as members).

keryell commented 6 years ago

It seems to work for me up to now. I am not that worried about mdspan with a rank of 4,000,000,000, because either most of the dimensions are 0 or you will quickly be bothered by the number of particles estimated in the known universe... Only around $10^{80}$ :-)

keryell commented 6 years ago

I do not think we should bother about non standard language extensions such as CUDA or C++AMP. In SYCL there is no issue with using a function defined in the compilation unit either in host or device context because the compiler can infer where the function is to be executed at the end. Some co-authors of the P0009 paper should remove this limitation of CUDA or C++AMP inside their own companies or just use SYCL. :-)

crtrott commented 6 years ago

Yeah sure: Carter and Bryce here you go. You can get that fixed by November right?

If not I guess I will still do this because we need something for our apps which works on all those platforms ...

keryell commented 6 years ago

After 1 week coding with subspan() I am quite happy. So powerful! :-)

For some other issues, I tried with g++-8 instead of clang++-6 and I hit a minor issue I fixed in https://github.com/ORNL/cpp-proposals-pub/pull/65

crtrott commented 6 years ago

Just merged your fix. And its great that you are happy with it so far.

keryell commented 6 years ago

I think I can close this since I have something working wich clang++-6 and g++-8. Thanks a lot for this amazing feature I was waiting for since I was working on an HPF compiler in the 90's! :-)