Bears-R-Us / arkouda

Arkouda (αρκούδα): Interactive Data Analytics at Supercomputing Scale :bear:

Add more types like ak.uint64 #9

Closed mhmerrill closed 4 years ago

mhmerrill commented 5 years ago

Add uint64, or decide (probably not) to use uint64 or int64 only. Also consider types like uint16, uint32, and float32.

reuster986 commented 5 years ago

I think adding additional bit-widths to existing types (int and real) will work without implementing any new operator overloads. Consider the following test code that successfully adds a Block int(8) array and a Block int(32) array without any size-aware code in the + operator:

use BlockDist;

const dom = {0..10};
var A: [dom] int(8);
var B: [dom] int(32);
for i in dom {
  A[i] = i: A.eltType;
  B[i] = (i << 20): B.eltType;
}
var Ad = new DistArray(A);
var Bd = new DistArray(B);
writeln("Ad = ", writeTyped(Ad));
writeln("Bd = ", writeTyped(Bd));
var Cd = Ad + Bd;
writeln("Ad + Bd = ", writeTyped(Cd));
var Dd = Bd + Ad;
writeln("Bd + Ad = ", writeTyped(Dd));

record DistArray {
  type etype;
  var dom;
  var arr: [dom] etype;

  proc init(arr: [?D] ?etype) {
    this.etype = etype;
    this.dom = {0..#D.size} dmapped Block(boundingBox={0..#D.size});
    this.arr = arr;
  }
}

proc +(x: DistArray, y: DistArray) {
  var zarr = x.arr + y.arr;
  var z = new DistArray(zarr);
  return z;
}

proc writeTyped(a: DistArray) {
  // A standalone proc has no `this`, so read the fields from the argument
  return "[%s] %s [%s]".format(a.dom.type: string,
                   a.etype: string,
                   a.arr: string);
}

This is the same operator overload pattern arkouda uses for binopvv. I might try this experiment with GenSymEntrys on a new branch and see if I can make it work. If it works similarly to the example above, then I think the only changes to arkouda would be:

mppf commented 5 years ago

Here is a cute little program showing the virtual dispatch way to handle many types. (In some ways it relies on things like Python's __radd__, but you don't have to handle each type separately. Note that you might need some special cases with param-conditionals.)

class GenSymEntry {
  proc dispatchOp(op:string, rhs:borrowed GenSymEntry): owned GenSymEntry {
    halt("pure virtual method");
  }
  proc doOp(op:string, lhs): owned GenSymEntry {
    halt("pure virtual method");
  }
}

class SymEntry : GenSymEntry {
  type etype;
  var aD;
  var a: [aD] etype;

  proc init(a:[]) {
    this.etype = a.eltType;
    this.aD = a.domain;
    this.a = a;
  }

  // Evaluates lhs <op> this
  override proc doOp(op:string, lhs:borrowed SymEntry): owned GenSymEntry {
    var rhs:borrowed SymEntry = this;

    // At this point, lhs and rhs have compile-time known types
    // (i.e. they are instantiations of SymEntry, rather than GenSymEntry)
    select op
    {
      when "+" {
        var result = lhs.a + rhs.a;
        return new owned SymEntry(result);
      }
      when "-" {
        var result = lhs.a - rhs.a;
        return new owned SymEntry(result);
      }
    }
    return nil;
  }

  // Evaluate this <op> rhs
  // through a double-dispatch to get the concrete type available
  override proc dispatchOp(op:string,
                           rhs:borrowed GenSymEntry): owned GenSymEntry {
    return rhs.doOp(op, this); // pass concrete "this", dispatch on other
  }
}

var ones: owned GenSymEntry = new owned SymEntry([1,1,1,1]);
var nums: owned GenSymEntry = new owned SymEntry([1,2,3,4]);
var rls:  owned GenSymEntry  = new owned SymEntry([0.1, 0.2, 0.3, 0.4]);

writeln(ones.dispatchOp("+", nums));
writeln(nums.dispatchOp("-", ones));
writeln(rls.dispatchOp("+", ones));

Thanks to Paul Cassella (who doesn't have access to this repo AFAIK) for some of the ideas here.

In this program, the compiler automatically does the "stamping out" of the doOp function for the different array types and the different combinations of them. For non-binary operators, the double dispatch would not be necessary; single dispatch probably suffices. Also, this program doesn't handle the vector + scalar case, but perhaps the best way to handle that is with something like a generic dispatchOpScalar.
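For readers more at home on the Python side of arkouda, here is a rough Python analogue of the double-dispatch shape described above. It is only a sketch: Python has no compile-time instantiation, so nothing is "stamped out" here, and the names `GenEntry`, `Entry`, `dispatch_op`, `do_op`, and `dispatch_op_scalar` are hypothetical stand-ins rather than arkouda or Chapel APIs. It also shows one possible shape for the vector + scalar case, which needs only a single dispatch.

```python
import operator

# Operator table; stands in for the `select op` statement in the Chapel program.
OPS = {"+": operator.add, "-": operator.sub}


class GenEntry:
    """Type-erased handle, playing the role of GenSymEntry (hypothetical)."""

    def dispatch_op(self, op, rhs):
        raise NotImplementedError("pure virtual method")

    def do_op(self, op, lhs):
        raise NotImplementedError("pure virtual method")

    def dispatch_op_scalar(self, op, scalar):
        raise NotImplementedError("pure virtual method")


class Entry(GenEntry):
    """Concrete entry holding the data, playing the role of SymEntry."""

    def __init__(self, values):
        self.values = list(values)

    def do_op(self, op, lhs):
        # Second dispatch: evaluates lhs <op> self, with both sides concrete.
        f = OPS[op]
        return Entry(f(a, b) for a, b in zip(lhs.values, self.values))

    def dispatch_op(self, op, rhs):
        # First dispatch: evaluates self <op> rhs by passing the concrete
        # `self` along and letting rhs resolve its own concrete type.
        return rhs.do_op(op, self)

    def dispatch_op_scalar(self, op, scalar):
        # Vector <op> scalar: only one side is type-erased, so a single
        # dispatch on `self` is enough.
        f = OPS[op]
        return Entry(f(a, scalar) for a in self.values)


ones = Entry([1, 1, 1, 1])
nums = Entry([1, 2, 3, 4])
halves = Entry([0.5, 1.5, 2.5, 3.5])

total = ones.dispatch_op("+", nums)
diff = nums.dispatch_op("-", ones)
mixed = halves.dispatch_op("+", ones)
shifted = nums.dispatch_op_scalar("+", 10)
```

Because `do_op` receives the left operand already resolved, the operand order is preserved for non-commutative operators such as `-`, which is the same reason the Chapel version passes `this` as `lhs`.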