waylonflinn / weblas

GPU Powered BLAS for Browsers :gem:
MIT License

Basic Pipelining #16

Closed: waylonflinn closed this issue 8 years ago

waylonflinn commented 8 years ago

Moving data to and from GPU memory is a large bottleneck. Good performance for the target applications (machine learning, neural networks) requires a mechanism for easily reusing the results of previous calculations, which are already resident in GPU memory, in subsequent operations.

Here's a potential syntax for this:

var P = weblas.pipeline(),
    O = P.output(); // special variable standing in for the previous step's output

var min = 0.0,
    width = 3,
    stride = 2;

// Convolution -> ReLU -> Pool
P.saxpy(m*n, a, A, y)                   // Conv.1
  .sgemm(m, n, k, alpha, O, B, beta, C) // Conv.2
  .max(O, min)                          // ReLU
  .maxPatch(O, width, stride)           // Pool
  .execute(function(err, result){       // async execution
    console.log(result);
  });
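
To illustrate how the chaining above could work, here is a minimal sketch. It is not weblas code: `createPipeline`, `cpuOps`, and `scale` are hypothetical names, and plain array functions stand in for the GPU kernels. The idea is that each chained call only records a step, and `execute` replaces the output sentinel `O` with the previous step's result, so intermediate results are never handed back to the caller. In a real implementation those intermediates would presumably stay in GPU memory instead of being plain arrays.

```javascript
// Hypothetical sketch, not part of weblas. CPU functions stand in for GPU kernels.
function createPipeline(ops) {
  var OUTPUT = {};   // sentinel meaning "result of the previous step"
  var steps = [];    // recorded steps; nothing runs until execute()

  var P = {
    output: function () { return OUTPUT; },

    execute: function (callback) {
      // defer so the callback always fires asynchronously
      setTimeout(function () {
        var previous = null;
        try {
          steps.forEach(function (step) {
            // swap the sentinel for the intermediate result
            var args = step.args.map(function (arg) {
              return arg === OUTPUT ? previous : arg;
            });
            previous = ops[step.name].apply(null, args);
          });
        } catch (err) {
          return callback(err);
        }
        callback(null, previous);
      }, 0);
    }
  };

  // one chainable method per operation; each call records a step and returns P
  Object.keys(ops).forEach(function (name) {
    P[name] = function () {
      steps.push({ name: name, args: Array.prototype.slice.call(arguments) });
      return P;
    };
  });

  return P;
}

// CPU stand-ins for two element-wise kernels (assumed, for illustration only)
var cpuOps = {
  scale: function (alpha, x) { return x.map(function (v) { return alpha * v; }); },
  max: function (x, min) { return x.map(function (v) { return Math.max(v, min); }); }
};

var P = createPipeline(cpuOps),
    O = P.output();

P.scale(2, [-1, 0.5, 3])  // first step takes ordinary input
  .max(O, 0.0)            // ReLU applied to the previous result
  .execute(function (err, result) {
    if (err) throw err;
    console.log(result);  // logs [ 0, 1, 6 ]
  });
```

Recording steps up front, rather than running each call immediately, leaves room to decide at `execute` time which intermediates need to be read back at all. Only the final result is passed to the callback.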