mratsim / laser

The HPC toolbox: fused matrix multiplication, convolution, data-parallel strided tensor primitives, OpenMP facilities, SIMD, JIT Assembler, CPU detection, state-of-the-art vectorized BLAS for floats and integers
Apache License 2.0

Fused assignation shortcut #18

Open mratsim opened 5 years ago

mratsim commented 5 years ago

Currently, the way to implement a fast sigmoid would be:

var x = randomTensor([1000, 1000], 1.0)
var output = newTensor[float64](x.shape)
forEach o in output, xi in x:
  o = 1 / (1 + exp(-xi))

which is quite wordy.

Reusing the Arraymancer broadcasting syntax, this would be:

let output = 1 ./ (1 .+ exp(-x))

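but this would allocate a temporary tensor for each intermediate result (-x, exp(-x), 1 .+ exp(-x)) in addition to the final output.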

Unfortunately, we cannot use anything other than = in a let/var statement, so something like let x .= 1 / (1 + exp(-x)) is not possible. But we can use let x = fuse: 1 / (1 + exp(-x)) to request that the forEach loop be generated automatically.
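
To make the allocation cost concrete, here is a minimal sketch in plain Nim that uses seq[float64] instead of Laser tensors. The proc names sigmoidUnfused and sigmoidFused are hypothetical and are not part of Laser or Arraymancer. The first proc mimics what an operator-by-operator broadcast would do, and the second is roughly the single loop a fuse block would be expected to generate.

```nim
import std/math

# Unfused: every elementwise step materializes its own temporary buffer,
# mirroring an expression evaluated one broadcast operator at a time.
proc sigmoidUnfused(x: seq[float64]): seq[float64] =
  var neg = newSeq[float64](x.len)        # temporary for -x
  for i in 0 ..< x.len:
    neg[i] = -x[i]
  var e = newSeq[float64](x.len)          # temporary for exp(-x)
  for i in 0 ..< x.len:
    e[i] = exp(neg[i])
  var denom = newSeq[float64](x.len)      # temporary for 1 + exp(-x)
  for i in 0 ..< x.len:
    denom[i] = 1.0 + e[i]
  result = newSeq[float64](x.len)         # final output
  for i in 0 ..< x.len:
    result[i] = 1.0 / denom[i]

# Fused: one output allocation and a single pass over the input.
proc sigmoidFused(x: seq[float64]): seq[float64] =
  result = newSeq[float64](x.len)
  for i in 0 ..< x.len:
    result[i] = 1.0 / (1.0 + exp(-x[i]))
```

Both procs compute the same values. The fused version allocates one buffer instead of four and reads the input once, which is the behavior the proposed fuse shortcut is meant to provide without writing the loop by hand.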