Closed sebastien closed 5 years ago
The first version of between
could be implemented without a conditional like this:
between(v,a,b) = bit(v < a || v > b);
The bit
function maps a boolean value true
or false
to 1
or 0
. This would be efficient on a GPU.
Note, this requires v
to be a scalar. If this version of the function were vectorized, so that the input v could be a vector, then the result would also be a vector.
I don't like the idea of changing sum
to accept a scalar as an argument. This would introduce an inconsistency with the family of functions that sum
is part of, including product
, max
, min
, union
, etc, and would also be inconsistent with the way that the sum
operator works in array languages generally.
For background, Curv is intended to be an array language. The first array language was APL. There are many array languages today, and there is a standard set of operations and idioms that I want to be consistent with. Array languages work on generalized multidimensional arrays of numbers, which are often called 'tensors'. (Google's TensorFlow is an array language.) A number is a rank 0 tensor. A list of numbers (or vector) is a rank 1 tensor. A matrix is a rank 2 tensor. And so on.
There is a standard array language operation called ravel
or flatten
which converts an arbitrary tensor to a 1D array of numbers. As a special case, ravel(42) returns [42]: it converts a number to a single element list containing that number.
If you want to apply sum
to a value a
, where a
is either a scalar or a vector, and you want the scalar to be treated as a single element vector, then I think the conventional array language approach is to write sum(ravel(a))
.
I am in favour of adding ravel
or flatten
to Curv. The name ravel
is traditional (it comes from APL), and is shorter. The name flatten
is longer, and more well known, and maybe more understandable, but the flatten
operation in languages like Ruby and Scala does not work on scalars. Python's numpy library has a ravel
function that behaves as I described, and numpy is one of the more popular array languages.
Consistency is definitely more important, and adding a ravel
function seems like the best choice as it's rooted in history. Thanks a lot for the context.
I was trying to implement a
between
without conditionals, ie. from:to
but I realized this does not work with
v
as non-vector/list assum(1)
fails. I'm not sure about the state of polymorphic functions in curv, but I'd be happy to submit a patch to the standard library to support that if it doesn't present a performance risk.Note that in practice, I assume that the first implementation is going to be more effective performance-wise... but maybe not! I've heard that conditionals should be avoided on the GPU.