Closed peastman closed 5 years ago
matrixOf()
. Perhaps an ndArrayOf()
? Personally I like NDArary.create
better but collectionNameOf()
seems to be the kotlin way of doing things.I also intended to add .reshape()
overloads on built-in receiver types like List and Array, to allow going directly to matrix and nd from the built-in 1-d containers.
ndarrayOf()
would work nicely. Note that reshape()
creates a completely new array, so that would be less efficient than using a shape
argument to create it with the right shape from the start.
Of course, assembling the varargs for the call to ndarrayOf()
already involves creating a regular array, so I suppose there isn't really a difference in efficiency one way or the other.
It's even worse than it looks. Behind the scenes, vararg
is very fond of using Arrays.copy
for no discernible reason. Whenever one of us does get around to implementing this, it we may want to benchmark to see whether codegen-ing overloads with fixed parameters (with a vararg
in the mix as a fallback), saves any meaningful time. My intuition is "probably not, since Arrays.copy
is O(n) and we'd only be avoiding it for very small n," but it seems like it's probably worth testing out.
Here's a very simple implementation.
inline fun <reified T> ndArrayOf(vararg elements: T, shape: IntArray? = null): NDArray<T> {
if (shape == null)
return NDArray.createLinear(elements.size, filler = { elements[it] })
else
return NDArray.createLinear(*shape, filler = { elements[it] })
}
If this looks ok, I can create a PR.
I like the idea of ndArrayOf
existing. As you guys noted already, the simple approaches won't yield an optimized solution-- in addition to whats already been mentioned, the array that is allocated for the vararg elements will be boxed (e.g. java.lang.Double[]). However, this is no worse than mat[]
and matrixOf
which cast everything to an object, so I think we can deal with that as a separate issue from this one.
Your implementation looks reasonable overall. Couple of comments:
The current mat[]
and matrixOf
implementations accept Any
and always write out doubles. The idea behind this is that scientists often like to write mat[1, 2, 3.3]
and would like that to "just work" by returning a Matrix<Double>
even though in static languages like Kotlin the above is mixing integers and doubles and is therefore ambiguous. However, ndArrayOf
as stated above behaves differently; for example, ndArrayOf(1,2,3)
returns a NDArray<Int>
. One way or another I think that ndArrayOf
should be consistent with the behavior of matrixOf
, as having them be inconsistent would probably cause confusion. There's a couple of ways we could go here:
1) Give up on auto-casting types for the user entirely, such that mat[1, 2, 3]
returns a Matrix<Int>
and make mat[1, 2, 3.3]
an error (unless we can find some overloads that would default to double for mixed primitive types)
2) Change the behavior of matrixOf
to match ndArrayOf
but leave the mat[..]
DSL to do the auto-casting to double as it does now. This way, matrixOf
and ndArrayOf
are consistent but mat[...]
is a special case (which it already is by virtue of it being a DSL).
Thoughts?
A small nitpick: I'd suggest making the null handling a bit more kotlin idomatic, e.g.
NDArray.createLinear(dims= *shape ?: intArrayOf(elements.size), filler = { elements[it] })
unless we can find some overloads that would default to double for mixed primitive types
To clarify, the issue with
fun matrixOf(vararg elements: Int): Matrix<Int>
fun matrixOf(vararg elements: Double): Matrix<Double>
fun matrixOf(vararg elements: Any): Matrix<Double>
Is deciding how to encode end
so that it can show up in all of them, and not overwrite a useful value like NaN
.
That's a good question. I guess I would lean toward not casting and taking the values as whatever type they're given. So you would have to write 1.0, 2.0, 3.3
instead of 1, 2, 3.3
. That seems more consistent with how Kotlin works, and it also means you can control the data type. Right now it isn't possible to create a matrix of Ints or Floats with this mechanism.
In practice, I suspect speed will usually be irrelevant for these functions. If you're explicitly putting the matrix elements into your source code, it's probably a pretty small matrix. 20 elements yes, a million elements no.
Closed by #73.
Matrix has a convenient builder syntax that makes it easy to create matrices containing arbitrary values, but there's nothing comparable for NDArray (or if there is, I haven't found it). This would be a useful addition. Possibly the same builder syntax could be extended to arrays, though it's not obvious to me how to handle arbitrary data types and more than two dimensions. Other approaches are also possible. For example, there could be a method that takes ordinary arrays as input. That's the method used by NumPy, although the syntax is less convenient in Kotlin.
Another possibility is to pass the elements directly to the creator function.
That would create a 1D array. For more dimensions, you would list the elements in flattened order, then give the shape as an extra argument.
This is similar to the approach used by ND4J.