eaplatanios / tensorflow_scala

TensorFlow API for the Scala Programming Language
http://platanios.org/tensorflow_scala/
Apache License 2.0

Nested arrays to tensor #163

Closed GratefulTony closed 4 years ago

GratefulTony commented 5 years ago

I was having some difficulty finding the best way to create a tensor from an array of arrays.

Originally, I used a simple approach that creates an individual tensor for each sub-array and stacks them using Tensor(Tensor(...), Tensor(...)). This approach was very slow, perhaps because it allocates many tensor objects?

I experimented with simply passing my data to a byte buffer and creating a tensor on top of the buffer. This was much faster. To reduce boilerplate and allow some testing, I extracted the behavior. See my implementation here: https://github.com/GratefulTony/tensorflow_scala/blob/master/modules/api/src/main/scala/org/platanios/tensorflow/api/utilities/EasyFastArraysToTensor.scala

The code detects the shape of the incoming nested array structure and uses a typeclass to supply the toByteArray and byteArraySize methods for the element type. From this information and a recursively generated flat byte buffer, it constructs the tensor.

In my benchmark, this is significantly faster than constructing and stacking individual tensors.
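
To make the shape detection and typeclass-driven flattening described above more concrete, here is a minimal sketch in plain Scala. It is not the linked implementation: the names `ByteEncoder`, `NestedArrays`, `shapeOf`, and `toBuffer` are hypothetical, it writes elements straight into a `java.nio.ByteBuffer` instead of building intermediate byte arrays, and it stops short of calling any tensorflow_scala API. The last step, creating a tensor on top of the buffer, is left out.

```scala
import java.nio.{ByteBuffer, ByteOrder}

// Hypothetical typeclass: the byte size of one element and how to write it.
trait ByteEncoder[T] {
  def byteSize: Int
  def put(buffer: ByteBuffer, value: T): Unit
}

object ByteEncoder {
  implicit val floatEncoder: ByteEncoder[Float] = new ByteEncoder[Float] {
    val byteSize: Int = 4
    def put(buffer: ByteBuffer, value: Float): Unit = { buffer.putFloat(value); () }
  }
  implicit val intEncoder: ByteEncoder[Int] = new ByteEncoder[Int] {
    val byteSize: Int = 4
    def put(buffer: ByteBuffer, value: Int): Unit = { buffer.putInt(value); () }
  }
}

object NestedArrays {
  // Infer the shape by following the first element at each nesting level.
  def shapeOf(data: Any): List[Int] = data match {
    case a: Array[_] if a.isEmpty => List(0)
    case a: Array[_]              => a.length :: shapeOf(a.head)
    case _                        => Nil
  }

  // Recursively write every element in row-major order, rejecting ragged input.
  private def write[T](data: Any, shape: List[Int], buffer: ByteBuffer)(
      implicit enc: ByteEncoder[T]): Unit = data match {
    case a: Array[_] =>
      shape match {
        case dim :: rest if a.length == dim =>
          a.foreach(elem => write[T](elem, rest, buffer))
        case _ =>
          throw new IllegalArgumentException("Nested arrays are not rectangular.")
      }
    case value =>
      if (shape.nonEmpty)
        throw new IllegalArgumentException("Nested arrays are not rectangular.")
      // Assumes the caller chose T to match the actual element type.
      enc.put(buffer, value.asInstanceOf[T])
  }

  // Returns the detected shape and a flat buffer that is ready for reading.
  def toBuffer[T](data: Any)(implicit enc: ByteEncoder[T]): (List[Int], ByteBuffer) = {
    val shape = shapeOf(data)
    val buffer = ByteBuffer
      .allocateDirect(shape.product * enc.byteSize)
      .order(ByteOrder.nativeOrder())
    write[T](data, shape, buffer)
    buffer.flip()
    (shape, buffer)
  }
}
```

Under these assumptions, `NestedArrays.toBuffer[Float](Array(Array(1f, 2f, 3f), Array(4f, 5f, 6f)))` yields the shape `List(2, 3)` together with a 24-byte buffer. The element type has to be given explicitly because the input is traversed as `Any`; a single allocation for the whole structure is what avoids creating one tensor per sub-array.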

eaplatanios commented 4 years ago

Thanks for the suggestion @GratefulTony and sorry for taking so long to respond. I added a slightly simplified version of this in d827e0d3.