eaplatanios / tensorflow_scala

TensorFlow API for the Scala Programming Language
http://platanios.org/tensorflow_scala/
Apache License 2.0

Is higher order ops like fold and map_fn implemented? #143

Closed: doofin closed this issue 3 years ago

eaplatanios commented 5 years ago

Yes, for some. TF Scala already supports TensorFlow functions, and they are used for some higher-order ops, such as Dataset.map and Dataset.filter. There is no support for fold or mapFn currently, but they can easily be added since all the building blocks are there. If there is strong interest and good use cases, I can add support very soon.
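
For reference, a minimal sketch of what those dataset ops look like in use. The constructor name tf.data.datasetFromTensorSlices is taken from the library's examples; the exact signatures may differ between versions, so treat this as an assumption rather than the definitive API:

import org.platanios.tensorflow.api._

// Hypothetical data: a small 1-D Float tensor turned into a dataset of scalar elements.
val xs = Tensor(1.0f, 2.0f, 3.0f, 4.0f)
val dataset = tf.data.datasetFromTensorSlices(xs)
  .filter((x: Output[Float]) => x > tf.constant(2.0f))  // keep elements greater than 2
  .map((x: Output[Float]) => x * x)                     // square the surviving elements

Both map and filter take graph-building functions over Output values, which is what the TensorFlow-functions support mentioned above provides.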

doofin commented 5 years ago

It seems that Dataset.map and Dataset.filter do not operate on Tensors? The closest I found is whileLoop, which can be used to implement those HOFs. These higher-order functions are important because they underpin so-called "differentiable programming", which is the theory underlying supervised learning (but not reinforcement learning), and with them we can easily combine, for example, an LSTM and a CNN with CTC, just like in ordinary functional programming! As someone who is not familiar with non-functional programming, thank you so much for this great library!

eaplatanios commented 5 years ago

Yes, the dataset ops operate on datasets over nested structures of tensors, but you can use whileLoop to implement HOFs like the ones you describe. In fact, that's how I was going to implement them, so if you're up for it, I'd be very happy to welcome your contribution and help you with any questions you may have. :) And thank you...I'm actually very happy to hear that you're finding this library useful and to see that people are using it. :)
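
For what it's worth, a fold over the leading dimension can be sketched on top of tf.whileLoop in the same way as the map_fn sketch in the next comment (whose imports are assumed here). The name fold_fn and the exact whileLoop/TensorArray usage are assumptions, not an official API:

// Hypothetical fold over the first dimension of `tensor`, built on tf.whileLoop.
def fold_fn[I: TF, A: TF](
    tensor: Output[I],                        // input of shape [n, ...]
    initial: Output[A],                       // initial accumulator value
    fn: (Output[A], Output[I]) => Output[A]   // combines the accumulator with one slice
): Output[A] = {
  val len: Output[Int] = tf.shape(tensor)(0)
  // Split the input along its first dimension into a TensorArray of slices.
  val input = TensorArray.create[I](len).unstack(tensor)
  tf.whileLoop[(Output[A], Output[Int]), (Shape, Shape)](
    predicateFn   = _._2 < len,
    bodyFn        = { case (acc, i) => (fn(acc, input.read(i)), i + 1) },
    loopVariables = (initial, 0)
  )._1  // return the final accumulator, dropping the loop counter
}

// For example, summing the rows of a [n, 3] matrix:
// val total = fold_fn[Float, Float](matrix, tf.zeros[Float](Shape(3)), _ + _)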

DirkToewe commented 5 years ago

As a temporary solution, a map_fn implementation would look something like this:

// Assumes the standard top-level import; depending on the version, TensorArray
// may additionally need to be imported from org.platanios.tensorflow.api.ops.
import org.platanios.tensorflow.api._

def map_fn[I: TF, O: TF](
    tensor: Output[I],
    fn: (Output[I], Output[Int]) => Output[O]
): Output[O] = {
  // Number of slices along the first dimension.
  val len: Output[Int] = tf.shape(tensor)(0)
  // Split the input along its first dimension into a TensorArray of slices.
  val input = TensorArray.create[I](len).unstack(tensor)

  tf.whileLoop[(TensorArray[O], Output[Int]), (Shape, Shape)](
    predicateFn = _._2 < len,
    bodyFn = {
      // Apply `fn` to the i-th slice and write the result to position i.
      case (res, i) => (res.write(i, fn(input.read(i), i)), i + 1)
    },
    loopVariables = (TensorArray.create[O](len), 0)
  ) match {
    // Stack the written slices back into a single output tensor.
    case (result, _) => result.stack()
  }
}

// Usage example: feed a 2x3 matrix; row i gets (i*10 + 10) added to it.
val in  = tf.placeholder[Float](Shape(-1, -1))
val out = map_fn[Float, Float](in, (row, i) => row + (i * 10 + 10).castTo[Float])

val sess = Session()
try {
  val in_feed: Tensor[Float] = Tensor(
    Array(1f, 2f, 3f),
    Array(4f, 5f, 6f)
  )
  // Run the graph, feeding the placeholder and fetching the mapped output.
  val result = sess.run(FeedMap(in -> in_feed), out)
  println(result.summarize())
} finally {
  sess.close()
}

The problem is that the Tensorflow4Python implementation of map_fn is so convoluted and does such weird stuff for name scoping and synchronization that I don't feel comfortable enough converting it to Tensorflow4Scala properly and filing a PR.

Hope this still helps

DirkToewe commented 5 years ago

It would be pretty cool if Output could be used in Scala's for-comprehension, something like:

val newTensor: Output[Double]
  = for( row <- myTensor;
         if tf.norm(row) > 2;
         x <- row )
      yield 2*x

As soon as I get around to it, I'm gonna try and see if that's at all possible.

EDIT: That absolutely works, see gist.
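
For anyone curious how that can work without reading the gist: for-comprehensions desugar to map, flatMap, and withFilter, so an implicit wrapper that adds those methods to Output (for instance on top of the map_fn above) is enough to enable the syntax. A rough sketch, with all names hypothetical:

// Hypothetical wrapper enabling for-comprehension syntax over the rows of an Output.
implicit class OutputForOps[T: TF](val self: Output[T]) {
  // `map` over the leading dimension, delegating to the map_fn defined above.
  def map[O: TF](f: Output[T] => Output[O]): Output[O] =
    map_fn[T, O](self, (row, _) => f(row))
  // withFilter and flatMap could be implemented along the same lines, e.g. using
  // tf.booleanMask for filtering and a concatenating variant of map_fn for flatMap.
}

// With such a wrapper in scope, `for (row <- myTensor) yield row * 2`
// desugars to `myTensor.map(row => row * 2)`.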