eaplatanios / tensorflow_scala

TensorFlow API for the Scala Programming Language
http://platanios.org/tensorflow_scala/
Apache License 2.0

createWithNameScope does not work #78

Closed rikvdkleij closed 6 years ago

rikvdkleij commented 6 years ago

Exception:

Exception in thread "main" org.platanios.tensorflow.api.core.package$exception$ShapeMismatchException: Trying to share variable 'weights', but the specified shape '[3, 4]' is not compatible with the existing variable shape '[4, 3]'.

when I use this:

    val encoded  = tf.createWithNameScope("encode") {
      val weights = tf.variable(name = "weights", initializer = RandomNormalInitializer(), shape = Shape(inputDim, hiddenDim), dataType = FLOAT32)
      val biases = tf.variable(name = "biases", dataType = tf.zeros(dataType = FLOAT32, shape = Shape(hiddenDim)).dataType, shape = Shape(hiddenDim))
      tf.tanh(tf.matmul(inputLayer, weights) + biases)
    }

    val decoded  = tf.createWithNameScope("decode") {
      val weights = tf.variable(name = "weights", initializer = RandomNormalInitializer(), shape = Shape(hiddenDim, inputDim), dataType = FLOAT32)
      val biases = tf.variable(name = "biases", dataType = tf.zeros(dataType = FLOAT32, shape = Shape(inputDim)).dataType, shape = Shape(hiddenDim))
      tf.matmul(encoded(), weights) + biases
    }
eaplatanios commented 6 years ago

@rikvdkleij That is actually not a problem. It has to do with the semantics of name scopes versus variable scopes. Check this out for a detailed description. I've been meaning to add this information to my documentation but haven't gotten to it yet. Does using a variable scope resolve your problem?
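The distinction can be sketched with a toy model (illustrative only, not the tensorflow_scala implementation; all names here are hypothetical): a name scope prefixes op names but leaves variable names untouched, so two name scopes that both ask for "weights" hit the same stored variable, while a variable scope prefixes variable names too, giving each scope its own.

```scala
import scala.collection.mutable

// Toy model of name scopes vs. variable scopes. `getVariable` looks a
// variable up by its fully scoped name and fails on a shape mismatch,
// mimicking the ShapeMismatchException seen above.
object ScopeSketch {
  case class Var(name: String, shape: Seq[Int])
  private val store = mutable.Map.empty[String, Var]
  private var variablePrefix = ""

  def getVariable(name: String, shape: Seq[Int]): Var = {
    val fullName = variablePrefix + name
    store.get(fullName) match {
      case Some(v) if v.shape != shape =>
        throw new IllegalStateException(
          s"Trying to share variable '$fullName', but the specified shape " +
            s"'$shape' is not compatible with the existing shape '${v.shape}'.")
      case Some(v) => v
      case None =>
        val v = Var(fullName, shape)
        store(fullName) = v
        v
    }
  }

  // A name scope only affects op names, so variable names are untouched.
  def withNameScope[A](scope: String)(body: => A): A = body

  // A variable scope prefixes variable names, so each scope is isolated.
  def withVariableScope[A](scope: String)(body: => A): A = {
    val saved = variablePrefix
    variablePrefix = saved + scope + "/"
    try body finally variablePrefix = saved
  }
}
```

Under this model, creating "weights" with shape (3, 4) inside name scope "encode" and then with shape (4, 3) inside name scope "decode" collides on the single name "weights", whereas the variable-scope versions create "encode/weights" and "decode/weights" independently.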

rikvdkleij commented 6 years ago

So I understand I have to use createWithVariableScope, but I cannot set the right reuse value: every setting reuses existing variables, and omitting the reuse value does not work either.

It looks like the Scala API works differently from the Python API. The code above works in the Python version.

rikvdkleij commented 6 years ago

Btw, are you planning to add more optimizers, like RMSPropOptimizer?

eaplatanios commented 6 years ago

I'm sorry about that. It turns out it was a bug. I fixed it and I'm currently releasing new snapshot artifacts (they should be available pretty soon). The following code now works fine:

val inputDim = 10
val hiddenDim = 100
val inputLayer = tf.placeholder(FLOAT32, Shape(-1, inputDim))

val encoded = tf.createWithVariableScope("encode", reuse = tf.CreateNewVariableOnly) {
  val weights = tf.variable(name = "weights", initializer = RandomNormalInitializer(), dataType = FLOAT32, shape = Shape(inputDim, hiddenDim))
  val biases = tf.variable(name = "biases", initializer = ZerosInitializer, dataType = FLOAT32, shape = Shape(hiddenDim))
  tf.tanh(tf.matmul(inputLayer, weights) + biases)
}

val decoded = tf.createWithVariableScope("decode", reuse = tf.CreateNewVariableOnly) {
  val weights = tf.variable(name = "weights", initializer = RandomNormalInitializer(), dataType = FLOAT32, shape = Shape(hiddenDim, inputDim))
  val biases = tf.variable(name = "biases", initializer = ZerosInitializer, dataType = FLOAT32, shape = Shape(inputDim))
  tf.matmul(encoded, weights) + biases
}

Regarding RMSProp and other optimizers, at this point I only plan to add them on an as-needed basis for my research. By the way, it's very easy to add the ones you need yourself and submit a pull request. If you look at my current implementations in Scala and at the equivalent ones in the Python API, you will see that it's pretty easy to add support for others; most of the work is already done in the optimizer trait that you will need to extend. In any case, pull requests would be greatly appreciated and I would be more than happy to help you with any issues that come up. :)
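For reference, the math an RMSProp optimizer implements is a small amount of per-parameter state on top of gradient descent. Here is a minimal plain-Scala sketch of the standard RMSProp update rule (a decay-averaged mean of squared gradients scales each step); the object and parameter names are illustrative and this is not the tensorflow_scala Optimizer API:

```scala
// Toy RMSProp update on a flat parameter array. State: `meanSq`, the
// exponentially decayed average of squared gradients, one entry per
// parameter. Each step divides the gradient by sqrt(meanSq) + epsilon.
object RMSPropSketch {
  def step(
      params: Array[Double],
      grads: Array[Double],
      meanSq: Array[Double],      // running average of squared gradients
      lr: Double = 0.001,
      decay: Double = 0.9,
      epsilon: Double = 1e-8
  ): Unit = {
    var i = 0
    while (i < params.length) {
      meanSq(i) = decay * meanSq(i) + (1.0 - decay) * grads(i) * grads(i)
      params(i) -= lr * grads(i) / (math.sqrt(meanSq(i)) + epsilon)
      i += 1
    }
  }
}
```

An optimizer implementation in the library would express the same two lines as graph ops and keep `meanSq` as a slot variable per trainable variable, which is the bulk of what the optimizer trait already handles.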

rikvdkleij commented 6 years ago

Thanks!!

eaplatanios commented 6 years ago

No problem! Thanks for giving this a try and reporting things that come up! It really helps me a lot with fixing bugs and improving the library. :)