eaplatanios / tensorflow_scala

TensorFlow API for the Scala Programming Language
http://platanios.org/tensorflow_scala/
Apache License 2.0

The "evaluate" method of InMemoryEstimator gets stuck indefinitely #141

Closed: Spiess closed this issue 5 years ago

Spiess commented 5 years ago

When I call the evaluate method of the InMemoryEstimator, it does not return, even after several minutes. This happens even with very small evaluation datasets and even when I specify the number of steps.

eaplatanios commented 5 years ago

Sorry, I was caught up with a paper deadline. I would need a code sample that reproduces the error, but one possible cause is that the input pipeline you pass to the evaluate method includes Dataset.repeat(), so the dataset repeats forever and evaluation never finishes. Could you check whether that is the issue? If not, could you post a minimal code sample that reproduces the error?
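To illustrate the repeat() hypothesis, here is a minimal sketch in plain Scala. It uses the standard-library Iterator as a stand-in for a dataset and does not touch the tensorflow_scala API. The names RepeatAnalogy, finiteBatches, repeatedBatches, and countBatches are hypothetical, and the loop is only an assumption about how an evaluation pass consumes its input, not the library's actual implementation.

```scala
object RepeatAnalogy {
  // Hypothetical stand-in for an evaluation dataset: 200 examples in batches of 10.
  def finiteBatches: Iterator[Seq[Int]] = (0 until 200).grouped(10)

  // Analogue of adding repeat() to the pipeline: start over from the first batch forever.
  def repeatedBatches: Iterator[Seq[Int]] =
    Iterator.continually(finiteBatches).flatMap(batches => batches)

  // Hypothetical evaluation loop: consume batches until the input is exhausted.
  def countBatches(batches: Iterator[Seq[Int]]): Int =
    batches.foldLeft(0)((count, _) => count + 1)

  def main(args: Array[String]): Unit = {
    // A finite input ends on its own.
    println(countBatches(finiteBatches)) // prints 20

    // A repeated input only ends if the consumer bounds it explicitly.
    println(countBatches(repeatedBatches.take(50))) // prints 50

    // countBatches(repeatedBatches) would never return, because the input never runs out.
  }
}
```

If the evaluation input were repeated in this way, the consumer would need an explicit bound on the number of steps to terminate.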

Spiess commented 5 years ago

Here's a code sample that reproduces the problem for me:

import org.platanios.tensorflow.api.core.Shape
import org.platanios.tensorflow.api.implicits.helpers.{OutputStructure, OutputToDataType, OutputToShape}
import org.platanios.tensorflow.api.learn.Model
import org.platanios.tensorflow.api.learn.estimators.InMemoryEstimator
import org.platanios.tensorflow.api.learn.layers._
import org.platanios.tensorflow.api.ops.training.optimizers.AdaGrad
import org.platanios.tensorflow.api.ops.variables.{GlorotNormalInitializer, ZerosInitializer}
import org.platanios.tensorflow.api.tensors.Tensor
import org.platanios.tensorflow.api.{FLOAT32, INT64, Output, tf}

object EvaluatorExample {

  def main(args: Array[String]): Unit = {
    implicit val evOutputStructureFloatLong = OutputStructure[(Output[Float], Output[Long])]
    implicit val evOutputToDataTypeFloatLong = OutputToDataType[(Output[Float], Output[Long])]
    implicit val evOutputToShapeFloatLong = OutputToShape[(Output[Float], Output[Long])]

    val testDataTensor = Tensor(0 until 28 * 28 * 200).reshape(Shape(-1, 28, 28))
    val testLabelsTensor = Tensor(0 until 200).reshape(Shape(-1))

    val testImages = tf.data.datasetFromTensorSlices(testDataTensor).map(_.toFloat)
    val testLabels = tf.data.datasetFromTensorSlices(testLabelsTensor).map(_.toLong)

    val evalTestData = testImages.zip(testLabels).batch(10).prefetch(10)

    val input = Input(FLOAT32, Shape(-1, 28, 28))
    val trainInput = Input(INT64, Shape(-1))

    val weightsInitializer = GlorotNormalInitializer()
    val biasInitializer = ZerosInitializer

    val network = Flatten[Float]("Input/Flatten") >>
      Linear[Float]("Layer_0/Linear", 512, weightsInitializer = weightsInitializer, biasInitializer = biasInitializer) >>
      ReLU[Float]("Layer_0/Activation") >>
      Dropout[Float]("Dropout", 0.5f) >>
      Linear[Float]("OutputLayer/Linear", 10, weightsInitializer = weightsInitializer, biasInitializer = biasInitializer)

    val loss = SparseSoftmaxCrossEntropy[Float, Long, Float]("Loss/SparseSoftmaxCrossEntropy") >> Mean("Loss/Mean")
    val optimizer = AdaGrad(0.01f)

    val model = Model.simpleSupervised(input, trainInput, network, loss, optimizer)

    val accMetric = tf.metrics.MapMetric(
      (v: (Output[Float], (Output[Float], Output[Long]))) => (tf.argmax(v._1, -1, INT64).toFloat, v._2._2.toFloat), tf.metrics.Accuracy("Accuracy"))

    val estimator = InMemoryEstimator(
      modelFunction = model
    )

    estimator.evaluate(() => evalTestData, Seq(accMetric))
  }
}

I have tested this on macOS and Linux.

eaplatanios commented 5 years ago

The problem is that, when you use the InMemoryEstimator rather than the FileBasedEstimator, every evaluation metric you intend to use must be provided in the estimator constructor, as follows:

val estimator = InMemoryEstimator(
  modelFunction = model,
  evaluationMetrics = Seq(accMetric))

estimator.evaluate(() => evalTestData, Seq(accMetric), saveSummaries = false)

I'm sorry for not documenting this properly or making it more explicit in the API. I can see how it causes confusion, and I will try to change the API so that this requirement is more explicit.

Also note the saveSummaries = false above. This is necessary because you have not provided a summaries directory.

eaplatanios commented 5 years ago

Please let me know if the issue persists and I'll reopen it.