eaplatanios / tensorflow_scala

TensorFlow API for the Scala Programming Language
http://platanios.org/tensorflow_scala/
Apache License 2.0
936 stars 95 forks

Dropout Operation erroring #83

Closed (mandar2812 closed 6 years ago)

mandar2812 commented 6 years ago

It's possible that the dropout layer stopped working sometime after commit e0691f8e2aed5f796d642a04922e59c3f857f5af (tested so far on GPU).

2018-02-19 17:22:16.193 [main] INFO  TensorFlow Native - Extracting the 'tensorflow_framework' native library to /tmp/tensorflow_scala_native_libraries9058879363418438272/libtensorflow_framework.so.
2018-02-19 17:22:16.353 [main] INFO  TensorFlow Native - Copied 16846624 bytes to /tmp/tensorflow_scala_native_libraries9058879363418438272/libtensorflow_framework.so.
2018-02-19 17:22:16.355 [main] INFO  TensorFlow Native - Extracting the 'tensorflow' native library to /tmp/tensorflow_scala_native_libraries9058879363418438272/libtensorflow.so.
2018-02-19 17:22:17.296 [main] INFO  TensorFlow Native - Copied 136133504 bytes to /tmp/tensorflow_scala_native_libraries9058879363418438272/libtensorflow.so.
2018-02-19 17:22:17.301 [main] INFO  TensorFlow Native - Extracting the 'tensorflow_jni' native library to /tmp/tensorflow_scala_native_libraries9058879363418438272/libtensorflow_jni.so.
2018-02-19 17:22:17.307 [main] INFO  TensorFlow Native - Copied 645872 bytes to /tmp/tensorflow_scala_native_libraries9058879363418438272/libtensorflow_jni.so.
2018-02-19 17:22:17.409 [main] INFO  TensorFlow Native - Extracting the 'tensorflow_ops' native library to /tmp/tensorflow_scala_native_libraries9058879363418438272/libtensorflow_ops.so.
2018-02-19 17:22:17.411 [main] INFO  TensorFlow Native - Copied 78232 bytes to /tmp/tensorflow_scala_native_libraries9058879363418438272/libtensorflow_ops.so.
2018-02-19 17:22:17.422299: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-02-19 17:22:17.733414: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1331] Found device 0 with properties: 
name: TITAN X (Pascal) major: 6 minor: 1 memoryClockRate(GHz): 1.531
pciBusID: 0000:03:00.0
totalMemory: 11.90GiB freeMemory: 11.70GiB
2018-02-19 17:22:17.733465: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1410] Adding visible gpu devices: 0
2018-02-19 17:22:24.099338: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-02-19 17:22:24.099406: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917]      0 
2018-02-19 17:22:24.099421: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0:   N 
2018-02-19 17:22:24.099963: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1021] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 11331 MB memory) -> physical GPU (device: 0, name: TITAN X (Pascal), pci bus id: 0000:03:00.0, compute capability: 6.1)
2018-02-19 17:22:24.327737: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1410] Adding visible gpu devices: 0
2018-02-19 17:22:24.327799: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-02-19 17:22:24.327818: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917]      0 
2018-02-19 17:22:24.327839: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0:   N 
2018-02-19 17:22:24.328044: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1021] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 370 MB memory) -> physical GPU (device: 0, name: TITAN X (Pascal), pci bus id: 0000:03:00.0, compute capability: 6.1)
java.lang.NoSuchMethodError: org.platanios.tensorflow.api.learn.layers.Dropout$.apply$default$3()Lorg/platanios/tensorflow/api/core/Shape;
  io.github.mandar2812.dynaml.tensorflow.package$dtflearn$.conv2d_unit(package.scala:335)
  io.github.mandar2812.PlasmaML.helios.core.Arch$.<init>(Arch.scala:20)
  io.github.mandar2812.PlasmaML.helios.core.Arch$.<clinit>(Arch.scala)
  io.github.mandar2812.PlasmaML.helios.package$learn$.<init>(package.scala:32)
  io.github.mandar2812.PlasmaML.helios.package$learn$.<clinit>(package.scala)
  ammonite.$sess.cmd0$.main(cmd0.sc:52)
  ammonite.$sess.cmd1$.<init>(cmd1.sc:1)
  ammonite.$sess.cmd1$.<clinit>(cmd1.sc)

eaplatanios commented 6 years ago

@mandar2812 Did you notice that the arguments to that class constructor changed? scaleOutput was added, and I think you may not be accounting for that change where you call the constructor.

mandar2812 commented 6 years ago

@eaplatanios I think this error is happening because the binary version of tensorflow-scala used in my code differs from the version that DynaML was compiled against. Closing this issue for now.
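
For anyone hitting the same NoSuchMethodError: the Scala compiler turns each default argument into a synthetic getter on the companion object, named apply$default$N after the parameter's position. Code compiled against an older version of a library keeps calling the getter at the old position with the old return type, so adding a parameter can break it at runtime even though the source still compiles. The sketch below illustrates this with two hypothetical stand-in classes. DropoutV1 and DropoutV2 are invented for illustration and are not the real tensorflow_scala Dropout signatures; in particular, the position of scaleOutput in the parameter list is an assumption.

```scala
// Hypothetical "before" signature: the third parameter has a Seq default.
case class DropoutV1(
    name: String,
    keepProbability: Float,
    noiseShape: Seq[Int] = Seq.empty)

// Hypothetical "after" signature: scaleOutput is inserted as the third
// parameter (assumed position), pushing noiseShape to the fourth slot.
case class DropoutV2(
    name: String,
    keepProbability: Float,
    scaleOutput: Boolean = true,
    noiseShape: Seq[Int] = Seq.empty)

object DefaultArgDemo {
  // Lists the synthetic default-argument getters that the compiler generated
  // on a companion object, mapped to the simple name of their return type.
  def defaultGetters(companion: AnyRef): Map[String, String] =
    companion.getClass.getMethods
      .filter(_.getName.startsWith("apply$default$"))
      .map(m => m.getName -> m.getReturnType.getSimpleName)
      .toMap

  def main(args: Array[String]): Unit = {
    // A caller compiled against V1 expects apply$default$3 to return a Seq.
    println(defaultGetters(DropoutV1))
    // In V2 the getter at position 3 returns a boolean instead, so the JVM
    // cannot resolve the old method descriptor and throws NoSuchMethodError.
    println(defaultGetters(DropoutV2))
  }
}
```

If this is the cause, recompiling the dependent library (DynaML here) against the same tensorflow-scala version that is on the runtime classpath should make the error go away, which matches the conclusion above.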