Enny1991 / PLSTM


Running Error in PLSTM: No gradient defined for operation '3/rnn/while/PhasedLSTMCell/FloorMod_1' #4

Closed wzmsltw closed 7 years ago

wzmsltw commented 7 years ago

Hi Enny1991, I am using TensorFlow version 0.12.head and I get this error: LookupError: No gradient defined for operation '3/rnn/while/PhasedLSTMCell/FloorMod_1' (op type: FloorMod). It is raised at line 202 of simplePhasedLSTM.py: grads = tf.gradients(cost, tvars). But I can't find the FloorMod op anywhere in the code. How can I solve this problem? Thanks a lot!

Enny1991 commented 7 years ago

Hi @wzmsltw, I cannot reproduce the bug. Are you simply running the script as it is? Did you modify anything in either simplePhasedLSTM.py or PhasedLSTMCell.py?

wzmsltw commented 7 years ago

Hi @Enny1991, yes, I just cloned PLSTM from GitHub and ran simplePhasedLSTM.py. Could this be caused by the TensorFlow version?

wzmsltw commented 7 years ago

@Enny1991 Here is the full Python console output:

('Compiling RNN...',)
DONE!
('Compiling cost functions...',)
DONE!
/home/wzmsltw/Document/tensorflow012/tensorflow/_python_build/tensorflow/python/ops/gradients_impl.py:91: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/wzmsltw/anaconda2/lib/python2.7/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 714, in runfile
    execfile(filename, namespace)
  File "/home/wzmsltw/anaconda2/lib/python2.7/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 81, in execfile
    builtins.execfile(filename, *where)
  File "/home/wzmsltw/Project/LSTM/Test/PLSTM/simplePhasedLSTM.py", line 279, in <module>
    tf.app.run()
  File "/home/wzmsltw/Document/tensorflow012/tensorflow/_python_build/tensorflow/python/platform/app.py", line 43, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "/home/wzmsltw/Project/LSTM/Test/PLSTM/simplePhasedLSTM.py", line 202, in main
    grads = tf.gradients(cost, tvars)
  File "/home/wzmsltw/Document/tensorflow012/tensorflow/_python_build/tensorflow/python/ops/gradients_impl.py", line 459, in gradients
    (op.name, op.type))
LookupError: No gradient defined for operation '3/rnn/while/PhasedLSTMCell/FloorMod_1' (op type: FloorMod)

Enny1991 commented 7 years ago

Yeah, I'm thinking it's a version problem, but I tested it with versions 0.11.0, 0.12.0 and 0.12.0rc1 and there are no problems. Did you build TF from source?

wzmsltw commented 7 years ago

Yes, I built TF from source because I need the GPU to accelerate training.

wzmsltw commented 7 years ago

Hi @Enny1991, I found a related problem, issues/6365, about tf.mod. But I see that you have registered the gradient for the Mod operation and that you don't use tf.floormod. So do you know why tf.floormod appears when running tf.gradients? Thanks a lot!

Enny1991 commented 7 years ago

Hi, I still can't reproduce this bug. I'm working on it. By the way, is there a reason why you need to build from source instead of using the latest stable version from pip? You can still get GPU support even if you install from pip.

wzmsltw commented 7 years ago

Hi Enny1991, I found that if I change line 46 in PhasedLSTMCell.py from @ops.RegisterGradient("Mod") to @ops.RegisterGradient("FloorMod"), the code runs without error. But I don't know why there is confusion between tf.mod and tf.floormod.

As for installation, I assumed that the GPU version could only be built from source because some tutorials told me so... I can install from pip next time.

One more question: when I use PLSTM on my own data (a sequence classification task), the loss and l2 become NaN after several epochs. I tried reducing the learning rate, but the NaNs still appear. Do you have any suggestions about this problem? Thanks~
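On the tf.mod / tf.floormod confusion: a likely explanation, not confirmed in this thread, is that TensorFlow builds around the 0.12 head emit an op of type FloorMod for tf.mod, so a gradient registered under "Mod" is never looked up. That would match both the traceback and the fact that renaming the registration works. Floored modulo can be written as x - floor(x / y) * y, so away from the jump points its derivative is 1 with respect to x and -floor(x / y) with respect to y. The sketch below checks this in plain Python. It is not the code from PhasedLSTMCell.py, and floor_mod, floor_mod_grad and numeric_grad are hypothetical helper names used only for illustration.

```python
import math


def floor_mod(x, y):
    """Floored modulo: the result takes the sign of the divisor y."""
    return x - math.floor(x / y) * y


def floor_mod_grad(x, y):
    """Partial derivatives of floor_mod(x, y), valid away from the jumps.

    d/dx = 1 and d/dy = -floor(x / y), because floor(x / y) is locally constant.
    """
    return 1.0, -math.floor(x / y)


def numeric_grad(f, x, y, h=1e-6):
    """Central finite differences, used only to sanity-check the formula."""
    dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return dx, dy


if __name__ == "__main__":
    # Floored and truncated modulo differ for negative operands.
    print(floor_mod(-1.0, 3.0), math.fmod(-1.0, 3.0))
    # Analytic and numeric gradients should agree away from discontinuities.
    print(floor_mod_grad(7.3, 2.0), numeric_grad(floor_mod, 7.3, 2.0))
```

The comparison with math.fmod is there only to show that the two modulo conventions differ for negative operands.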

Enny1991 commented 7 years ago

Hi @wzmsltw, are you using the built-in tf.nn.softmax_cross_entropy_with_logits or are you calculating your own loss? If the latter, it might be that you have to look out for rounding errors, and add eps to prevent calculating log(0). Let me know.

+Enea
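To illustrate the log(0) point above, here is a minimal plain-Python sketch, assuming a hand-written cross-entropy computed from softmax probabilities. All function names are hypothetical and this is not code from the PLSTM repository. It shows the probability of the target class underflowing to exactly 0, the eps workaround, and a version computed directly from the logits. Working from the logits is also why the built-in tf.nn.softmax_cross_entropy_with_logits tends to be safer than a hand-rolled loss.

```python
import math


def softmax(logits):
    """Softmax with the usual max-subtraction; small entries can still underflow to 0."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_naive(probs, target):
    """Hand-written loss: fails when probs[target] has underflowed to 0."""
    return -math.log(probs[target])


def cross_entropy_eps(probs, target, eps=1e-8):
    """Same loss with a small eps (an assumed value) added inside the log."""
    return -math.log(probs[target] + eps)


def cross_entropy_from_logits(logits, target):
    """Log-sum-exp form: never takes the log of an underflowed probability."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[target]


if __name__ == "__main__":
    logits = [1000.0, 0.0]  # extreme on purpose, to force underflow
    probs = softmax(logits)
    try:
        cross_entropy_naive(probs, 1)
    except ValueError as err:
        # Plain Python raises here; a framework would typically give inf or NaN.
        print("naive loss failed:", err)
    print(cross_entropy_eps(probs, 1))
    print(cross_entropy_from_logits(logits, 1))
```

Note that eps caps the loss rather than recovering its true value, so the logits-based form is usually preferable when the logits are available.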

Enny1991 commented 7 years ago

@wzmsltw Can I consider this closed?