Closed. lcukerd closed this issue 6 years ago.
Hello! The argmax function has no gradient. Or at least, its gradient is zero almost everywhere: nudging the inputs slightly almost never changes which index is largest, so the derivative is zero (and undefined at ties). This is not specific to Keras; it's the same in all deep learning frameworks, because it follows from the mathematical definition of argmax.
If you wish to create your own operation with a custom gradient, you need to access the backend directly and create a new op. But most of the time it's not a walk in the park. See https://www.tensorflow.org/extend/adding_an_op
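For the lighter-weight case where you just want gradients to flow *around* the argmax, recent TensorFlow versions have tf.custom_gradient. A minimal sketch (my own illustration, not from the docs linked above) of a "straight-through" argmax, where the forward pass is hard but the backward pass pretends the op was the identity:

```python
import tensorflow as tf

@tf.custom_gradient
def straight_through_argmax(logits):
    # Forward pass: hard one-hot encoding of the argmax
    one_hot = tf.one_hot(tf.argmax(logits, axis=-1),
                         depth=tf.shape(logits)[-1])

    def grad(dy):
        # Backward pass: pass the incoming gradient straight through,
        # as if the forward op had been the identity
        return dy

    return one_hot, grad
```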
Yes, I know argmax has no gradient. But the error is clearly asking me to define one for argmax. How do I fix this error then?
The error message is maybe not clear. It's saying that you should only use backend functions that have a gradient, i.e. something other than argmax. It is not saying that you should define argmax's gradient. Maybe the message is not explicit enough.
Okay. So is there any alternative to argmax that I can use (my model cannot work without one)?
Btw, why does the backend have an argmax function if we can't use it in a model?
I don't know of an alternative to argmax; I've never worked with a model requiring one.
Argmax is there for operations where the gradient is not needed, for example when computing a metric.
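For example, this is essentially how Keras's built-in categorical_accuracy metric works; metrics are only evaluated, never differentiated, so K.argmax is safe there:

```python
from keras import backend as K

def categorical_accuracy(y_true, y_pred):
    # Compare the predicted class index with the true class index;
    # no gradient is ever requested through this function
    return K.cast(K.equal(K.argmax(y_true, axis=-1),
                          K.argmax(y_pred, axis=-1)),
                  K.floatx())
```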
I suppose you can try to use the argmax from TensorFlow directly and see if you get the error. But you must know what you are doing, because if there is no error, it is implied that the gradient is zero (like tf.round).
Okay, thanks for helping me out. I'll give TensorFlow a go.
I will leave this issue open for a day and wait for someone who knows an alternative to argmax. I hope no one has a problem with this (otherwise they can close it).
Did you find any solution to this? @lcukerd
@MansiAgarwal11 Yes, I did. You will have to use Keras inside a TensorFlow model. For training, you will have to define a loss function like in this article. The model shown in the article will still work if you include argmax. You should be able to do this using only Keras, but I haven't tried yet.
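Roughly, the pattern looks like this (a sketch only, TF 1.x style, with illustrative names and shapes rather than the article's exact code): Keras layers assembled inside a plain TensorFlow graph, trained with a hand-written loss, so argmax only ever appears on the prediction path:

```python
import tensorflow as tf
from keras.layers import Dense

inputs = tf.placeholder(tf.float32, shape=(None, 10))
labels = tf.placeholder(tf.int64, shape=(None,))

# Keras layers called directly on TensorFlow tensors
hidden = Dense(32, activation='relu')(inputs)
logits = Dense(5)(hidden)

# argmax is safe here: it feeds predictions, not the loss below
predictions = tf.argmax(logits, axis=-1)

# Hand-written loss; backprop flows through the logits, never the argmax
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=labels, logits=logits))
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)
```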
But if there is no gradient for the argmax function, how does the model backpropagate?
I am not sure myself, but I think the TensorFlow code was written to bypass it in a clever way. Perhaps someone from the TensorFlow team can clear this up? Btw, did your model converge?
I didn't make use of argmax and came up with a different loss function for my problem.
FYI, in my experience with a different TensorFlow function that didn't have a gradient, I found that I could run and train the model without any errors, but because there was no gradient, there was no actual learning taking place. It's something to look out for if you try to use argmax.
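One way to check (a sketch, assuming `model` is already compiled with the Keras 2 API): ask the backend directly for the gradients of the loss with respect to the trainable weights, and look for None entries:

```python
from keras import backend as K

# Any weight listed here is cut off from backprop (e.g. by an argmax)
# and will never be updated during training
grads = K.gradients(model.total_loss, model.trainable_weights)
dead = [w.name for w, g in zip(model.trainable_weights, grads) if g is None]
print(dead)
```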
I have the same problem. Training and evaluation run without any problem, and saving the model to H5 works too. However, when loading the saved model, this error pops up:

ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.

Do you have any idea how to fix this? Otherwise, the model cannot be used for prediction. Thank you.
^ You're saying that you can train a model successfully with argmax? That surprises me. What I was trying to say in my earlier comment is that you can sometimes run training with ops that don't have a gradient and no errors will be thrown, but your model won't actually get better.
How confident are you that the model you're training is actually getting better as you train it?
I monitored the precision, recall, and accuracy while training, and the model was getting better. If the model was saved with Keras.save, then the error above appears with Keras.load_model. However, if the model was saved with Keras.model_to_json and Keras.save_weights, then everything is fine when loading the saved model.
Well, thanks for the update, but you've stumped me. I don't understand: 1) how you're training the model with argmax in the cost function, and 2) how to solve the question you're actually asking about loading the weights again.
Sorry I couldn't be more help.
Gumbel-softmax may solve the argmax problem: http://anotherdatum.com/gumbel-gan.html And this describes another way around it (though I currently get an error with it): https://stackoverflow.com/questions/46926809/getting-around-tf-argmax-which-is-not-differentiable
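For reference, the Gumbel-softmax trick from the first link boils down to something like this sketch (the temperature is an illustrative choice; lower values get closer to a hard one-hot): perturb the logits with Gumbel noise and take a softmax, which approximates argmax while staying differentiable:

```python
from keras import backend as K

def gumbel_softmax(logits, temperature=0.5, eps=1e-20):
    # Sample Gumbel(0, 1) noise: -log(-log(U)) with U ~ Uniform(0, 1)
    u = K.random_uniform(K.shape(logits), 0, 1)
    gumbel_noise = -K.log(-K.log(u + eps) + eps)
    # Softmax of perturbed logits: a soft, differentiable stand-in for
    # the one-hot argmax of `logits`
    return K.softmax((logits + gumbel_noise) / temperature)
```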
Yeah, or the SeqGAN-based idea of policy updates: https://arxiv.org/abs/1609.05473
I faced the same problem on GPU. With the runtime set to None, the problem no longer appears.
I'm facing the same issue. I defined a new layer in a Lambda, and I get:

ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.

Can anybody help with this?
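A minimal sketch of the kind of model that triggers it (shapes illustrative): the K.argmax sits on the backprop path, so the gradient computation fails at fit() time:

```python
from keras.models import Sequential
from keras.layers import Dense, Lambda
from keras import backend as K

model = Sequential([
    Dense(5, input_shape=(10,)),
    # argmax on the forward path: everything upstream loses its gradient
    Lambda(lambda x: K.cast(K.argmax(x, axis=-1), K.floatx())),
])
model.compile(optimizer='adam', loss='mse')  # compiles fine
# model.fit(...) raises: ValueError: An operation has `None` for gradient.
```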
> I monitored the precision, recall, and accuracy while training, and the model was getting better. If the model was saved with Keras.save, then the error above appears with Keras.load_model. However, if the model was saved with Keras.model_to_json and Keras.save_weights, then everything is fine when loading the saved model.
I implemented this solution and it worked for me. This is all you will need:
```python
from keras.models import model_from_json

# Save the architecture as JSON and the weights as HDF5
model_json = model.to_json()
with open("model.json", "w") as json_file:
    json_file.write(model_json)
model.save_weights("model.h5")
print("Saved model to disk")

# Rebuild the model from the JSON and reload the weights
json_file = open('model.json', 'r')
loaded_model_json = json_file.read()
json_file.close()
loaded_model = model_from_json(loaded_model_json)
loaded_model.load_weights("model.h5")
print("Loaded model from disk")
```
I am using Keras.Backend.argmax() in a Lambda layer. The model compiles fine but throws an error during fit(). My model:
Model summary for easy visualization:
I googled for a solution, but almost all the results were about a faulty model. Some recommended not using the functions that are causing issues. However, as you can see, I cannot create this model without K.argmax (if you know any other way, do tell me).
Also, how can you even define the gradient of argmax? I am guessing it's an issue in Keras; if not, please tell me how to define its gradient.