sky712345678 opened 2 years ago
@gadagashwini I was able to replicate the issue on Colab; please find the gist here. Thank you!
Hi @sky712345678,
`W tensorflow/core/common_runtime/forward_type_inference.cc:231] Type inference failed.` is just a warning; you can safely ignore it. The given code executed without any error message. Thank you!
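If you want to hide it, one common option is to raise the C++ log threshold before importing TensorFlow (note this silences all INFO and WARNING messages, not just this one):

```python
import os

# "0" shows everything; "1" hides INFO; "2" hides INFO and WARNING;
# "3" hides everything except FATAL. Must be set before importing TensorFlow.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"

import tensorflow as tf  # the type-inference warning is no longer printed
```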
@gadagashwini What's the point of a warning if the response is simply "you can safely ignore it"? It's clearly there for a reason.
This issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Thank you.
@gadagashwini can you talk a little bit more about the reason why we can safely ignore it? Thank you!
@sky712345678 This looks like an issue from TensorFlow. Can you please create this issue in tensorflow/tensorflow? Thank you!
@gowthamkpr Well, the problem was first reported in tensorflow as https://github.com/tensorflow/tensorflow/issues/57052, but the guys there told the reporter to instead post an issue here.
If you know why it is a TensorFlow issue, could you please provide details that we can give to the TF guys?
This issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Thank you.
@gowthamkpr The issue was originally a TF issue, but we were redirected to post it here. If you know any more details (why it is a TF issue and not a Keras one), could you please provide them so we can pass them to the TF guys? Thanks!
The issue is at the level of the `dice_loss`. Can you try producing a reproduction script that only involves the loss function? Maybe just try to backprop through the loss function and see what happens.
I think this should be reproducible without involving any Keras logic, at which point the TF folks will definitely look at it. But anyway, as said before, this is just a warning, not something critical. You can ignore it.
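Something along these lines, for example (a minimal sketch; the `dice_loss` below is a hypothetical stand-in, since the original implementation isn't shown in this thread):

```python
import tensorflow as tf

# Hypothetical stand-in for the dice loss; substitute the notebook's version.
def dice_loss(y_true, y_pred, smooth=1e-6):
    y_true = tf.cast(y_true, tf.float32)
    intersection = tf.reduce_sum(y_true * y_pred)
    denom = tf.reduce_sum(y_true) + tf.reduce_sum(y_pred)
    return 1.0 - (2.0 * intersection + smooth) / (denom + smooth)

y_true = tf.constant([[0.0, 1.0], [1.0, 0.0]])
y_pred = tf.Variable([[0.1, 0.9], [0.7, 0.3]])

# Trace the backward pass in graph mode, which is where type inference runs.
@tf.function
def step():
    with tf.GradientTape() as tape:
        loss = dice_loss(y_true, y_pred)
    return loss, tape.gradient(loss, y_pred)

loss, grads = step()
print(loss.numpy(), grads.numpy())
```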
Ok, I got it. Thank you!
I wasn't sure how to reproduce it involving only the loss function; here is my attempt: https://colab.research.google.com/drive/1qxamrOaOqfVANzMnN-u--Sue4iPtJCtf?usp=sharing Running this Colab notebook, I didn't see the error message in the runtime logs.
Hi, can you share the solution to overcome this issue? I have met a similar problem. Thank you so much!
I'm getting the same warning with TF 2.11 when I set `mask_zero=True` in the embedding layer.
+1; I'm also getting the same warning with TF 2.11 and setting `mask_zero=True` in the embedding layer while training on GPU. After the warning, the model keeps training and is then saved, but the saved model can't be loaded using `keras.models.load_model`.

However, when I'm training on CPU (even with `mask_zero=True`) everything works fine and the warning doesn't show up; the model is trained, saved, and can be loaded and used again without encountering any problem.
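For reference, a minimal sketch of the reported configuration (vocabulary size, layer sizes, and data are placeholders):

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Toy model in the reported configuration: Embedding with mask_zero=True
# feeding an LSTM; sizes here are arbitrary placeholders.
model = keras.Sequential([
    keras.layers.Embedding(input_dim=1000, output_dim=16, mask_zero=True),
    keras.layers.LSTM(32),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

x = np.random.randint(0, 1000, size=(64, 20))
y = np.random.randint(0, 2, size=(64, 1))
model.fit(x, y, epochs=1, verbose=0)   # the warning reportedly appears here on GPU

model.save("masked_model")                           # training and saving succeed
restored = keras.models.load_model("masked_model")   # this is the step that reportedly fails
```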
I'm getting something very similar but with pure TF 2.11 on Mac M1. So I really think this is a pure TF issue, and we should reopen the TF issue (https://github.com/tensorflow/tensorflow/issues/57052).
I have the same issue, unfortunately. Currently running with `mask_zero` set to `True` on CPU without issue.
> Hi @sky712345678, `W tensorflow/core/common_runtime/forward_type_inference.cc:231] Type inference failed.` is just a warning, you can safely ignore it. Given code executed without any error message. Thank you!

Nope, because execution time increases eightfold!
Hi, on TF 2.13.0 I get this warning as well when training a simple encoder-decoder EN-ES translation model with an LSTM accepting embedded strings with `mask_zero=True`:

```
2023-10-07 11:39:56.995271: W tensorflow/core/common_runtime/type_inference.cc:339] Type inference failed. This indicates an invalid graph that escaped type checking. Error message: INVALID_ARGUMENT: expected compatible input types, but input 1: type_id: TFT_OPTIONAL args { type_id: TFT_PRODUCT args { type_id: TFT_TENSOR args { type_id: TFT_INT32 } } } is neither a subtype nor a supertype of the combined inputs preceding it: type_id: TFT_OPTIONAL args { type_id: TFT_PRODUCT args { type_id: TFT_TENSOR args { type_id: TFT_FLOAT } } }
for Tuple type infernce function 0
while inferring type of node 'cond_40/output/_23'
```
The model trains, but when I wanted to use `jit_compile=True`, `fit()` breaks with:

```
2023-10-07 11:46:15.327751: W tensorflow/core/framework/op_kernel.cc:1828] OP_REQUIRES failed at xla_ops.cc:444 : INVALID_ARGUMENT: Trying to access resource 7590 located in device /job:localhost/replica:0/task:0/device:CPU:0 from device /job:localhost/replica:0/task:0/device:GPU:0
```

Someone on StackExchange suggested that this JIT failure might be caused by TF creating something as INT32 instead of FLOAT32 and consequently placing some variables on the CPU, which seems to be linked to the above-mentioned error.
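For context, this is the shape of the setup that triggers it (a sketch with placeholder sizes, standing in for the actual encoder-decoder; `jit_compile=True` in `compile()` is what turns on XLA for the train step):

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Hypothetical stand-in for the translation model; any masked-LSTM model
# whose graph contains the offending cond nodes should behave similarly.
model = keras.Sequential([
    keras.layers.Embedding(input_dim=5000, output_dim=64, mask_zero=True),
    keras.layers.LSTM(64),
    keras.layers.Dense(5000, activation="softmax"),
])
# jit_compile=True asks Keras to run the train step under XLA.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              jit_compile=True)

x = np.random.randint(1, 5000, size=(32, 12))
y = np.random.randint(0, 5000, size=(32,))
model.fit(x, y, epochs=1, verbose=0)   # reportedly raises INVALID_ARGUMENT on GPU
```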
Still getting this error; has there been any update?
```
Epoch 1/20
2024-06-18 14:04:10.665333: W tensorflow/core/common_runtime/type_inference.cc:339] Type inference failed. This indicates an invalid graph that escaped type checking. Error message: INVALID_ARGUMENT: expected compatible input types, but input 1:
type_id: TFT_OPTIONAL
args {
  type_id: TFT_PRODUCT
  args {
    type_id: TFT_TENSOR
    args {
      type_id: TFT_INT32
    }
  }
}
is neither a subtype nor a supertype of the combined inputs preceding it:
type_id: TFT_OPTIONAL
args {
  type_id: TFT_PRODUCT
  args {
    type_id: TFT_TENSOR
    args {
      type_id: TFT_FLOAT
    }
  }
}
while inferring type of node 'cond_42/output/_24'
2024-06-18 14:04:10.835800: I tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:630] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
```
I have solved a similar error in my code; here's how I did it.

I think the problem arises when using `@tf.function`, or any function with a data-dependent condition, while running a TF graph (in my case, during the `model.fit()` method). The warning says that an invalid graph escaped type checking. When you write an if-else statement inside a `@tf.function`, AutoGraph converts it into `tf.cond()`. During `model.fit()`, TensorFlow emits this warning when `elif` is used; replacing the `elif` branches with plain `if` statements made the warning go away for me.
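For illustration, here is a minimal sketch of that conversion (the function and names below are made up for demonstration):

```python
import tensorflow as tf

@tf.function
def branchy(x):
    # This data-dependent `if` is rewritten by AutoGraph into a tf.cond() node
    # in the traced graph, because the condition is a tensor, not a Python bool.
    if tf.reduce_max(x) > 1.0:
        x = x / 255.0
    return x

# Inspect the functional tf.cond form that AutoGraph generates:
print(tf.autograph.to_code(branchy.python_function))
```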
Here is the implementation before the fix; it was used in a loss function that was passed to `model.compile()` and later run by the `model.fit()` method:
```python
import tensorflow as tf

class RescaleImage():
    def __init__(self) -> None:
        super().__init__()

    @tf.function
    def normalize(self, x: tf.Tensor, min_val: float = 0.0, max_val: float = 1.0) -> tf.Tensor:
        min_val = tf.cast(min_val, tf.float32)
        max_val = tf.cast(max_val, tf.float32)
        # Data-dependent branches: AutoGraph turns each if/elif into tf.cond().
        if tf.reduce_max(x) > 1.0 and tf.reduce_min(x) >= 0.0:        # input looks like [0, 255]
            if min_val == 0.0 and max_val == 1.0:
                x = x / 255.0
            elif min_val == -1.0 and max_val == 1.0:
                x = (x - 127.5) / 127.5
        elif tf.reduce_max(x) <= 1.0 and tf.reduce_min(x) >= -1.0 and tf.reduce_min(x) < 0.0:  # input looks like [-1, 1]
            if min_val == 0.0 and max_val == 1.0:
                x = (x + 1.0) / 2.0
            elif min_val == 0.0 and max_val == 255.0:
                x = (x + 1.0) * 255.0 / 2.0
        elif tf.reduce_max(x) <= 1.0 and tf.reduce_min(x) >= 0.0:     # input looks like [0, 1]
            if min_val == -1.0 and max_val == 1.0:
                x = (x - 0.5) / 0.5
            elif min_val == 0.0 and max_val == 255.0:
                x = x * 255.0
        return x

    @tf.function
    def normalize_individual(self, x: tf.Tensor, min_val: float = 0.0, max_val: float = 1.0) -> tf.Tensor:
        min_val = tf.cast(min_val, tf.float32)
        max_val = tf.cast(max_val, tf.float32)
        if tf.reduce_max(x) > 1.0 and tf.reduce_min(x) >= 0.0:
            # Rescale linearly from the tensor's own range to [min_val, max_val].
            factor = (max_val - min_val) / (tf.math.reduce_max(x) - tf.math.reduce_min(x))
            x = factor * (x - tf.math.reduce_min(x)) + min_val
        elif tf.reduce_max(x) <= 1.0 and tf.reduce_min(x) >= -1.0 and tf.reduce_min(x) < 0.0:
            if min_val == 0.0 and max_val == 1.0:
                x = (x + 1.0) / 2.0
            elif min_val == 0.0 and max_val == 255.0:
                x = (x + 1.0) * 255.0 / 2.0
        elif tf.reduce_max(x) <= 1.0 and tf.reduce_min(x) >= 0.0:
            if min_val == -1.0 and max_val == 1.0:
                x = (x - 0.5) / 0.5
            elif min_val == 0.0 and max_val == 255.0:
                x = x * 255.0
        return x
```
Code after solving the error (using plain `if` statements instead of `elif`):

```python
import tensorflow as tf

class RescaleImage():
    def __init__(self) -> None:
        super().__init__()

    @tf.function
    def normalize(self, x: tf.Tensor, min_val: float = 0.0, max_val: float = 1.0) -> tf.Tensor:
        min_val = tf.cast(min_val, tf.float32)
        max_val = tf.cast(max_val, tf.float32)
        if tf.reduce_max(x) > 1.0 and tf.reduce_min(x) >= 0.0:
            if min_val == 0.0 and max_val == 1.0:
                x = x / 255.0
            if min_val == -1.0 and max_val == 1.0:
                x = (x - 127.5) / 127.5
        if tf.reduce_max(x) <= 1.0 and tf.reduce_min(x) >= -1.0 and tf.reduce_min(x) < 0.0:
            if min_val == 0.0 and max_val == 1.0:
                x = (x + 1.0) / 2.0
            if min_val == 0.0 and max_val == 255.0:
                x = (x + 1.0) * 255.0 / 2.0
        if tf.reduce_max(x) <= 1.0 and tf.reduce_min(x) >= 0.0:
            if min_val == -1.0 and max_val == 1.0:
                x = (x - 0.5) / 0.5
            if min_val == 0.0 and max_val == 255.0:
                x = x * 255.0
        return x

    @tf.function
    def normalize_individual(self, x: tf.Tensor, min_val: float = 0.0, max_val: float = 1.0) -> tf.Tensor:
        min_val = tf.cast(min_val, tf.float32)
        max_val = tf.cast(max_val, tf.float32)
        if tf.reduce_max(x) > 1.0 and tf.reduce_min(x) >= 0.0:
            factor = (max_val - min_val) / (tf.math.reduce_max(x) - tf.math.reduce_min(x))
            x = factor * (x - tf.math.reduce_min(x)) + min_val
        if tf.reduce_max(x) <= 1.0 and tf.reduce_min(x) >= -1.0 and tf.reduce_min(x) < 0.0:
            if min_val == 0.0 and max_val == 1.0:
                x = (x + 1.0) / 2.0
            if min_val == 0.0 and max_val == 255.0:
                x = (x + 1.0) * 255.0 / 2.0
        if tf.reduce_max(x) <= 1.0 and tf.reduce_min(x) >= 0.0:
            if min_val == -1.0 and max_val == 1.0:
                x = (x - 0.5) / 0.5
            if min_val == 0.0 and max_val == 255.0:
                x = x * 255.0
        return x
```
**System information.**

**Describe the problem.** (Continuing the issue from tensorflow/tensorflow#57052.) I got a `Type inference failed` error when running `tf.keras.Model.fit()` in TensorFlow 2.9 and Keras 2.9. I didn't see this kind of error in version 2.8 with identical code. Although the program didn't crash, I'm afraid that there may be some error in the trained model.

**Describe the current behavior.** Run `tf.keras.Model.fit()` and the error `Type inference failed` shows up.

**Describe the expected behavior.** The error shouldn't show up.

**Contributing.**

**Standalone code to reproduce the issue.** Link to notebook: https://drive.google.com/file/d/1k78lpGVthB7nthEkYgUs3JNJTuR79r5E/view?usp=sharing To reproduce:
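A minimal sketch of this kind of setup follows (the model and `dice_loss` below are hypothetical stand-ins, not the notebook's exact code):

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Hypothetical stand-in for the notebook's custom loss; not the original code.
def dice_loss(y_true, y_pred, smooth=1e-6):
    y_true = tf.cast(y_true, tf.float32)
    intersection = tf.reduce_sum(y_true * y_pred, axis=-1)
    denom = tf.reduce_sum(y_true, axis=-1) + tf.reduce_sum(y_pred, axis=-1)
    return 1.0 - (2.0 * intersection + smooth) / (denom + smooth)

model = keras.Sequential([
    keras.layers.Dense(16, activation="relu", input_shape=(8,)),
    keras.layers.Dense(4, activation="sigmoid"),
])
model.compile(optimizer="adam", loss=dice_loss)

x = np.random.rand(32, 8).astype("float32")
y = np.random.randint(0, 2, size=(32, 4)).astype("float32")
model.fit(x, y, epochs=1, verbose=0)   # the warning is printed during graph tracing
```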
**Source code / logs.**