Converting tensor to boolean in Whisper model conversion to onnx raises TraceWarning

When converting a HuggingFace Whisper model to ONNX, I got the following warnings:

/usr/local/lib/python3.8/dist-packages/transformers/models/whisper/modeling_whisper.py:207: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_weights.size() != (bsz * self.num_heads, tgt_len, src_len):
/usr/local/lib/python3.8/dist-packages/transformers/models/whisper/modeling_whisper.py:246: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_output.size() != (bsz * self.num_heads, tgt_len, self.head_dim):
/usr/local/lib/python3.8/dist-packages/transformers/models/whisper/modeling_whisper.py:756: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if input_shape[-1] > 1:
/usr/local/lib/python3.8/dist-packages/transformers/models/whisper/modeling_whisper.py:74: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  mask = torch.full((tgt_len, tgt_len), torch.tensor(torch.finfo(dtype).min))
/usr/local/lib/python3.8/dist-packages/transformers/models/whisper/modeling_whisper.py:214: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attention_mask.size() != (bsz, 1, tgt_len, src_len):

Does this hurt performance? And do you know how to resolve it?

Hi @hannan72! As I told you in this issue, these warnings don't affect the performance of the model.

They mean that the conditions in the if statements (for example attn_weights.size() != (bsz * self.num_heads, tgt_len, src_len) in the first warning) are evaluated once while tracing and recorded as constants, instead of being re-evaluated at runtime. This may lead to an error if some of the parameters involved in these conditions change (for example, bsz (batch size) could take a different value).
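To make the "seen as constants" point concrete, here is a minimal sketch with a toy function rather than Whisper itself. The function name double_if_long and the input shapes are made up for illustration; only torch.jit.trace is the real API. Tracing with a length-3 input freezes the outcome of the shape check, so the traced function keeps taking the same branch even when the input length changes:

```python
import torch


def double_if_long(x):
    # Shape-dependent Python branch, similar in spirit to
    # `if input_shape[-1] > 1:` in modeling_whisper.py.
    if x.shape[0] > 1:
        return x * 2
    return x


# Tracing runs the function once with this example input. The condition is
# True here, so only the `x * 2` branch is recorded (a TracerWarning is
# emitted at this point).
traced = torch.jit.trace(double_if_long, torch.ones(3))

short = torch.ones(1)
eager_out = double_if_long(short)  # eager mode re-evaluates the condition
traced_out = traced(short)         # traced graph replays the recorded branch

print("eager :", eager_out)
print("traced:", traced_out)
```

For the length-1 input, eager mode returns the input unchanged while the traced function still doubles it. As long as the inputs at inference time take the same branches as the example input used for tracing, the traced graph and the original model agree.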

If you want to remove them, you would need to either use scripting, which requires modifying the Whisper modeling file, or generate an ONNX graph with the If operator. Both options require quite a lot of work for probably no speedup.
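For reference, this is roughly what the scripting route looks like on a toy function (again a hypothetical double_if_long, not the Whisper code, which would need changes to be scriptable). With torch.jit.script the branch is compiled as real control flow instead of being frozen at trace time:

```python
import torch


@torch.jit.script
def double_if_long(x: torch.Tensor) -> torch.Tensor:
    # Under scripting, this condition is compiled into the graph and
    # evaluated on every call.
    if x.shape[0] > 1:
        out = x * 2
    else:
        out = x
    return out


long_out = double_if_long(torch.ones(3))   # condition True: doubled
short_out = double_if_long(torch.ones(1))  # condition False: unchanged

# The TorchScript graph keeps the branch as a prim::If node.
has_if_node = "prim::If" in str(double_if_long.graph)

print(long_out, short_out, has_if_node)
```

A scripted branch like this is what would end up as an If operator when exported to ONNX, which is the second option mentioned above.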

huggingface / optimum

Converting tensor to boolean in Whisper model conversion to onnx raises TraceWarning #878