运行export_llama.py显示 if output_attentions:就停止运行了

你好，我在运行第一步导出onnx的时候，出现了下面的警告，并且到 if output_attentions就停止运行了，我想知道是不是torch版本的影响，如果可以的话，能不能分享一下导出的onnx文件，非常感谢！
(ascend2) usr@il003:/mnt/nvme1/usr/ascend-llm/export_llama$ python export_llama.py 
/home/usr/anaconda3/envs/ascend2/lib/python3.9/site-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/home/usr/anaconda3/envs/ascend2/lib/python3.9/site-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/home/usr/anaconda3/envs/ascend2/lib/python3.9/site-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/home/usr/anaconda3/envs/ascend2/lib/python3.9/site-packages/transformers/modeling_attn_mask_utils.py:94: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if (input_shape[-1] > 1 or self.sliding_window is not None) and self.is_causal:
/home/usr/anaconda3/envs/ascend2/lib/python3.9/site-packages/transformers/modeling_attn_mask_utils.py:137: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if past_key_values_length > 0:
/home/usr/anaconda3/envs/ascend2/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py:912: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  all_self_attns = () if output_attentions else None
/home/usr/anaconda3/envs/ascend2/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py:913: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  next_decoder_cache = [] if use_cache else None
/home/usr/anaconda3/envs/ascend2/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py:384: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  out_key_value = (key_states, value_states) if use_cache else None
/home/usr/anaconda3/envs/ascend2/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py:397: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_weights.size() != (bsz, self.num_heads, q_len, kv_seq_len):
/home/usr/anaconda3/envs/ascend2/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py:404: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attention_mask.size() != (bsz, 1, q_len, kv_seq_len):
/home/usr/anaconda3/envs/ascend2/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py:414: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_output.size() != (bsz, self.num_heads, q_len, self.head_dim):
/home/usr/anaconda3/envs/ascend2/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py:431: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if not output_attentions:
/home/usr/anaconda3/envs/ascend2/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py:696: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if output_attentions:
/home/usr/anaconda3/envs/ascend2/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py:699: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if use_cache:
/home/usr/anaconda3/envs/ascend2/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py:943: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if use_cache:
/home/usr/anaconda3/envs/ascend2/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py:944: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  key_values= layer_outputs[2 if output_attentions else 1]
/home/usr/anaconda3/envs/ascend2/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py:948: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  next_decoder_cache.extend(layer_outputs[2 if output_attentions else 1])
/home/usr/anaconda3/envs/ascend2/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py:950: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if output_attentions:
/home/usr/anaconda3/envs/ascend2/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py:959: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  next_cache = torch.concat(next_decoder_cache).reshape(len(self.layers),2,*next_decoder_cache[0].shape) if use_cache else None
/home/usr/anaconda3/envs/ascend2/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py:960: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if output_attentions:
yinghuo302 / ascend-llm

运行export_llama.py显示 if output_attentions:就停止运行了 #2