@ESI-SYD could you try to make the issue more specific to a type of issue? For instance yesterday you filed one issue per model which is typically OK, but since they each have the same stack trace I collapsed them into one pytorch/torchdynamo#1975. For this issue it'd be good to understand if it's related to pytorch/torchdynamo#1975 or if there are other models also failing. I would try to organize the issues around a common type of error message or stacktrace pattern, and if they seem substantially different, file separate issues. For this one, if you could add specifics of the failures that'd be great.
Thanks @wconstab. We will make the issue reporting more specific and informative.
This issue is caused by https://github.com/pytorch/pytorch/pull/90039. Here is a simple test case that reproduces it:
import math
from typing import Optional

import torch
import torch._dynamo
from transformers import AutoConfig, AutoModelForMaskedLM

# Enable Inductor debug output for easier triage (imported under an alias so it
# is not shadowed by the Hugging Face model config below).
from torch._inductor import config as inductor_config
inductor_config.debug = True

#torch.manual_seed(2020)
#torch._dynamo.config.verbose = True
#torch._dynamo.config.suppress_errors = True

config = AutoConfig.from_pretrained("albert-base-v2")
input_size = (8, 512)
model = AutoModelForMaskedLM.from_config(config).eval()
input_ids = torch.randint(0, config.vocab_size, input_size)
decoder_ids = torch.randint(0, config.vocab_size, input_size)
#inputs = {"input_ids": input_ids, "labels": decoder_ids}
inputs = {"input_ids": input_ids, "hidden_states": torch.randn(8, 512, 768)}


class Model(torch.nn.Module):
    """Minimal wrapper around a single ALBERT self-attention block."""

    def __init__(self):
        super().__init__()
        self.attention = model.albert.encoder.albert_layer_groups[0].albert_layers[0].attention

    def forward(
        self,
        input_ids: Optional[torch.LongTensor] = None,
        attention_mask: Optional[torch.FloatTensor] = None,
        hidden_states: Optional[torch.FloatTensor] = None,
    ):
        if input_ids is not None:
            input_shape = input_ids.size()
        else:
            raise ValueError("You have to specify either input_ids or inputs_embeds")
        if attention_mask is None:
            attention_mask = torch.ones(input_shape)
        # Broadcast the mask to [batch, 1, 1, seq_len] and convert it to additive form.
        extended_attention_mask = attention_mask.unsqueeze(1).unsqueeze(2)
        extended_attention_mask = extended_attention_mask  # fp16 compatibility
        extended_attention_mask = (1.0 - extended_attention_mask) * torch.finfo(torch.float32).min
        mixed_query_layer = self.attention.query(hidden_states)
        mixed_key_layer = self.attention.key(hidden_states)
        query_layer = self.attention.transpose_for_scores(mixed_query_layer)
        key_layer = self.attention.transpose_for_scores(mixed_key_layer)
        # Take the dot product between "query" and "key" to get the raw attention scores.
        attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
        attention_scores = attention_scores / math.sqrt(self.attention.attention_head_size)
        attention_scores = attention_scores + extended_attention_mask
        return attention_scores


ref_model = Model().eval()
# Eager reference run.
ref = ref_model(**inputs)

with torch.no_grad():
    opt_model = torch._dynamo.optimize("inductor")(ref_model)
with torch.no_grad():
    for i in range(2):
        y1 = opt_model(**inputs)
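As a follow-up sanity check (not part of the original repro), once the compiled run completes, the Inductor output can be compared against the eager reference computed above by appending a few lines to the script; the tolerances here are arbitrary and only illustrative:

# Hedged sanity check appended to the repro above: compare the Inductor output
# (y1) with the eager reference (ref). Tolerances are arbitrary.
if not torch.allclose(ref, y1, rtol=1e-4, atol=1e-4):
    print("Inductor output diverges from the eager reference")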
Thanks for your suggestion, @wconstab. This issue is not related to pytorch/torchdynamo#1975; I have edited this issue to add more information.
In the WW50.4 TorchInductor CPU Performance Dashboard, we observed a low pass rate in the model benchmarks.
SW information
Error log (a similar error is seen in several models; attention_is_all_you_need_pytorch is taken as an example here):
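For reference, a single model from the dashboard run can usually be reproduced locally with the TorchDynamo benchmark runner in the PyTorch repository; the command below is indicative only, and the exact flags may differ between versions:

python benchmarks/dynamo/torchbench.py --performance --float32 -dcpu --inductor --only attention_is_all_you_need_pytorch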