I ran the code in "gts", but I found a mistake when using the RoBERTa encoder in "contextual_embeddings.py".
```python
import torch
import torch.nn as nn
from transformers import RobertaModel, RobertaTokenizer


class RobertaEncoder(nn.Module):
    def __init__(self, roberta_model='roberta-base', device='cuda:0', freeze_roberta=False):
        super(RobertaEncoder, self).__init__()
        self.roberta_layer = RobertaModel.from_pretrained(roberta_model)
        self.roberta_tokenizer = RobertaTokenizer.from_pretrained(roberta_model)
        self.device = device
        if freeze_roberta:
            for p in self.roberta_layer.parameters():
                p.requires_grad = False

    def robertify_input(self, sentences):
        '''
        Preprocess the input sentences with the RoBERTa tokenizer and convert them to a torch tensor of token ids.
        '''
        # Tokenize the input sentences for feeding into RoBERTa
        all_tokens = [['<s>'] + self.roberta_tokenizer.tokenize(sentence) + ['</s>'] for sentence in sentences]
        # Pad all the sentences to the maximum length in the batch
        input_lengths = [len(tokens) for tokens in all_tokens]
        max_length = max(input_lengths)
        padded_tokens = [tokens + ['<pad>' for _ in range(max_length - len(tokens))] for tokens in all_tokens]
        # Convert tokens to token ids
        token_ids = torch.tensor([self.roberta_tokenizer.convert_tokens_to_ids(tokens) for tokens in padded_tokens]).to(self.device)
        # Obtain attention masks
        pad_token = self.roberta_tokenizer.convert_tokens_to_ids('<pad>')
        attn_masks = (token_ids != pad_token).long()
        return token_ids, attn_masks, input_lengths

    def forward(self, sentences):
        '''
        Feed the batch of sentences to a RoBERTa encoder to obtain contextualized representations of each token.
        '''
        # Preprocess sentences
        token_ids, attn_masks, input_lengths = self.robertify_input(sentences)
        # Feed through RoBERTa
        cont_reps, _ = self.roberta_layer(token_ids, attention_mask=attn_masks)
        return cont_reps, input_lengths
```
In order to use RoBERTa to initialize the weights, we have to tokenize the math word problem again with the RoBERTa tokenizer. When we tokenize the sentence, the number positions change as well, so we need to re-index the number positions at the same time; otherwise we would get the wrong number embeddings. But I can't find this part in your code. Is there any part that I have missed?
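To illustrate what I mean, here is a rough sketch (not taken from the repo) of the kind of re-indexing I would expect somewhere in the pipeline. The function name, the `NUM` placeholder for numbers, and the word-level input format are my own assumptions:

```python
from transformers import RobertaTokenizer

def reindex_num_positions(words, num_pos, tokenizer):
    """Map word-level indices in `num_pos` to token-level indices after
    RoBERTa BPE tokenization (offset by 1 for the leading <s> token)."""
    new_pos = []
    token_idx = 1  # position 0 is the <s> token
    for word_idx, word in enumerate(words):
        # Prepend a space so RoBERTa's BPE treats non-initial words as sentence-internal
        sub_tokens = tokenizer.tokenize(' ' + word) if word_idx > 0 else tokenizer.tokenize(word)
        if word_idx in num_pos:
            new_pos.append(token_idx)  # index of the first subword of the number
        token_idx += len(sub_tokens)
    return new_pos

tokenizer = RobertaTokenizer.from_pretrained('roberta-base')
words = ['There', 'are', 'NUM', 'apples', 'and', 'NUM', 'pears', '.']
print(reindex_num_positions(words, [2, 5], tokenizer))  # subword positions of the two NUM tokens
```

Without something like this, the number positions computed on the original word-level split no longer line up with the RoBERTa outputs when a word is split into several subwords.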