omnilabNLP / LogicNLI


Model codes & accuracy #1

Open saravananpsg opened 2 years ago

saravananpsg commented 2 years ago

I have tried to replicate the model as described in the paper: "fine-tune the large versions of LMs with the same hidden size (1024) and adopt a two-layer perceptron to predict the logical relation", with inputs formatted as “[CLS] facts rules [SEP] statement [SEP]” for BERT and RoBERTa.
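For example, an input pair in that format can be built with the Hugging Face tokenizer roughly like the sketch below (the facts/statement text is a placeholder and the exact tokenizer settings are my assumptions, not from the paper):

```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-large-uncased')

facts_rules = "Alice is kind. If someone is kind then someone is happy."  # placeholder facts + rules
statement = "Alice is happy."                                             # placeholder statement

# Encoding the text pair yields "[CLS] facts rules [SEP] statement [SEP]"
enc = tokenizer(
    facts_rules,
    statement,
    padding='max_length',
    truncation=True,
    max_length=512,
    return_tensors='pt',
)
input_ids, attention_mask = enc['input_ids'], enc['attention_mask']
```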

The BERT model shows a very low accuracy of 0.25, and the RoBERTa base version 0.52. Also, when I added two hidden layers, the model gives only 25% accuracy. May I know why there is such a huge variation in accuracy? Would the large version of RoBERTa give better results?

Please share your model codes!

Thank you

frankTian92 commented 1 year ago

We are sorry that two of the hyper-parameters were provided in a misleading way. You can try setting the learning rate to 0.0001 and the decay of the learning rate to 1.0.
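A minimal sketch of how those two values could be wired up in PyTorch (the optimizer choice, epoch count, and the use of `ExponentialLR` are assumptions; with `gamma=1.0` the learning rate simply stays constant at 1e-4):

```python
import torch

model = BertTaker()  # the model class posted below
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)                  # learning rate 0.0001
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=1.0)   # decay 1.0 => constant lr

num_epochs = 10  # placeholder
for epoch in range(num_epochs):
    # ... run the training batches for this epoch ...
    scheduler.step()
```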

Code of the model:

```python
import torch
from transformers import BertModel

class BertTaker(torch.nn.Module):
    def __init__(self, in_dim=1024, out_dim=2):
        super(BertTaker, self).__init__()
        self.bert = BertModel.from_pretrained('bert-large-uncased')
        self.bert.train()
        self.dropout = torch.nn.Dropout(0.5)
        # two-layer perceptron on top of the [CLS] embedding
        self.probe = torch.nn.Sequential(
            torch.nn.Linear(in_dim, int(in_dim / 4)),
            torch.nn.ReLU(),
            torch.nn.Linear(int(in_dim / 4), out_dim),
        )
        torch.nn.init.xavier_normal_(self.probe[0].weight)
        torch.nn.init.uniform_(self.probe[0].bias, -0.2, 0.2)
        torch.nn.init.xavier_normal_(self.probe[2].weight)
        torch.nn.init.uniform_(self.probe[2].bias, -0.2, 0.2)

    def forward(self, batch):
        # batch is ((input_ids, attention_mask, segment_ids), ..., ...);
        # only input_ids and attention_mask are used here
        (x, m, s), _, _ = batch

        x_bert = self.bert(input_ids=x, attention_mask=m)[0]  # last hidden states
        x_emb = x_bert[:, 0, :]                               # [CLS] token embedding

        x_emb = self.dropout(x_emb)
        pred = self.probe(x_emb)
        return pred
```
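A hypothetical usage sketch, showing the batch layout that `forward` expects (the trailing two elements of the batch tuple are not used there, so they are left as `None`; the example text is a placeholder):

```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-large-uncased')
model = BertTaker()

enc = tokenizer("facts rules", "statement", padding='max_length',
                truncation=True, max_length=128, return_tensors='pt')
batch = ((enc['input_ids'], enc['attention_mask'], enc['token_type_ids']),
         None, None)

logits = model(batch)  # shape: (batch_size, out_dim)
```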