omnilabNLP / LogicNLI


Failure to reproduce the results #2

Open marksilver6 opened 2 years ago

marksilver6 commented 2 years ago

Hi, I am trying to reproduce the results reported in the paper. I tried the BERT-base model as the backbone, using the BertForSequenceClassification class provided in HF with num_labels=4, and used "facts rules [SEP] statement i" as the input to train the model. However, the accuracy is always 25% (the model outputs a constant label for all examples). Is it possible to provide the code to reproduce the paper results?
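For concreteness, the setup described above roughly corresponds to the sketch below. Only the model class, num_labels=4, and the "facts rules [SEP] statement" input pattern come from the comment; the field names, padding settings, and example values are placeholders.

```python
# Minimal sketch of the reported setup: BERT-base with a 4-way classification
# head, input formatted as "facts rules [SEP] statement".
# The field names (facts, rules, statement) are placeholders.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=4)

def encode(example):
    context = example["facts"] + " " + example["rules"]
    # passing two text segments makes the tokenizer insert [SEP] between them
    return tokenizer(context, example["statement"],
                     truncation=True, padding="max_length", max_length=512,
                     return_tensors="pt")

batch = encode({"facts": "Alice is kind.",
                "rules": "If someone is kind, they are happy.",
                "statement": "Alice is happy."})
logits = model(**batch).logits  # shape: (1, 4)
```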

frankTian92 commented 1 year ago

We are sorry that two hyper-parameters were provided in a misleading way. You can try setting the learning rate to 0.0001 and the learning-rate decay to 1.0.
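Read literally, a decay factor of 1.0 means the learning rate is held constant at 1e-4. A minimal sketch of such a schedule, assuming AdamW and an exponential decay scheduler (the optimizer actually used in the repo is not shown in this thread):

```python
import torch

# Toy model stands in for the BERT model; only the schedule matters here.
model = torch.nn.Linear(10, 4)

# Suggested settings: learning rate 1e-4; a decay factor of 1.0 keeps it constant.
# The choice of AdamW is an assumption, not taken from the repo.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=1.0)

for epoch in range(3):
    # ... run one training epoch here ...
    scheduler.step()  # with gamma=1.0 the learning rate does not change
```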

Code of the model:

```python
import torch
from transformers import BertModel

class BertTaker(torch.nn.Module):
    def __init__(self, in_dim=1024, out_dim=2):
        super(BertTaker, self).__init__()
        # bert-large-uncased has a hidden size of 1024, matching in_dim
        self.bert = BertModel.from_pretrained('bert-large-uncased')
        self.bert.train()
        self.dropout = torch.nn.Dropout(0.5)
        # two-layer classification head on top of the [CLS] embedding
        self.probe = torch.nn.Sequential(
            torch.nn.Linear(in_dim, int(in_dim / 4)),
            torch.nn.ReLU(),
            torch.nn.Linear(int(in_dim / 4), out_dim))
        torch.nn.init.xavier_normal_(self.probe[0].weight)
        torch.nn.init.uniform_(self.probe[0].bias, -0.2, 0.2)
        torch.nn.init.xavier_normal_(self.probe[2].weight)
        torch.nn.init.uniform_(self.probe[2].bias, -0.2, 0.2)

    def forward(self, batch):
        (x, m, s), _, _ = batch

        x_bert = self.bert(input_ids=x, attention_mask=m)[0]
        x_emb = x_bert[:, 0, :]  # [CLS] token representation

        x_emb = self.dropout(x_emb)
        pred = self.probe(x_emb)
        return pred
```
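A rough usage sketch for the class above, assuming the (x, m, s) tuple holds input_ids, attention_mask, and token_type_ids (the third tensor is unused in forward, so that reading is an assumption, as is the exact batch structure):

```python
# Illustrative only: builds a dummy batch matching the ((x, m, s), _, _)
# structure that forward() unpacks. Real inputs would come from the data loader.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-large-uncased')
model = BertTaker()

enc = tokenizer("facts rules", "statement", return_tensors='pt',
                padding='max_length', max_length=128, truncation=True)
x, m = enc['input_ids'], enc['attention_mask']
s = enc['token_type_ids']          # assumed meaning of the unused third tensor
batch = ((x, m, s), None, None)    # the last two elements are ignored by forward()

logits = model(batch)              # shape: (1, out_dim), i.e. (1, 2) by default
```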