chaojiang06 / wiki-auto

Neural CRF Model for Sentence Alignment in Text Simplification

Predicted sequence contains all 0s #2

Open schan27 opened 4 years ago

schan27 commented 4 years ago

Hi there,

Thank you for making this code available! I am wondering if you might have some insight into why the following example produces a predicted sequence of all 0s. This is using the pretrained BERT_wiki checkpoint:

from transformers import BertForSequenceClassification, BertTokenizer
from model import NeuralWordAligner
import torch

my_device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
BERT_folder = "/path/to/BERT_wiki"
tokenizer = BertTokenizer.from_pretrained(BERT_folder,
                                            do_lower_case=True)
bert_for_sent_seq_model = BertForSequenceClassification.from_pretrained(BERT_folder,
                                                                        output_hidden_states=True)
model = NeuralWordAligner(bert_for_sent_seq_model=bert_for_sent_seq_model,
                            tokenizer=tokenizer)
bert_for_sent_seq_model = bert_for_sent_seq_model.to(my_device)
model = model.to(my_device)
bert_for_sent_seq_model.eval()
model.eval()
sents1 = ["The Local Government Act 1985 was an Act of Parliament in the United Kingdom.", "All of the authorities were controlled by, or came under the control of the opposition Labour Party during Thatcher's first term.", "Its proposals formed the basis of the Local Government Bill."]
sents2 = ["The main provision, section 1 stated that \"the Greater London Council; and the metropolitan county councils\" shall not exist anymore.", "The Local Government Act 1985 was an Act of Parliament in the United Kingdom.", "It came into effect on 1 April 1986."]
_, _, alignment = model(sents1, sents2, None)

The sentences in sents1 and sents2 are from wiki-manual/train.tsv. I would expect sents1[0] to be aligned to sents2[1], since that pair is labeled aligned in the data, but here is the output at this point:

output_both: tensor([[ 0.0720, -0.2818,  0.0411,  0.0066],
        [ 0.1266,  0.1336,  0.1619,  0.0066],
        [ 0.1359,  0.1281,  0.1375,  0.0066]], device='cuda:0',
       grad_fn=<SqueezeBackward1>)
transition_matrix: tensor([[-0.0223, -0.4793, -0.4793, -0.4793],
        [-0.3270, -0.7045, -1.0150, -1.3255],
        [-0.3270, -0.3940, -0.7045, -1.0150],
        [-0.3270, -0.7045, -0.3940, -0.7045]], device='cuda:0',
       grad_fn=<ViewBackward>)
len_A: 3
extended_length_B: 3
return_sequence: [0, 0, 0]
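For reference, decoding the printed tensors with a standalone Viterbi sketch does reproduce [0, 0, 0]. This is only a sketch of generic Viterbi decoding, not the repo's actual decode routine, and it assumes transitions are indexed as transition_matrix[prev_label][next_label]; under those assumptions, the all-0 path wins because the diagonal entry for label 0 (-0.0223) makes it much cheaper to stay at label 0 than to move to any other label:

```python
import torch

# Values copied from the debug output above.
emissions = torch.tensor([
    [ 0.0720, -0.2818,  0.0411,  0.0066],
    [ 0.1266,  0.1336,  0.1619,  0.0066],
    [ 0.1359,  0.1281,  0.1375,  0.0066],
])
transitions = torch.tensor([
    [-0.0223, -0.4793, -0.4793, -0.4793],
    [-0.3270, -0.7045, -1.0150, -1.3255],
    [-0.3270, -0.3940, -0.7045, -1.0150],
    [-0.3270, -0.7045, -0.3940, -0.7045],
])

def viterbi(emissions, transitions):
    """Highest-scoring label sequence under score = sum of emission and transition terms."""
    n_steps, n_labels = emissions.shape
    scores = emissions[0].clone()
    backpointers = []
    for t in range(1, n_steps):
        # cand[i, j] = best score so far ending in label i, plus the cost of moving to j
        cand = scores.unsqueeze(1) + transitions
        best, idx = cand.max(dim=0)
        scores = best + emissions[t]
        backpointers.append(idx)
    # Follow backpointers from the best final label to recover the full path.
    path = [int(scores.argmax())]
    for idx in reversed(backpointers):
        path.append(int(idx[path[-1]]))
    return path[::-1]

print(viterbi(emissions, transitions))  # → [0, 0, 0]
```

So if label 0 is the "no alignment" state, the decode is behaving consistently with these scores, and the question becomes why the emission scores for the truly aligned pair (row 0, column 1 here) are so low.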

Any help would be greatly appreciated! Thanks again.

imurs34 commented 1 year ago

Having the same issue! Have you resolved it?