wjko2 / Domain-Agnostic-Sentence-Specificity-Prediction


Applying the model #14

Open · JoshuaMathias opened this issue 2 years ago

JoshuaMathias commented 2 years ago

@wjko2 How do I apply the model on a single new text? I don't understand the variable names in the arguments.

Here's what I tried:

    with torch.no_grad():    
        model = torch.load(model_filepath)
        model.eval()
        score = model(text)

This is the error I got:

    def _call_impl(self, *input, **kwargs):
        forward_call = (self._slow_forward if torch._C._get_tracing_state() else self.forward)
        # If we don't have any hooks, we want to skip the rest of the logic in
        # this function, and just call forward.
        if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
                or _global_forward_hooks or _global_forward_pre_hooks):
>           return forward_call(*input, **kwargs)
E           TypeError: PDTBNet.forward() missing 1 required positional argument: 'ss'

I see this corresponding code in models.py for the class PDTBNet:

    def forward(self, s1,ss):
        # s1 : (s1, s1_len)
        u = self.encoder(s1)
        #v = self.encoder(s2)
        features = torch.cat((u,self.bn(ss)), 1)
        output = self.classifier(features)
        #output = self.classifier(u)
        return output

So, I can tell that I'm missing a second argument, but these are my questions:

  1. What does ss mean and what is the expected data format for it?
  2. If ss is features related to the text s1, how do I create ss from s1?
  3. Is a sentence expected for s1 or does it expect a tuple (s1, s1_len)?
JoshuaMathias commented 2 years ago

Answer to question 3:

From this error I see that s1 is meant to be a tuple of (text, length of text):

self = BLSTMEncoder(
  (enc_lstm): LSTM(300, 100, num_layers=3, dropout=0.5, bidirectional=True)
)
sent_tuple = 'This is a sentence.'

    def forward(self, sent_tuple):
        # sent_len: [max_len, ..., min_len] (bsize)
        # sent: Variable(seqlen x bsize x worddim)
>       sent, sent_len = sent_tuple
E       ValueError: too many values to unpack (expected 2)
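
So the encoder wants an (embeddings, lengths) pair, not a raw string. Here's a minimal sketch of the shape contract (the sizes and variable names are my assumptions, based on the comments in BLSTMEncoder.forward):

    import torch

    seqlen, bsize, worddim = 6, 1, 300          # worddim 300 matches the GloVe vectors
    sent = torch.zeros(seqlen, bsize, worddim)  # seqlen x bsize x worddim, per the docstring
    sent_len = [seqlen]                         # one length per sentence in the batch
    # encoder((sent, sent_len)) unpacks cleanly; encoder('This is a sentence.') does not.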
JoshuaMathias commented 2 years ago

It appears that scoring a new sentence involves the following (a consolidated sketch of all six steps follows this list):

  1. Prepare linguistic features on the text:
    1. ModelNewText(0,0,0) - Initialize with stub data.
    2. ModelNewText.loadSentences("new", [sentence])
    3. ModelNewText.fShallow() - Prepare and store in memory features for the sentences loaded.
    4. labels, features = ModelNewText.transformShallow() - Turn the features into a numpy vector.
  2. Load GloVe word vectors:
     word_vectors = {}
     with open(word_vectors_path, encoding="utf8") as file:
         for line in file:
             word, vec = line.split(' ', 1)
             if word:
                 word_vectors[word] = np.array(list(map(float, vec.split())))
  3. Prepare a torch vector of the GloVe word vectors for the sentence and a vector of sentence lengths.
    1. sentence_embeddings, sentence_lens = get_batch([sentence], word_vectors, 300)
  4. Prepare Torch Variable for the features:
    1. params.sf = 1 # What is this?
    2. feature_vector = torch.from_numpy(features).float()*params.sf
    3. feature_variable = Variable(feature_vector).cuda()
  5. Provide features to the model forward function:
    1. model = torch.load(model_path)
    2. model.eval()
    3. output = model((sentence_embeddings, sentence_lens), feature_variable)
  6. Calculate a specificity score from the model output:
    1. scores = torch.nn.functional.softmax(output, dim=1)
    2. sentence_score = scores[0]
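
Putting the steps together, here's a consolidated sketch (untested; ModelNewText and get_batch are from this repo but the import paths are my guesses, word_vectors_path and model_path are placeholders, and sf stands in for params.sf, whose purpose I still don't know):

    import numpy as np
    import torch
    from torch.autograd import Variable

    from generatefeatures import ModelNewText  # assumed module path
    from data import get_batch                 # assumed module path

    def score_sentence(sentence, word_vectors_path, model_path):
        # 1. Shallow linguistic features for the sentence.
        featurizer = ModelNewText(0, 0, 0)  # initialized with stub data, as above
        featurizer.loadSentences("new", [sentence])
        featurizer.fShallow()
        labels, features = featurizer.transformShallow()

        # 2. GloVe word vectors.
        word_vectors = {}
        with open(word_vectors_path, encoding="utf8") as file:
            for line in file:
                word, vec = line.split(' ', 1)
                if word:
                    word_vectors[word] = np.array(list(map(float, vec.split())))

        # 3. Torch tensor of word vectors plus sentence lengths.
        sentence_embeddings, sentence_lens = get_batch([sentence], word_vectors, 300)

        # 4. Feature variable, scaled by sf (stand-in for params.sf).
        sf = 1
        feature_variable = Variable(torch.from_numpy(features).float() * sf).cuda()

        # 5. Forward pass: (embeddings, lengths) tuple plus the feature variable.
        model = torch.load(model_path)
        model.eval()
        with torch.no_grad():
            output = model((sentence_embeddings, sentence_lens), feature_variable)

        # 6. Normalize the class outputs to probabilities.
        scores = torch.nn.functional.softmax(output, dim=1)
        return scores[0]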
JoshuaMathias commented 2 years ago

I created a new function that does all of the above on a list of sentence strings (the new __call__ function in generatefeatures.py: https://github.com/JoshuaMathias/Domain-Agnostic-Sentence-Specificity-Prediction/blob/master/generatefeatures.py), but I'm stuck at the last part: getting the actual specificity score. The paper describes normalizing ratings from 1-5 to 0-1, but I don't see this reflected in the code. When I apply the model using its forward function, the output I get is 26292.9473, 3277.1140, and the numbers are exactly the same for every sample text I give it. If I apply softmax like the code does in evaluate(), I get 1, 0, but my understanding is that softmax is not useful here for getting the actual predicted value. Yet it's the softmax output that is written out in the evaluation code. I'm quite stumped.
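
(For what it's worth, the 1-5 to 0-1 normalization the paper describes would presumably be a simple rescaling; this is my guess at what is meant, not code from the repo:)

    def normalize_rating(rating, lo=1.0, hi=5.0):
        # Map a rating on the paper's 1-5 scale onto [0, 1].
        return (rating - lo) / (hi - lo)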

@wjko2 How do we get the continuous value specificity score?

JoshuaMathias commented 2 years ago

Update: I just reviewed the code for PDTBNet and found this:

            self.classifier = nn.Sequential(
                nn.Linear(self.inputdim, self.fc_dim),
                nn.Linear(self.fc_dim, self.fc_dim),
                nn.Linear(self.fc_dim, self.n_classes)
                )

I realized from this that the last layer of the model produces a relative continuous value (a logit) for each class, where n_classes defaults to 2. I used to think the n_classes parameter was only for the supervised training portion, which I suppose it is. So now I understand the softmax operation as simply normalizing the class values into probabilities between 0 and 1, and you use the class with the highest value as the specificity score.
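
For the continuous value, one option would be to read off the softmax probability of one class rather than taking the argmax. A sketch (treating index 1 as the "specific" class is my assumption):

    # output: (batch_size, n_classes) logits from model.forward
    probs = torch.nn.functional.softmax(output, dim=1)
    specificity_score = probs[0, 1].item()  # probability of the (assumed) "specific" class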

It seems we would ideally have two n_classes parameters: one for the training data and one for the final output of the unsupervised model. But I only see one place in the model where the number of classes is chosen, which is the code I quoted above.

In my case I can convert all the training data to the number of labels I want (1 through 4) and keep that setting consistent throughout. But since a main stated goal of the paper was to use data with 2 classes to train a model that produces a continuous value usable for more fine-grained ratings, I must be missing something. I think what I'm missing is which parts of the code are the supervised training versus the unsupervised training.