Closed NielsRogge closed 4 years ago
The output of the TabFact models is a simple classification layer.
That's the equation in Section 2 of *Understanding tables with intermediate pre-training*. In the code, it happens in `compute_classification_logits`.
Yes, so as I understand it: first, the hidden representation of the [CLS] token is converted into another vector of size 768 by the pooling layer (whose weights were pre-trained); this vector is then multiplied by `output_weights_cls`, and `output_bias_cls` is added, as seen in `compute_classification_logits`.
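A minimal sketch of that head in NumPy (the shapes and the two-class output are assumptions based on the discussion; the real implementation lives in `compute_classification_logits` in the TAPAS codebase):

```python
import numpy as np

def compute_classification_logits(pooled_cls, output_weights_cls, output_bias_cls):
    # Classification head as described above:
    # logits = pooled_cls @ W^T + b, applied to the pooled [CLS] vector.
    return pooled_cls @ output_weights_cls.T + output_bias_cls

hidden_size, num_classes = 768, 2  # TabFact is binary entailment
rng = np.random.default_rng(0)
pooled = rng.standard_normal((1, hidden_size))      # pooled [CLS] representation
W = rng.standard_normal((num_classes, hidden_size))  # output_weights_cls
b = np.zeros(num_classes)                            # output_bias_cls
logits = compute_classification_logits(pooled, W, b)
print(logits.shape)  # (1, 2)
```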
In other words, `output_weights` and `output_bias` (even though they are in the checkpoint, as shown above) are not used for TabFact, right?
Yes, I misunderstood your question, sorry!
You are correct: `output_weights` and `output_bias` are not used by the TabFact model.
Hey!
As I'm further converting TensorFlow checkpoints to their PyTorch counterparts, I have a question related to the TabFact checkpoints. When I print out the variables of the TabFact (base with reset + intermediate pre-training) checkpoint, these are the last ones:
I understand why `output_weights_cls` and `output_bias_cls` are here; however, why are `output_weights` and `output_bias` here? Aren't these related to cell selection, which TabFact does not require?
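One way to see which head a checkpoint variable belongs to is to group the names by suffix. The variable names below are assumptions based on this thread (in practice, `tf.train.list_variables(checkpoint_path)` would supply the real list):

```python
# Hypothetical variable names; the actual checkpoint may contain more.
checkpoint_vars = [
    "bert/pooler/dense/kernel",
    "output_weights",       # cell-selection head (unused for TabFact)
    "output_bias",          # cell-selection head (unused for TabFact)
    "output_weights_cls",   # classification head (used for TabFact)
    "output_bias_cls",      # classification head (used for TabFact)
]

# The *_cls variables form the classification head; the rest of the
# output_* variables belong to the cell-selection head.
cls_head = [v for v in checkpoint_vars if v.endswith("_cls")]
print(cls_head)  # ['output_weights_cls', 'output_bias_cls']
```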