allenai / longformer

Longformer: The Long-Document Transformer
https://arxiv.org/abs/2004.05150
Apache License 2.0

LongformerForSequenceClassification explanation #207

Open Nick9214 opened 3 years ago

Nick9214 commented 3 years ago

Could someone explain what exactly this class does? Is it possible to get the classification output without pretraining? (Pretraining takes too long on a Colab GPU; I need something I can run there.)

kyouma commented 3 years ago

This class (and the other ...ForSequenceClassification classes) is used to get label logits, which can then be turned into probabilities with softmax. You must tokenize your text with a Tokenizer class instance and then pass the input_ids to the model. If you also pass the true labels, the model returns the loss value as well. I recommend loading a Longformer pretrained on any task and then fine-tuning the uninitialized classification head (and the other layers too) on your classification task. You might be able to fine-tune an uninitialized Longformer and get similar accuracy, but I haven't tried this.
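For reference, here is a minimal sketch of that workflow using the HuggingFace `transformers` API (assuming the public `allenai/longformer-base-4096` checkpoint and a hypothetical 2-label task; adjust `num_labels` and the input text to your own data):

```python
import torch
from transformers import LongformerTokenizer, LongformerForSequenceClassification

# Load a pretrained Longformer; the classification head on top is randomly
# initialized and needs fine-tuning on your own labels.
tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")
model = LongformerForSequenceClassification.from_pretrained(
    "allenai/longformer-base-4096", num_labels=2
)

# Tokenize the text; the tokenizer returns input_ids and attention_mask tensors.
inputs = tokenizer("Some long document ...", return_tensors="pt")

# Without labels: the model returns logits over the labels,
# which softmax turns into probabilities.
with torch.no_grad():
    outputs = model(**inputs)
probs = torch.softmax(outputs.logits, dim=-1)

# With labels: the model also returns a cross-entropy loss
# that you can backpropagate through when fine-tuning.
labels = torch.tensor([1])
outputs = model(**inputs, labels=labels)
print(outputs.loss, outputs.logits)
```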