Open Nick9214 opened 3 years ago
This class (like the other ...ForSequenceClassification classes) puts a classification head on top of the base model and returns label logits, which you can turn into probabilities with softmax. Tokenize your text with a tokenizer instance, then pass the resulting input_ids to the model. If you also pass the true labels, the model returns the loss value as well. I recommend loading a pretrained Longformer (pretrained on any task) and then fine-tuning the uninitialized layers (and the other layers too) on your classification task. You might get similar accuracy by training a randomly initialized Longformer from scratch, but I haven't tried that.
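To make the inputs and outputs concrete, here is a rough sketch of a forward pass. It assumes the Hugging Face transformers library and PyTorch. So that it runs without downloading anything, it builds a tiny, randomly initialized model from a made-up config and feeds it random token IDs instead of tokenized text; the sizes, the three labels, and the label values are illustrative assumptions only. In real use you would replace those parts with a tokenizer and from_pretrained, as hinted in the comments.

```python
import torch
from transformers import LongformerConfig, LongformerForSequenceClassification

# Real use would look roughly like this (downloads weights; checkpoint name is an example):
#   tokenizer = AutoTokenizer.from_pretrained("allenai/longformer-base-4096")
#   model = LongformerForSequenceClassification.from_pretrained(
#       "allenai/longformer-base-4096", num_labels=3)
#   enc = tokenizer(["some long text"], return_tensors="pt", truncation=True)

# Tiny config with assumed, illustrative sizes so the sketch runs offline.
config = LongformerConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    max_position_embeddings=64,
    attention_window=8,   # must be even; sequence is padded to a multiple of it
    num_labels=3,         # hypothetical 3-class task
)
model = LongformerForSequenceClassification(config)  # random weights, untrained
model.eval()

# Stand-in for tokenizer output: 2 sequences of 16 token IDs (avoiding special IDs 0-2).
torch.manual_seed(0)
input_ids = torch.randint(3, config.vocab_size, (2, 16))
attention_mask = torch.ones_like(input_ids)
labels = torch.tensor([0, 2])  # hypothetical true labels, one per sequence

with torch.no_grad():
    outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)

logits = outputs.logits                # shape: (batch_size, num_labels)
probs = torch.softmax(logits, dim=-1)  # each row sums to 1
loss = outputs.loss                    # returned only because labels were passed

print(logits.shape, probs.sum(dim=-1), loss.item())
```

Because the weights here are random, the probabilities are meaningless; the point is only the shape of the call. For fine-tuning you would drop torch.no_grad(), call loss.backward(), and step an optimizer.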
Could someone explain what exactly this class does? Is it possible to get the classification output without pretraining? (Pretraining takes too long on a Colab GPU, and I need something I can run there.)