Closed: sjy1203 closed this issue 4 years ago
WikiHop questions are multiple-choice, so it is a multiclass classification problem, not a span prediction task.
Thanks for your quick reply.
I see, so how does Longformer handle a multiclass classification problem such as WikiHop?
The paper is not very clear when it says
WikiHop uses a classification layer for the candidate
Does Longformer do multiple 0/1 classifications by concatenating the [CLS] output of the question and documents with each candidate? Or ...?
We encode the question and each candidate answer choice as [q] question [/q] [ent] candidate1 [/ent] ... [ent] candidateN [/ent].
Then we attach a linear layer with a single output score (R^1024 -> R^1) to each [ent] token, concatenate the scores for all candidates, apply softmax, and use cross-entropy loss with the correct candidate. More details are in Appendix B (https://arxiv.org/pdf/2004.05150.pdf).
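For anyone reading along, here is a rough, framework-free sketch of the scoring head described above. It is not the code from the Longformer repository. The helper names (`build_input`, `score_candidates`, `cross_entropy`) are made up for illustration, the hidden size is shrunk from 1024 to 8, and random vectors stand in for the model's hidden states at each [ent] position.

```python
import math
import random

def build_input(question, candidates):
    """Lay out the question and candidates with the special tokens from the reply."""
    parts = ["[q]", question, "[/q]"]
    for cand in candidates:
        parts += ["[ent]", cand, "[/ent]"]
    return " ".join(parts)

def score_candidates(ent_states, weight, bias):
    """Apply one shared linear layer (R^d -> R^1) to each [ent] hidden state."""
    return [sum(w * h for w, h in zip(weight, state)) + bias for state in ent_states]

def softmax(scores):
    """Numerically stable softmax over the concatenated candidate scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(scores, gold_index):
    """Negative log-probability of the correct candidate."""
    return -math.log(softmax(scores)[gold_index])

# Toy setup: all values below are assumptions for illustration only.
rng = random.Random(0)
hidden_size = 8  # 1024 in the reply above
candidates = ["candidate1", "candidate2", "candidate3"]
text = build_input("question", candidates)

# Stand-ins for the encoder outputs at the three [ent] positions.
ent_states = [[rng.gauss(0.0, 1.0) for _ in range(hidden_size)] for _ in candidates]
weight = [rng.gauss(0.0, 0.1) for _ in range(hidden_size)]
bias = 0.0

scores = score_candidates(ent_states, weight, bias)  # one score per candidate
probs = softmax(scores)                              # distribution over candidates
loss = cross_entropy(scores, gold_index=1)           # gold answer is candidate2
```

The point is that there is a single softmax over all candidates of one question, so this is one N-way classification rather than N separate 0/1 classifications.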
Thanks
Closing. Please feel free to reopen or create a new issue if you have other questions.
Hi, in your paper, it says
Does this mean Longformer predicts an answer span on WikiHop, the same way as on TriviaQA?