allenai / scifact

Data and models for the SciFact verification task.

Negative samples of FEVER pre-training #7

Closed jacklxc closed 3 years ago

jacklxc commented 3 years ago

How did you perform the fever_scifact training? The paper mentions pre-training on FEVER and then training on SciFact, but the provided FEVER dataset only includes positive examples. How is the rationale selection module pre-trained on FEVER? That is, how did you train rationale + roberta_large + fever_scifact? Did you sample or retrieve negative FEVER examples, and if so, how?

Thank you.

dwadden commented 3 years ago

I'll double-check on this and get back to you soon.

dwadden commented 3 years ago

If I understand correctly, you're asking how we trained the rationale selector in cases where there was no evidence to be found in a candidate document?

You can see the code to train the FEVER rationale selector here. For FEVER + SciFact, we just took the FEVER-trained model and then trained it further using the SciFact rationale training script.
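
For illustration, a minimal sketch of what "continue training from the FEVER checkpoint" looks like, assuming a Hugging Face-style sequence-pair classifier. The checkpoint path is hypothetical and this is not the repo's actual training script:

```python
# Hedged sketch: load a rationale selector already fine-tuned on FEVER and keep
# training it on SciFact claim-sentence pairs. Paths and names are illustrative.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

fever_checkpoint = "path/to/rationale_roberta_large_fever"  # hypothetical local checkpoint
tokenizer = AutoTokenizer.from_pretrained(fever_checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(fever_checkpoint, num_labels=2)

# From here, run the same binary (rationale / not-rationale) training loop as for
# FEVER, but feed it SciFact claim-sentence pairs instead.
```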

Concerning negative samples, our training loop for FEVER pairs each claim c with every sentence of its evidence document d and predicts whether that sentence is a rationale.

The "negative samples" are just the sentences in the evidence document d that aren't actually rationales. Indeed, most sentences in a given evidence document aren't relevant for verifying c, so this provides all the negative samples we need.

Does this make sense?

jacklxc commented 3 years ago

I see, thank you for your response. My proposed approach makes a paragraph-level prediction instead, which is why I was confused.

dwadden commented 3 years ago

Makes sense. Yes, in that case maybe generate FEVER negatives with tf-idf retrieval or something?
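
For anyone landing here later, a rough sketch of that kind of TF-IDF retrieval, assuming a dict-of-documents corpus layout. The function and variable names are hypothetical, not from the repo:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def retrieve_negative_docs(claim, corpus_docs, gold_ids, k=3):
    """Return the k most claim-similar documents that are NOT gold evidence.

    corpus_docs: dict mapping doc_id -> document text (illustrative layout).
    gold_ids: set of doc_ids annotated as evidence for this claim.
    """
    ids, texts = zip(*corpus_docs.items())
    vectorizer = TfidfVectorizer(stop_words="english")
    doc_matrix = vectorizer.fit_transform(texts)   # TF-IDF vectors for the corpus
    claim_vec = vectorizer.transform([claim])      # TF-IDF vector for the claim
    scores = cosine_similarity(claim_vec, doc_matrix).ravel()
    ranked = sorted(zip(ids, scores), key=lambda pair: -pair[1])
    # Keep top-scoring documents that contain no gold evidence; these serve as
    # hard paragraph-level negatives for the claim.
    return [doc_id for doc_id, _ in ranked if doc_id not in gold_ids][:k]
```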

I'll close this.