salesforce / DNNC-few-shot-intent

Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference

Does the order of two sentences matter? #1


jind11 commented 3 years ago

Hi, thank you for providing this source code. I have two questions after reading it:

  1. In the file "train_dnnc.py", lines 173 and 174 show that the same pair of sentences is added to the training samples twice, with their order reversed. However, I do not think changing the order should matter to the NLI prediction, does it? Besides, the manuscript states that the total number of positive samples is NK(K-1), which does not seem to involve reversing the order.
  2. In the file "train_dnnc.py", line 189 shows that the nli_dev set actually overlaps with the nli_train set, so how can this nli_dev set detect overfitting? Thank you!
hassyGo commented 11 months ago

Thank you so much for your interest, and sorry for our late reply.

1. In the original NLI task, the order is expected to matter, because the task is directional: premise -> hypothesis. The model is also order-sensitive because of its cross-attention nature.
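To see why a cross-attention model can be order-sensitive, note that the sentence pair is encoded as a single concatenated sequence, so swapping the two sentences changes the model's input. A minimal sketch with Hugging Face `transformers` (the model name here is just an arbitrary example, not necessarily the one used in this repo):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")

premise, hypothesis = "what's my balance", "show my balance"
forward = tokenizer(premise, hypothesis)["input_ids"]
reverse = tokenizer(hypothesis, premise)["input_ids"]

# The two token sequences differ, so a cross-encoder's prediction
# for (A, B) need not equal its prediction for (B, A).
assert forward != reverse
```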

For the few-shot DNNC training, from a data augmentation viewpoint, we wanted to make the best use of the limited number of training examples. Given two examples A and B, having both the (A -> B) and (B -> A) directions lets us simulate both cases: 1) A is the input, and 2) B is the input.

Regarding the number of positive training examples, the number described in the paper is correct; the "K(K-1)" factor already accounts for the order (there are K(K-1)/2 unordered pairs per class, each used in both directions), as in the sketch below.
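For illustration, here is a minimal sketch of this pairing scheme (the function and variable names are hypothetical, not the repository's actual code): with N intents and K examples per intent, emitting both directions of every within-intent pair yields N * K * (K-1) positive pairs.

```python
from itertools import permutations

def build_positive_pairs(examples_by_intent):
    """Build entailment (positive) pairs from few-shot examples.

    examples_by_intent: dict mapping intent label -> list of K utterances.
    permutations(..., 2) yields every ordered pair (a, b) with a != b,
    so each intent contributes K * (K - 1) ordered positive pairs.
    """
    pairs = []
    for intent, examples in examples_by_intent.items():
        for a, b in permutations(examples, 2):
            pairs.append((a, b, "entailment"))
    return pairs

# Example: N = 2 intents, K = 3 examples each -> 2 * 3 * 2 = 12 positive pairs
demo = {
    "check_balance": ["what's my balance", "how much money do I have", "show my balance"],
    "transfer": ["send money to Bob", "transfer $50", "wire funds to my savings"],
}
assert len(build_positive_pairs(demo)) == 12
```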

2. As briefly mentioned in the comment, the role of the "NLI validation examples" is to let us check whether the model is at least fitting the synthetic/artificial task. We observed that overfitting (i.e., achieving very high accuracy) on the training set is crucial in the few-shot setup, so we decided to simply monitor the accuracy on a small subset of the training set. It might not be the best approach, but it is not realistic to assume a sufficient amount of separate validation examples.

We also tried avoiding the overlap between "nli_train_examples" and "nli_dev_examples", but I remember that this hurt the model's accuracy, because removing even a small number of training examples is significant in the few-shot setup.
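A minimal sketch of this monitoring strategy (names hypothetical, not the repository's actual code): draw a small subset of the NLI training pairs and track accuracy on it during training, accepting the overlap deliberately.

```python
import random

def sample_dev_subset(nli_train_examples, dev_size=100, seed=42):
    """Draw a small monitoring subset from the training pairs.

    The subset intentionally overlaps with the training set: the goal is
    not to measure generalization, but to confirm the model is fitting
    (reaching near-perfect accuracy on) the synthetic NLI task.
    """
    rng = random.Random(seed)
    size = min(dev_size, len(nli_train_examples))
    return rng.sample(nli_train_examples, size)
```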

Thanks, Kazuma