timoschick / dino

This repository contains the code for "Generating Datasets with Pretrained Language Models".
https://arxiv.org/abs/2104.07540
Apache License 2.0

Fix sts-x2.json sample task #3

Closed AleksanderObuchowski closed 3 years ago

AleksanderObuchowski commented 3 years ago

When I try to run the sts-x2.json sample task, I get an error:

AssertionError: Invalid task specification: counter_label '2' for label '0.5' is not a label

I think the labels should be changed to:

{
  "task_name": "sts",
  "labels": {
    "2": {
      "instruction": "Task: Write two sentences that mean the same thing.\nSentence 1: \"<X1>\"\nSentence 2: \"",
      "counter_labels": []
    },
    "1": {
      "instruction": "Task: Write two sentences that are somewhat similar.\nSentence 1: \"<X1>\"\nSentence 2: \"",
      "counter_labels": [
        "2"
      ]
    },
    "0": {
      "instruction": "Task: Write two sentences that are on completely different topics.\nSentence 1: \"<X1>\"\nSentence 2: \"",
      "counter_labels": [
        "2",
        "1"
      ]
    }
  }
}

Instead of

{
  "task_name": "sts",
  "labels": {
    "1": {
      "instruction": "Task: Write two sentences that mean the same thing.\nSentence 1: \"<X1>\"\nSentence 2: \"",
      "counter_labels": []
    },
    "0.5": {
      "instruction": "Task: Write two sentences that are somewhat similar.\nSentence 1: \"<X1>\"\nSentence 2: \"",
      "counter_labels": [
        "2"
      ]
    },
    "0": {
      "instruction": "Task: Write two sentences that are on completely different topics.\nSentence 1: \"<X1>\"\nSentence 2: \"",
      "counter_labels": [
        "2",
        "1"
      ]
    }
  }
}
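
For illustration, the consistency check behind the error can be sketched in a few lines of Python. This is not the repository's code: `validate_task_spec` is a hypothetical helper, the error text is copied from the message above, and the instructions are shortened to "..." placeholders. It only assumes that every entry in `counter_labels` has to be a key of `labels`.

```python
def validate_task_spec(spec: dict) -> None:
    """Raise AssertionError if a counter label is not itself a label.

    Hypothetical helper, written to mirror the error message in this issue.
    """
    labels = spec["labels"]
    for label, entry in labels.items():
        for counter_label in entry.get("counter_labels", []):
            assert counter_label in labels, (
                f"Invalid task specification: counter_label '{counter_label}' "
                f"for label '{label}' is not a label"
            )


# The current sample spec, with instructions abbreviated to "...".
broken_spec = {
    "task_name": "sts",
    "labels": {
        "1": {"instruction": "...", "counter_labels": []},
        "0.5": {"instruction": "...", "counter_labels": ["2"]},
        "0": {"instruction": "...", "counter_labels": ["2", "1"]},
    },
}

try:
    validate_task_spec(broken_spec)
except AssertionError as err:
    # "2" is referenced as a counter label but is not a key of "labels".
    print(err)
```

With the labels "2", "1", "0" proposed above, the same check passes, because every counter label is then also a label.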

timoschick commented 3 years ago

Thanks! I had changed the labels from [0, 1, 2] to [0, 0.5, 1] a while ago because it makes the explanation of label smoothing in the paper a bit clearer, but I forgot to update the counter labels. Should be fixed now.
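
For anyone making a similar rename in their own task specification, here is a minimal sketch of renaming the labels and the counter labels in one step so that they cannot drift apart. `rename_labels` is a hypothetical helper, not part of the repository, and the mapping from [0, 1, 2] to [0, 0.5, 1] is taken from the comment above. The resulting counter labels are what a consistent spec would look like under that mapping, not necessarily the exact content of the committed fix.

```python
def rename_labels(spec: dict, mapping: dict) -> dict:
    """Return a copy of spec with label keys and counter labels renamed.

    Hypothetical helper: `mapping` maps each old label name to its new name.
    """
    renamed = {}
    for old_label, entry in spec["labels"].items():
        new_entry = dict(entry)
        # Rename the references together with the keys they point to.
        new_entry["counter_labels"] = [
            mapping[c] for c in entry.get("counter_labels", [])
        ]
        renamed[mapping[old_label]] = new_entry
    return {**spec, "labels": renamed}


# Spec with the old integer labels; instructions abbreviated to "...".
old_spec = {
    "task_name": "sts",
    "labels": {
        "2": {"instruction": "...", "counter_labels": []},
        "1": {"instruction": "...", "counter_labels": ["2"]},
        "0": {"instruction": "...", "counter_labels": ["2", "1"]},
    },
}

new_spec = rename_labels(old_spec, {"2": "1", "1": "0.5", "0": "0"})
for label, entry in new_spec["labels"].items():
    print(label, entry["counter_labels"])
```

Under this mapping, label "0.5" gets the counter label "1", and label "0" gets "1" and "0.5".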