Are there any rules when creating a dataset with ChatGPT?

Hi, the paper describes the prompt used to create the dataset in detail. In short, it is sufficient to include only the description of the behavior tree and the list of available actions. We collected approximately 600 behavior trees from the specified collection (open source robotics projects), which gave 600 generated samples. Generating additional behavior trees could raise the number of entries above 1000, but we do not recommend it; the paper discusses the problems with that approach.

Below is the general one-shot scheme we used to generate a description for each behavior tree in the collection. In each dataset entry, "instruction" is "You will be provided a summary of a task performed by a behavior tree, and your objective is to express this behavior tree in XML format.", "input" is the generated description, and "output" is the original behavior tree.

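from openai import OpenAI

client = OpenAI()  # assumes openai>=1.0; reads OPENAI_API_KEY from the environment

# One-shot prompt: a single worked example precedes the behavior tree to describe.
completion = client.chat.completions.create(
    model="gpt-3.5-turbo-1106",
    messages=[
        {"role": "system", "content": instruction},  # generation prompt (see the paper)
        {"role": "user", "content": example_user_prompt},
        {"role": "assistant", "content": example_assistant_output},
        {"role": "user", "content": user_prompt},  # the original behavior tree
    ],
    temperature=0.3,
    presence_penalty=1.5,
    max_tokens=200,
)

# One dataset entry: the generated description is the input, the original tree the output.
json_format = {
    "instruction": task,  # the fixed instruction string quoted above
    "input": completion.choices[0].message.content,
    "output": user_prompt,
}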
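
If it helps, the entry layout above can be collected into a dataset file roughly as follows. This is only a minimal sketch, not the script from the repository: the helper names `build_sample` and `save_dataset`, the placeholder tree and description, and the single-JSON-array file layout are all assumptions made for illustration.

```python
import json

# Fixed instruction string quoted in the answer above.
TASK = (
    "You will be provided a summary of a task performed by a behavior tree, "
    "and your objective is to express this behavior tree in XML format."
)


def build_sample(description: str, behavior_tree_xml: str) -> dict:
    """Pair a generated description with its original tree (hypothetical helper)."""
    return {
        "instruction": TASK,
        "input": description.strip(),
        "output": behavior_tree_xml.strip(),
    }


def save_dataset(samples: list, path: str) -> None:
    """Write all samples as one JSON array (the file layout is an assumption)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(samples, f, indent=2, ensure_ascii=False)


# Placeholder data standing in for a collected tree and a model-generated description.
example_tree = """<root main_tree_to_execute="MainTree">
  <BehaviorTree ID="MainTree">
    <Sequence>
      <Action ID="MoveTo"/>
    </Sequence>
  </BehaviorTree>
</root>"""

sample = build_sample("The robot moves to a target location.", example_tree)
```

Calling `save_dataset([sample], "dataset.json")` would then write the entries to disk; in practice the list would hold one entry per collected behavior tree.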

