AIRLab-POLIMI / BTGenBot

BTGenBot: a system to generate behavior trees for robots using lightweight (~7 billion parameters) large language models (LLMs)
MIT License

Are there any rules when creating a dataset with ChatGPT? #9

Closed. sea-hot closed this issue 2 months ago

sea-hot commented 3 months ago

Are there any rules when creating a dataset with ChatGPT? For example, are there directive keywords like place, action, object, or rules for classification? I would appreciate it if you could provide the guidelines for creating a dataset with more than 1000 entries.

RiccardoIzzo commented 3 months ago

Hi, the paper provides detailed information about the prompt used to create the dataset. For clarity, it is sufficient to include only the description of the behavior tree and the list of available actions.

We collected approximately 600 behavior trees from the specified collection (open source robotics projects), which gave us 600 generated samples. Generating behavior trees as well could push the number of entries past 1000, but we do not recommend it; the paper discusses the problems with that approach.

Below is the general one-shot scheme we used to generate a description for each behavior tree in the collection. In each dataset entry, "instruction" is "You will be provided a summary of a task performed by a behavior tree, and your objective is to express this behavior tree in XML format.", "input" is the generated description, and "output" is the original behavior tree.

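from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# instruction, example_user_prompt, example_assistant_output: the one-shot prompt
# user_prompt: the original behavior tree to be described
completion = client.chat.completions.create(
    model="gpt-3.5-turbo-1106",
    messages=[
        {"role": "system", "content": instruction},
        {"role": "user", "content": example_user_prompt},
        {"role": "assistant", "content": example_assistant_output},
        {"role": "user", "content": user_prompt},
    ],
    temperature=0.3,
    presence_penalty=1.5,
    max_tokens=200,
)

# One dataset entry: the generated description is the input,
# the original behavior tree is the output.
json_format = {
    "instruction": task,  # the fixed "You will be provided a summary ..." string
    "input": completion.choices[0].message.content,
    "output": user_prompt,
}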
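
For anyone who wants to turn the scheme above into a full dataset file, here is a minimal sketch of how the per-tree entries could be collected and saved. It is only an illustration, not the code we used for the paper: the helper names (build_entry, build_dataset, save_dataset) and the describe callable are hypothetical. describe stands in for the chat completion call above and should return the generated description for one behavior tree.

```python
import json
from typing import Callable, Dict, Iterable, List

# Dataset-level instruction quoted in the comment above.
TASK = (
    "You will be provided a summary of a task performed by a behavior tree, "
    "and your objective is to express this behavior tree in XML format."
)


def build_entry(description: str, tree_xml: str) -> Dict[str, str]:
    """Pair a generated description (input) with its original tree (output)."""
    return {"instruction": TASK, "input": description, "output": tree_xml}


def build_dataset(
    trees: Iterable[str], describe: Callable[[str], str]
) -> List[Dict[str, str]]:
    """Create one entry per behavior tree.

    `describe` is a hypothetical callable wrapping the LLM request; it takes
    the XML of one tree and returns its natural-language description.
    """
    return [build_entry(describe(tree), tree) for tree in trees]


def save_dataset(entries: List[Dict[str, str]], path: str) -> None:
    """Write all entries to a single JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(entries, f, indent=2, ensure_ascii=False)
```

With this layout the dataset has exactly one entry per collected tree, so about 600 trees give about 600 samples. Passing a stub such as lambda tree: "placeholder description" as describe lets you check the file format before spending any API calls.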