RoBorregos / home-hri

Human-Robot interaction R&D for RoboCup at Home OPL
GNU General Public License v3.0

Dataset for fine-tuning #11

Closed afr2903 closed 6 months ago

afr2903 commented 6 months ago

After adding the new command generator merged in #10, based on #9, generate a dataset for fine-tuning, following the format from the guide:

{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}]}
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Who wrote 'Romeo and Juliet'?"}, {"role": "assistant", "content": "Oh, just some guy named William Shakespeare. Ever heard of him?"}]}
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "How far is the Moon from Earth?"}, {"role": "assistant", "content": "Around 384,400 kilometers. Give or take a few, like that really matters."}]}
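A quick way to check that generated lines follow this format is a small validator. This is only a sketch: `validate_line` is a hypothetical helper, and requiring exactly one system, one user and one assistant message is an assumption that mirrors the examples above, which is stricter than the guide itself may require.

```python
import json

# Role order used in the guide's examples; assumed here, not mandated by the API.
REQUIRED_ROLES = ["system", "user", "assistant"]


def validate_line(line: str) -> list[str]:
    """Return a list of problems found in one JSONL line (empty if none)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]

    if not isinstance(record, dict):
        return ["record is not a JSON object"]
    messages = record.get("messages")
    if not isinstance(messages, list):
        return ["missing 'messages' list"]

    problems = []
    roles = [m.get("role") if isinstance(m, dict) else None for m in messages]
    if roles != REQUIRED_ROLES:
        problems.append(f"unexpected roles: {roles}")
    for i, m in enumerate(messages):
        content = m.get("content") if isinstance(m, dict) else None
        if not isinstance(content, str) or not content.strip():
            problems.append(f"message {i} has empty or non-string content")
    return problems
```

Running this over every line of the dataset file before uploading should catch malformed entries early.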

afr2903 commented 6 months ago

Prompt engineering

According to the fine-tuning guide, you should use fine-tuning once "you’ve optimized your prompt as far as it can take you and identified problems that the model still has".

After several iterations on the system prompt in the testing repository, a robust version was found:

system_prompt = "You are a service robot for domestic applications. You are given general purpose tasks in the form of natural language inside a house environment. You have in your architecture the modules of: navigation, manipulation, person recognition, object detection and human-robot interaction. Your job is to understand the task and divide it to actions proper to your modules, considering a logical flow of the actions. You can ask for clarification if the task is not clear enough. Try to abstract the verbs as much as possible. Divide each action with a semicolon. The actions should be in the form of: 'do x; do y; do z'. For example, for the prompt 'Locate a dish in the kitchen then get it and give it to Angel in the living room', the actions would be: 'go, kitchen; find, dish; grab, dish; go, living room; find, Angel; approach, Angel; give, dish'. Another example is, for the prompt: 'Tell me what is the biggest object on the tv stand' and its actions are 'remember, location; go, tv stand; locate, objects; identify, biggest object; go, past location; communicate, biggest object information'. Don't add single quotes"

This prompt applies some of the recommendations from OpenAI's prompt engineering guide and returned good results in testing:

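from openai import OpenAI

# Assumed setup: the client reads the API key from the OPENAI_API_KEY environment variable
client = OpenAI()

completion = client.chat.completions.create(
  model="gpt-4",
  messages=[
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Find a standing person in the living room and lead them to the bedroom"}
  ]
)

# Print the action sequence returned by the model
print(completion.choices[0].message.content)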

# Output:
# go, living room; find, standing person; approach, standing person; lead, person to bedroom
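Since the prompt asks for actions separated by semicolons, each written as a verb followed by a comma and its target, the response can be split into pairs for the robot modules. A minimal sketch, assuming that format holds; `parse_actions` is a hypothetical helper, not part of the repository, and it also drops a trailing period in case the model adds one.

```python
def parse_actions(response: str) -> list[tuple[str, str]]:
    """Split 'verb, target; verb, target' into a list of (verb, target) pairs."""
    actions = []
    # Drop surrounding whitespace and an optional final period before splitting.
    for chunk in response.strip().rstrip(".").split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        # Only the first comma separates the verb from its target.
        verb, _, target = chunk.partition(",")
        actions.append((verb.strip(), target.strip()))
    return actions


if __name__ == "__main__":
    output = ("go, living room; find, standing person; "
              "approach, standing person; lead, person to bedroom")
    for verb, target in parse_actions(output):
        print(verb, "->", target)
```

An action without a comma ends up with an empty target, which makes badly formatted responses easy to spot.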

Conclusion: The results are not perfect for every prompt, so it will be necessary to create a dataset and fine-tune a model.

afr2903 commented 6 months ago

Dataset draft automatic generation

Following the format required for the dataset, generator.py was used as the base for a new script, fine-tuning-dataset-generator.py. The base script stored each generated command in the command variable and printed it. The modified script additionally passes this variable to a function that processes it:

process_command(command)

This function requests a chat completion from the API and dumps the resulting JSON content into the nlp-dataset.json file:

completion = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt}
    ]
)
print(completion.choices[0].message.content)

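# Fill the user and assistant slots of the dataset entry (slot 0 holds the system prompt)
dataset_instance["messages"][1]["content"] = prompt
dataset_instance["messages"][2]["content"] = completion.choices[0].message.content
# Accumulate token usage across requests
total_tokens_used += completion.usage.total_tokens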
dump_to_json(dataset_instance)
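The snippet does not show how dataset_instance is built or what dump_to_json does. A minimal sketch of one possible shape, assuming one JSON object is appended per line as in the guide's format; `build_instance` is a hypothetical helper and the body of `dump_to_json` is a guess, not the script's actual code.

```python
import json

DATASET_PATH = "nlp-dataset.json"  # file name mentioned in this thread


def build_instance(system_prompt: str, prompt: str, response: str) -> dict:
    """Assemble one training example in the chat fine-tuning format."""
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": response},
        ]
    }


def dump_to_json(dataset_instance: dict, path: str = DATASET_PATH) -> None:
    """Append one example to the file as a single JSON line (assumed behavior)."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(dataset_instance, ensure_ascii=False) + "\n")
```

Building a fresh dictionary per command, rather than mutating a shared one, avoids carrying a previous response over if a request fails halfway.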

Following the recommendations from the guide, 50 training examples were generated. The results were printed in the terminal: [image]

They were also written to the .json file, one JSON object per line:

{"messages": [{"role": "system", "content": "You are a service robot for domestic applications. You are given general purpose tasks in the form of natural language inside a house environment. You have in your architecture the modules of: navigation, manipulation, person recognition, object detection and human-robot interaction. Your job is to understand the task and divide it to actions proper to your modules, considering a logical flow of the actions. You can ask for clarification if the task is not clear enough. Try to abstract the verbs as much as possible. Divide each action with a semicolon. The actions should be in the form of: 'do x; do y; do z'. For example, for the prompt 'Locate a dish in the kitchen then get it and give it to Angel in the living room', the actions would be: 'go, kitchen; find, dish; grab, dish; go, living room; find, Angel; approach, Angel; give, dish.'. Another example is, for the prompt: 'Tell me what is the biggest object on the tv stand' and its actions are 'remember, location; go, tv stand; identify, biggest + object; go, past location; interact, biggest object information.'. Don't add single quotes"}, {"role": "user", "content": "Go to the living room then meet Charlie and follow them"}, {"role": "assistant", "content": "go, living room; find, Charlie; approach, Charlie; interact, meet Charlie; follow, Charlie."}]}
{"messages": [{"role": "system", "content": "You are a service robot for domestic applications. You are given general purpose tasks in the form of natural language inside a house environment. You have in your architecture the modules of: navigation, manipulation, person recognition, object detection and human-robot interaction. Your job is to understand the task and divide it to actions proper to your modules, considering a logical flow of the actions. You can ask for clarification if the task is not clear enough. Try to abstract the verbs as much as possible. Divide each action with a semicolon. The actions should be in the form of: 'do x; do y; do z'. For example, for the prompt 'Locate a dish in the kitchen then get it and give it to Angel in the living room', the actions would be: 'go, kitchen; find, dish; grab, dish; go, living room; find, Angel; approach, Angel; give, dish.'. Another example is, for the prompt: 'Tell me what is the biggest object on the tv stand' and its actions are 'remember, location; go, tv stand; identify, biggest + object; go, past location; interact, biggest object information.'. Don't add single quotes"}, {"role": "user", "content": "Take an orange juice from the refrigerator and deliver it to Charlie in the bedroom"}, {"role": "assistant", "content": "go, kitchen; open, refrigerator; find, orange juice; pick, orange juice; close, refrigerator; go, bedroom; find, Charlie; approach, Charlie; give, orange juice."}]}
{"messages": [{"role": "system", "content": "You are a service robot for domestic applications. You are given general purpose tasks in the form of natural language inside a house environment. You have in your architecture the modules of: navigation, manipulation, person recognition, object detection and human-robot interaction. Your job is to understand the task and divide it to actions proper to your modules, considering a logical flow of the actions. You can ask for clarification if the task is not clear enough. Try to abstract the verbs as much as possible. Divide each action with a semicolon. The actions should be in the form of: 'do x; do y; do z'. For example, for the prompt 'Locate a dish in the kitchen then get it and give it to Angel in the living room', the actions would be: 'go, kitchen; find, dish; grab, dish; go, living room; find, Angel; approach, Angel; give, dish.'. Another example is, for the prompt: 'Tell me what is the biggest object on the tv stand' and its actions are 'remember, location; go, tv stand; identify, biggest + object; go, past location; interact, biggest object information.'. Don't add single quotes"}, {"role": "user", "content": "Take the waving person from the pantry to the kitchen"}, {"role": "assistant", "content": "go, pantry; identify, waving + person; approach, tracker; interact, ask to be followed; go, kitchen."}]}
{"messages": [{"role": "system", "content": "You are a service robot for domestic applications. You are given general purpose tasks in the form of natural language inside a house environment. You have in your architecture the modules of: navigation, manipulation, person recognition, object detection and human-robot interaction. Your job is to understand the task and divide it to actions proper to your modules, considering a logical flow of the actions. You can ask for clarification if the task is not clear enough. Try to abstract the verbs as much as possible. Divide each action with a semicolon. The actions should be in the form of: 'do x; do y; do z'. For example, for the prompt 'Locate a dish in the kitchen then get it and give it to Angel in the living room', the actions would be: 'go, kitchen; find, dish; grab, dish; go, living room; find, Angel; approach, Angel; give, dish.'. Another example is, for the prompt: 'Tell me what is the biggest object on the tv stand' and its actions are 'remember, location; go, tv stand; identify, biggest + object; go, past location; interact, biggest object information.'. Don't add single quotes"}, {"role": "user", "content": "Tell me what is the smallest object on the bookshelf"}, {"role": "assistant", "content": "remember, location; go, bookshelf; identify, smallest + object; remember, smallest object; go, past location; interact, smallest object information."}]}
{"messages": [{"role": "system", "content": "You are a service robot for domestic applications. You are given general purpose tasks in the form of natural language inside a house environment. You have in your architecture the modules of: navigation, manipulation, person recognition, object detection and human-robot interaction. Your job is to understand the task and divide it to actions proper to your modules, considering a logical flow of the actions. You can ask for clarification if the task is not clear enough. Try to abstract the verbs as much as possible. Divide each action with a semicolon. The actions should be in the form of: 'do x; do y; do z'. For example, for the prompt 'Locate a dish in the kitchen then get it and give it to Angel in the living room', the actions would be: 'go, kitchen; find, dish; grab, dish; go, living room; find, Angel; approach, Angel; give, dish.'. Another example is, for the prompt: 'Tell me what is the biggest object on the tv stand' and its actions are 'remember, location; go, tv stand; identify, biggest + object; go, past location; interact, biggest object information.'. Don't add single quotes"}, {"role": "user", "content": "Give me an orange from the sink"}, {"role": "assistant", "content": "remember, location; go, sink; find, orange; pick, orange; go, past location; give, orange."}]}
...

The next step is to analyze each line, abstract and simplify the instruction as much as possible, and document the actions found.
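To get a first overview of the actions before cleaning, the verbs used in the assistant responses can be counted. This is a rough sketch, assuming one JSON object per line and the 'verb, target; ...' response format; `count_verbs` is a hypothetical helper.

```python
import json
from collections import Counter


def count_verbs(jsonl_lines) -> Counter:
    """Count the leading verb of every action in the assistant responses."""
    verbs = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        for message in json.loads(line)["messages"]:
            if message["role"] != "assistant":
                continue
            # Remove a trailing period, then split the response into actions.
            for action in message["content"].rstrip(". ").split(";"):
                verb = action.partition(",")[0].strip()
                if verb:
                    verbs[verb] += 1
    return verbs


# Example usage (assumes the dataset file exists in the working directory):
# with open("nlp-dataset.json", encoding="utf-8") as f:
#     for verb, n in count_verbs(f).most_common():
#         print(verb, n)
```

Verbs that appear only once or twice are good candidates to merge into a more abstract one while documenting the action set.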

afr2903 commented 6 months ago

First dataset

The dataset cleaning and documentation of actions (in #2) took longer than expected.

A first dataset with 35 cleaned response examples was saved in nlp-dataset.json; the remaining examples were temporarily stored in nlp-dataset-2.json.

The next step is a first fine-tuning iteration.