huggingface / trl

Train transformer language models with reinforcement learning.
http://hf.co/docs/trl
Apache License 2.0
10.04k stars 1.27k forks source link

Fine-tuning question answering #1390

Closed Aekansh-Ak closed 7 months ago

Aekansh-Ak commented 8 months ago

Let's suppose I finetune this data-:

{ "instruction": "Give three tips for staying healthy.", "input": "", "output": "1. Eat a balanced diet and make sure to include plenty of fruits and vegetables. \n2. Exercise regularly to keep your body active and strong. \n3. Get enough sleep and maintain a consistent sleep schedule." }, { "instruction": "What are the three primary colors?", "input": "", "output": "The three primary colors are red, blue, and yellow." }, { "instruction": "Describe the structure of an atom.", "input": "", "output": "An atom is made up of a nucleus, which contains protons and neutrons, surrounded by electrons that travel in orbits around the nucleus. The protons and neutrons have a positive charge, while the electrons have a negative charge, resulting in an overall neutral atom. The number of each particle determines the atomic number and the type of atom." }, { "instruction": "How can we reduce air pollution?", "input": "", "output": "There are a number of ways to reduce air pollution, such as shifting to renewable energy sources, encouraging the use of public transportation, prohibiting the burning of fossil fuels, implementing policies to reduce emissions from industrial sources, and implementing vehicle emissions standards. Additionally, individuals can do their part to reduce air pollution by reducing car use, avoiding burning materials such as wood, and changing to energy efficient appliances." }

Now when I try to ask my model a question does it have to be exactly the question provided in fine-tuning data?

For example-:

{ "instruction": "How can we reduce air pollution?", "input": "", "output": "There are a number of ways to reduce air pollution, such as shifting to renewable energy sources, encouraging the use of public transportation, prohibiting the burning of fossil fuels, implementing policies to reduce emissions from industrial sources, and implementing vehicle emissions standards. Additionally, individuals can do their part to reduce air pollution by reducing car use, avoiding burning materials such as wood, and changing to energy efficient appliances." }

If I ask what can individuals additionally do to reduce air pollution, should I expect a good output or some hallucination ?

github-actions[bot] commented 7 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.