Pinaka-ai / SchemaBottleneck


Write code for supervised fine tuning of the model (use T5) #5

Closed nightlessbaron closed 4 months ago

nightlessbaron commented 4 months ago

Input will be: situation + intention + action.
Output will be: a questionnaire of the dimensions against which we want to evaluate morality.
Make sure that you also add prompts.

Example:

INPUT =>
Situation: David is taking an exam in his calculus class that he didn't study for.
Intent: David wants to pass the exam.
Action: David takes the exam to the best of his ability and leaves comments to his teacher on the items he doesn't understand.

OUTPUT =>
Schema:
1. How honest was the agent in communicating their actions and intentions in the scenario?
2. How responsible was the agent in fulfilling their obligations in the given scenario? 
3. Are the agent's actions fair to all parties involved?
4. How much integrity did the agent demonstrate in handling the situation according to their ethical standards?
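
A minimal sketch of how one such input/output pair could be assembled as text before tokenization. The prompt wording in `PROMPT_PREFIX` and the helper name `build_example` are assumptions made for illustration; this issue does not specify them.

```python
from typing import List, Tuple

# Hypothetical task prompt: the issue asks for prompts but does not give the wording.
PROMPT_PREFIX = "Generate a schema of moral dimensions for the scenario below."


def build_example(
    situation: str, intent: str, action: str, questions: List[str]
) -> Tuple[str, str]:
    """Return a (source, target) text pair for seq2seq fine-tuning."""
    # Source: prompt followed by the three labelled fields from the example.
    source = (
        f"{PROMPT_PREFIX}\n"
        f"Situation: {situation}\n"
        f"Intent: {intent}\n"
        f"Action: {action}"
    )
    # Target: "Schema:" header followed by the numbered questions.
    numbered = "\n".join(f"{i}. {q}" for i, q in enumerate(questions, start=1))
    target = f"Schema:\n{numbered}"
    return source, target


if __name__ == "__main__":
    src, tgt = build_example(
        "David is taking an exam in his calculus class that he didn't study for.",
        "David wants to pass the exam.",
        "David takes the exam to the best of his ability and leaves comments "
        "to his teacher on the items he doesn't understand.",
        [
            "How honest was the agent in communicating their actions and "
            "intentions in the scenario?",
            "Are the agent's actions fair to all parties involved?",
        ],
    )
    print(src)
    print(tgt)
```

The resulting pairs would then be tokenized and passed to the model as inputs and labels. It is worth checking how the chosen T5 tokenizer handles newlines, since a different field separator may be needed.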

Tip: You can refer to RL4F (https://github.com/feyzaakyurek/rl4f/blob/master/scripts/training/train_text_generation.py#L51) to see how they have done this.

nightlessbaron commented 4 months ago

Merged with #9