Pinaka-ai / SchemaBottleneck

Build RL Policy for schema generation #6

Closed nightlessbaron closed 5 months ago

nightlessbaron commented 5 months ago

[ ] Use the checkpoints from Alphabetize RL4F for now, and build the pipeline that trains the schema generator (T5) from a reward signal.
[ ] Add more tasks as you explore the pipeline.

Pipeline should look like this:

Input: Scenario
Schemas = TaskSG(Scenario)
Output = TaskLM(Scenario, schemas)
Reward = RewardFN(Output, HumanScore)
Update = TaskSG(Reward)
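
The loop above can be sketched in plain Python. This is only a toy illustration under stated assumptions, not the repository's implementation: `ToyTaskSG` replaces the T5 schema generator with a softmax policy over three made-up candidate schemas, `task_lm` and `reward_fn` are hypothetical stubs, the human scorer is invented, and the update rule is plain REINFORCE with a moving-average baseline (the issue does not say which RL algorithm is intended).

```python
import math
import random

# Every name here is a hypothetical stand-in for illustration only.
CANDIDATE_SCHEMAS = ["who-did-what", "cause-effect", "timeline"]


class ToyTaskSG:
    """Stand-in for the T5 schema generator: a softmax policy over fixed candidates."""

    def __init__(self, n_actions, lr=0.5):
        self.logits = [0.0] * n_actions
        self.lr = lr

    def probs(self):
        m = max(self.logits)  # subtract the max for numerical stability
        exps = [math.exp(l - m) for l in self.logits]
        total = sum(exps)
        return [e / total for e in exps]

    def sample(self, rng):
        return rng.choices(range(len(self.logits)), weights=self.probs())[0]

    def update(self, action, reward, baseline=0.0):
        # REINFORCE: d log pi(a) / d logit_k = 1[k == a] - pi(k)
        p = self.probs()
        advantage = reward - baseline
        for k in range(len(self.logits)):
            grad = (1.0 if k == action else 0.0) - p[k]
            self.logits[k] += self.lr * advantage * grad


def task_lm(scenario, schema):
    """Stub for TaskLM: produces an output from the scenario and the schema."""
    return f"{scenario} | {schema}"


def reward_fn(output, human_score):
    """Stub for RewardFN: here the reward is just the human score of the output."""
    return human_score(output)


def train(steps=300, seed=0):
    rng = random.Random(seed)
    policy = ToyTaskSG(len(CANDIDATE_SCHEMAS))
    # Invented scorer that prefers outputs built from the "cause-effect" schema.
    human_score = lambda out: 1.0 if out.endswith("cause-effect") else 0.0
    baseline = 0.0
    for _ in range(steps):
        scenario = "a scenario"                             # Input: Scenario
        action = policy.sample(rng)                         # Schemas = TaskSG(Scenario)
        output = task_lm(scenario, CANDIDATE_SCHEMAS[action])  # Output = TaskLM(...)
        reward = reward_fn(output, human_score)             # Reward = RewardFN(...)
        policy.update(action, reward, baseline)             # update TaskSG from Reward
        baseline = 0.9 * baseline + 0.1 * reward            # moving-average baseline
    return policy
```

Calling `train()` and inspecting `policy.probs()` shows the probability mass shifting toward the schema the scorer prefers. In a real setup the sampled action would be a generated schema string and the update would act on the log-probabilities of the generated tokens.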

logisticloon commented 5 months ago

Duplicate.