Pinaka-ai / SchemaBottleneck

Build RL Policy for schema generation #6

Closed by nightlessbaron 5 months ago

nightlessbaron commented 5 months ago

Pipeline should look like this:

Input: Scenario
Schemas = TaskSG(Scenario)
Output = TaskLM(Scenario, Schemas)
Reward = RewardFN(Output, HumanScore)
Update = TaskSG(Reward)
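
The loop above could be sketched roughly as follows. This is only a minimal illustration, not the repo's implementation: it assumes TaskSG is a softmax policy over a fixed list of candidate schemas and that the update is a plain REINFORCE step with a moving-average baseline. The class layout, `task_lm`, `reward_fn`, `train`, the candidate schema names, and the human scores are all hypothetical placeholders.

```python
import math
import random


class TaskSG:
    """Toy schema generator (assumption): softmax policy over fixed candidates."""

    def __init__(self, candidates, lr=0.5, seed=0):
        self.candidates = list(candidates)
        self.logits = [0.0] * len(self.candidates)
        self.lr = lr
        self.baseline = 0.0  # moving average of rewards, reduces variance
        self.rng = random.Random(seed)

    def probs(self):
        # Numerically stable softmax over the logits.
        m = max(self.logits)
        exps = [math.exp(l - m) for l in self.logits]
        z = sum(exps)
        return [e / z for e in exps]

    def sample(self, scenario):
        # Schemas = TaskSG(Scenario); the scenario is unused in this toy policy.
        p = self.probs()
        idx = self.rng.choices(range(len(p)), weights=p, k=1)[0]
        return idx, self.candidates[idx]

    def update(self, idx, reward):
        # Update = TaskSG(Reward), here as a REINFORCE step:
        # d log pi(idx) / d logit_j = 1[j == idx] - p_j
        p = self.probs()
        advantage = reward - self.baseline
        for j in range(len(self.logits)):
            grad = (1.0 if j == idx else 0.0) - p[j]
            self.logits[j] += self.lr * advantage * grad
        self.baseline += 0.1 * (reward - self.baseline)


def task_lm(scenario, schema):
    # Placeholder for Output = TaskLM(Scenario, Schemas); a real model call goes here.
    return f"{scenario} | schema={schema}"


def reward_fn(output, human_score):
    # Placeholder for Reward = RewardFN(Output, HumanScore); passes the score through.
    return human_score


def train(policy, scenario, human_scores, steps=500):
    for _ in range(steps):
        idx, schema = policy.sample(scenario)
        output = task_lm(scenario, schema)
        reward = reward_fn(output, human_scores[schema])
        policy.update(idx, reward)
    return policy


if __name__ == "__main__":
    # Hypothetical schemas and human scores, for illustration only.
    scores = {"flat": 0.1, "nested": 0.9, "graph": 0.3}
    policy = train(TaskSG(scores.keys()), "example scenario", scores)
    for name, p in zip(policy.candidates, policy.probs()):
        print(f"{name}: {p:.3f}")
```

In this sketch the human score is looked up per schema; in practice it would presumably come from a rater judging each TaskLM output, and TaskSG would be a learned generator rather than a lookup over fixed candidates.
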

logisticloon commented 5 months ago

Duplicate.