Closed jamesbraza closed 2 days ago
This PR:
- `EVAL_PROMPT_TEMPLATE`
- `discounted_returns` logic and incorrect reward indices
- `discounted_returns` and the `TaskDataset`
- `distractors` be a `Sequence` to avoid in-place edits, further robust-ification after https://github.com/Future-House/paper-qa/pull/694

Closes https://github.com/Future-House/paper-qa/issues/693
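For readers unfamiliar with the term, here is a minimal sketch of how discounted returns are conventionally computed from a list of per-step rewards. It is not the project's `discounted_returns` implementation: the function name `compute_discounted_returns`, its signature, and the default discount are assumptions made for illustration. It takes a `Sequence` and builds a new list, so the caller's rewards are never edited in place.

```python
from collections.abc import Sequence


def compute_discounted_returns(
    rewards: Sequence[float], discount: float = 1.0
) -> list[float]:
    """Hypothetical helper: return G_t = r_t + discount * G_{t+1} for each step t."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each return reuses the return of the step after it.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + discount * running
        returns[t] = running
    return returns


if __name__ == "__main__":
    print(compute_discounted_returns([1.0, 1.0, 1.0], discount=0.5))
```

Indexing is the easy place to go wrong here: the return at index `t` must start from the reward at index `t`, not `t + 1`.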