Fix test_calc_clipped_surrogate_objective: logprobs should all be negative

callummcdougall / ARENA_2.0

Resources for skilling up in AI alignment research engineering. Covers basics of deep learning, mechanistic interpretability, and RL.

190 stars, 78 forks

Status: Closed (closed by Mihonarium 1 year ago)

Mihonarium commented 1 year ago

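The `mb_logprobs` passed to the loss function in `test_calc_clipped_surrogate_objective` should contain only nonpositive numbers: each entry is the log of a probability, and a probability is at most 1, so its log is at most 0.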
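For illustration, here is a minimal sketch of how a test could build valid inputs. It is not the repository's code: the helper names (`log_softmax`, `make_valid_logprobs`, `clipped_surrogate`), the `clip_coef` default, and the plain-Python (no PyTorch) style are all assumptions made for this example. Log-probs are drawn by applying a log-softmax to random logits, which guarantees they are nonpositive, and are then fed to a textbook PPO clipped surrogate objective.

```python
import math
import random


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def make_valid_logprobs(batch_size, num_actions, seed=0):
    """Hypothetical helper: log-probs of randomly chosen actions.

    Each value is a log-softmax output, so it is always <= 0.
    """
    rng = random.Random(seed)
    logprobs = []
    for _ in range(batch_size):
        logits = [rng.gauss(0.0, 1.0) for _ in range(num_actions)]
        action = rng.randrange(num_actions)
        logprobs.append(log_softmax(logits)[action])
    return logprobs


def clipped_surrogate(new_logprobs, old_logprobs, advantages, clip_coef=0.2):
    """Textbook PPO clipped surrogate objective (mean over the minibatch).

    Assumption: no advantage normalisation; the real function may differ.
    """
    terms = []
    for new, old, adv in zip(new_logprobs, old_logprobs, advantages):
        ratio = math.exp(new - old)  # pi_new(a|s) / pi_old(a|s)
        clipped = min(max(ratio, 1 - clip_coef), 1 + clip_coef)
        terms.append(min(ratio * adv, clipped * adv))
    return sum(terms) / len(terms)


mb_logprobs = make_valid_logprobs(batch_size=5, num_actions=4)
assert all(lp <= 0.0 for lp in mb_logprobs)
```

Sampling the test inputs this way, rather than from an unconstrained normal distribution, keeps the probability ratio `exp(new - old)` consistent with values a real policy could produce.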