Open iankur opened 1 month ago
Great catch @iankur! I've created a PR to address this change - feel free to follow along there.
Thanks @joecummings. Do you mind sharing why the eleuther_eval recipe uses evaluate instead of the simple_evaluate method from lm-eval-harness? The reason I am asking is that evaluate does not allow few-shot evaluation, whereas simple_evaluate accepts a num_fewshot argument for this.
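For anyone following along, here is a rough sketch of what a num_fewshot-style argument usually controls: how many solved examples get prepended to the prompt before the actual question. This is not how lm-eval-harness builds its context (it uses its own task-specific logic); build_fewshot_prompt and the "Q:/A:" template are hypothetical names made up for illustration.

```python
def build_fewshot_prompt(examples, query, num_fewshot=0):
    """Prepend the first `num_fewshot` solved examples to the query.

    Hypothetical helper for illustration only; not part of lm-eval-harness.
    `examples` is a list of (question, answer) pairs.
    """
    if num_fewshot < 0:
        raise ValueError("num_fewshot must be non-negative")
    # Each shot is a fully solved question/answer pair.
    shots = [f"Q: {q}\nA: {a}" for q, a in examples[:num_fewshot]]
    # The final entry is the real query, left unanswered for the model.
    return "\n\n".join(shots + [f"Q: {query}\nA:"])
```

With num_fewshot=0 the prompt contains only the query, which is roughly the zero-shot behaviour described above.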
Great question @iankur! We wanted to provide a super simple evaluation recipe for people to check their finetuned model's performance; however, there's definitely a lot more support we could provide for Eleuther. Is this a feature you'd like to see added soon? What's your use case?
Does the eleuther_eval recipe call model.eval() somewhere? I tried to find it but could not. It's there in lm-eval-harness, but we have overridden the constructor in the eval wrapper. Also, my results change when I switch the model to eval mode.
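To make the concern concrete, here is a minimal sketch of how overriding a constructor can silently drop an eval() call that the parent class makes. All class names are hypothetical stand-ins, not the actual harness or recipe classes, and StubModel only mimics the training flag of a torch.nn.Module so the snippet runs without PyTorch.

```python
class StubModel:
    """Stand-in for a torch.nn.Module: tracks only the train/eval flag."""

    def __init__(self):
        self.training = True  # modules start in training mode

    def eval(self):
        self.training = False
        return self


class BaseWrapper:
    """Hypothetical parent wrapper that switches the model to eval mode."""

    def __init__(self, model):
        self.model = model
        self.model.eval()


class OverridingWrapper(BaseWrapper):
    """Overrides __init__ without calling super().__init__()."""

    def __init__(self, model):
        # The parent's eval() call never runs, so the model stays in
        # training mode (dropout and similar layers remain active).
        self.model = model


class FixedWrapper(BaseWrapper):
    """Same override, but switches to eval mode explicitly."""

    def __init__(self, model):
        self.model = model
        self.model.eval()
```

If the real wrapper looks like OverridingWrapper, an explicit model.eval() call in the recipe or in the wrapper's constructor would be one way to get deterministic evaluation behaviour.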