pytorch / torchtune

A Native-PyTorch Library for LLM Fine-tuning
https://pytorch.org/torchtune/main/
BSD 3-Clause "New" or "Revised" License

Model eval #1047

Open iankur opened 1 month ago

iankur commented 1 month ago

Does the eleuther_eval recipe call model.eval() anywhere? I tried to find it but could not. The lm-evaluation-harness does this, but we have overridden the constructor in the eval wrapper. Also, my results change when I switch the model to eval mode.
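For illustration (this toy model is not part of torchtune), here is a minimal sketch of why the mode matters: modules such as dropout are stochastic in train mode (the default) and become deterministic no-ops after `model.eval()`, so skipping the call can silently change evaluation results.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy model with dropout; mode also affects layers like BatchNorm.
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))
x = torch.ones(1, 4)

model.train()   # default mode: dropout randomly zeroes activations
y1 = model(x)
y2 = model(x)   # typically differs from y1 between calls

model.eval()    # dropout becomes a no-op; outputs are deterministic
z1 = model(x)
z2 = model(x)

assert torch.equal(z1, z2)  # eval-mode outputs are reproducible
```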

joecummings commented 1 month ago

Great catch @iankur! I've created a PR to address this change - feel free to follow along there.

iankur commented 2 weeks ago

Thanks @joecummings. Do you mind sharing why the eleuther_eval recipe uses the evaluate method from lm-eval-harness instead of simple_evaluate? I ask because evaluate does not allow few-shot evaluation, whereas simple_evaluate accepts a num_fewshot argument for this.

joecummings commented 4 days ago

> Thanks @joecummings. Do you mind sharing why the eleuther_eval recipe uses the evaluate method from lm-eval-harness instead of simple_evaluate? I ask because evaluate does not allow few-shot evaluation, whereas simple_evaluate accepts a num_fewshot argument for this.

Great question @iankur! We wanted to provide a super simple evaluation recipe for people to check out their fine-tuned model's performance; however, there's definitely a lot more support we could provide for Eleuther. Is this a feature you'd like to see added soon? What's your use case?