Closed: dwadden closed this issue 2 months ago
My understanding is that vLLM integration of OLMo was at some point being looked at by @AkshitaB, although I'm not sure of the current status. As for the second point, feel free to add a subset flag for this, since it might be useful for debugging anyway. Just reducing the prompt set should work with the existing code as-is (I do this myself for debugging).
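A subset flag along those lines could be a small argparse addition. A sketch below; `--subset-size`, `maybe_subset`, and the stand-in `prompts` list are hypothetical names, not the eval script's actual interface:

```python
import argparse


def parse_args(argv=None):
    # --subset-size is a hypothetical flag name; adapt to the eval script's CLI.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--subset-size", type=int, default=None,
        help="If set, evaluate only the first N prompts (useful for debugging).",
    )
    return parser.parse_args(argv)


def maybe_subset(prompts, subset_size):
    """Reduce the prompt set when a subset size is given; otherwise keep it all."""
    if subset_size is not None:
        return prompts[:subset_size]
    return prompts


if __name__ == "__main__":
    args = parse_args()
    prompts = [f"prompt {i}" for i in range(805)]  # stand-in for the real eval set
    prompts = maybe_subset(prompts, args.subset_size)
    print(len(prompts))
```

Since the existing code just iterates over whatever prompt list it's handed, truncating the list up front should be the only change needed.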
@dwadden vLLM already supports OLMo in their latest version, so you should be able to use it directly. You'll first need to convert the OLMo checkpoint to HF format using the conversion script.
Also, make sure to use the latest vLLM: they fixed a bug in the tensor-parallel case in this commit, which landed after their last pip release.
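Once the checkpoint is converted to HF format, generation through vLLM is roughly the following. This is a sketch, not a tested recipe: the model path is a placeholder, and whether `trust_remote_code` is needed depends on your transformers/vLLM versions.

```python
# Placeholder path to the HF-format OLMo checkpoint produced by the
# conversion script; point this at wherever the converted weights live.
MODEL_PATH = "./olmo-7b-hf"


def generate(prompts, model_path=MODEL_PATH):
    # Import kept local so this helper can be defined on a machine
    # without vLLM/GPUs; the heavy work happens only when called.
    from vllm import LLM, SamplingParams

    llm = LLM(model=model_path, trust_remote_code=True)
    params = SamplingParams(temperature=0.0, max_tokens=512)
    outputs = llm.generate(prompts, params)
    # Each RequestOutput carries the prompt plus a list of completions;
    # take the first completion's text for each prompt.
    return [out.outputs[0].text for out in outputs]


if __name__ == "__main__":
    print(generate(["Who are you?"])[0])
```

Batching all 800 eval prompts into a single `llm.generate` call is the main speedup: vLLM handles the continuous batching internally, instead of generating one example at a time as the HF loop does.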
thanks akshita!!!
Alpaca-eval on OLMo models is very slow -- possibly just because OLMo can't use vLLM, and HuggingFace generation is slow in general? Here's an example Beaker job; based on the tqdm log, it will take 100 hours (~4 days) to evaluate 800 examples. That isn't really workable. Potential options:
@hamishivi @yizhongw any thoughts?
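For reference, a back-of-envelope check on the numbers from the log above ("seconds per example" is my framing of the tqdm rate):

```python
# Sanity-check the tqdm projection: 800 examples in ~100 hours.
EXAMPLES = 800
HOURS = 100

seconds_per_example = HOURS * 3600 / EXAMPLES
examples_per_hour = EXAMPLES / HOURS

print(seconds_per_example)  # 450.0 seconds per example
print(examples_per_hour)    # 8.0 examples per hour
```

450 seconds for one generation is far beyond what forward-pass latency alone would explain, which is why batched serving via vLLM looks like the right fix rather than tuning the HF loop.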