Evaluate: allenai/tulu-2-70b

logikon-ai / cot-eval

A framework for evaluating the effectiveness of chain-of-thought reasoning in language models.

https://huggingface.co/spaces/logikon/open_cot_leaderboard

MIT License


Evaluate: allenai/tulu-2-70b #11

Closed by ggbetz 6 months ago

ggbetz commented 6 months ago

Check:

[x] The model has not been evaluated yet and doesn't show up on the CoT Leaderboard.
[x] There is no evaluation request issue for the model in the repo.
[x] The parameters below have been adapted and shall be used.

Parameters:

NEXT_MODEL_PATH=allenai/tulu-2-70b
NEXT_MODEL_REVISION=main
NEXT_MODEL_PRECISION=bfloat16
MAX_LENGTH=2048
GPU_MEMORY_UTILIZATION=0.8
VLLM_SWAP_SPACE=64
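
For readers unfamiliar with these settings, here is a minimal sketch of how the parameters above could be parsed and mapped onto vLLM engine arguments. The helper names `parse_params` and `to_vllm_kwargs` are hypothetical and not part of cot-eval, and the mapping (for example `MAX_LENGTH` to `max_model_len`) is an assumption about how the pipeline consumes them, not a description of its actual code.

```python
# Hypothetical sketch: not taken from the cot-eval code base.
# It parses the KEY=VALUE parameters from this issue and maps them
# onto keyword arguments accepted by vLLM's `LLM` constructor.

PARAMS_TEXT = """
NEXT_MODEL_PATH=allenai/tulu-2-70b
NEXT_MODEL_REVISION=main
NEXT_MODEL_PRECISION=bfloat16
MAX_LENGTH=2048
GPU_MEMORY_UTILIZATION=0.8
VLLM_SWAP_SPACE=64
"""


def parse_params(text):
    """Parse KEY=VALUE lines into a dict, skipping blank and comment lines."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        params[key.strip()] = value.strip()
    return params


def to_vllm_kwargs(params):
    """Assumed mapping from the issue's parameters to vLLM arguments."""
    return {
        "model": params["NEXT_MODEL_PATH"],
        "revision": params["NEXT_MODEL_REVISION"],
        "dtype": params["NEXT_MODEL_PRECISION"],
        # Assumption: MAX_LENGTH bounds the model context length.
        "max_model_len": int(params["MAX_LENGTH"]),
        # Fraction of each GPU's memory that vLLM may reserve.
        "gpu_memory_utilization": float(params["GPU_MEMORY_UTILIZATION"]),
        # CPU swap space per GPU, in GiB.
        "swap_space": int(params["VLLM_SWAP_SPACE"]),
    }


params = parse_params(PARAMS_TEXT)
vllm_kwargs = to_vllm_kwargs(params)
print(vllm_kwargs)
```

With vLLM installed and sufficient GPU memory, the resulting dict could be passed as `LLM(**vllm_kwargs)`. A 70B model would typically also need `tensor_parallel_size` set to span several GPUs, which this issue does not specify.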

ggbetz commented 6 months ago

Completed and showing up on leaderboard.