Evaluate: Qwen/Qwen1.5-MoE-XX

logikon-ai / cot-eval

A framework for evaluating the effectiveness of chain-of-thought reasoning in language models.

MIT License

For XX in [A2.7B-Chat, A2.7B]:

Check upon issue creation:

[x] The model has not been evaluated yet and doesn't show up on the CoT Leaderboard.
[x] There is no evaluation request issue for the model in the repo.
[x] The parameters below have been adapted and shall be used.

Parameters:

NEXT_MODEL_PATH=Qwen/Qwen1.5-MoE-{XX}
NEXT_MODEL_REVISION=main
NEXT_MODEL_PRECISION=bfloat16
MAX_LENGTH=2048 
GPU_MEMORY_UTILIZATION=0.7
VLLM_SWAP_SPACE=8
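
Since the request covers two variants, the template above has to be expanded once per value of XX. A minimal sketch of that expansion follows; `expand_params` and `to_env_lines` are hypothetical helper names used only for illustration and are not part of cot-eval, and how the pipeline actually consumes these variables is not shown here.

```python
# Illustrative only: expands the XX placeholder from this issue into one
# parameter set per model variant. The helper names are assumptions and
# do not come from the cot-eval code base.
VARIANTS = ["A2.7B-Chat", "A2.7B"]

# Parameters shared by both variants, copied from the issue as strings.
BASE_PARAMS = {
    "NEXT_MODEL_REVISION": "main",
    "NEXT_MODEL_PRECISION": "bfloat16",
    "MAX_LENGTH": "2048",
    "GPU_MEMORY_UTILIZATION": "0.7",
    "VLLM_SWAP_SPACE": "8",
}


def expand_params(variants, base):
    """Return one parameter dict per variant, with NEXT_MODEL_PATH filled in."""
    runs = []
    for variant in variants:
        params = {"NEXT_MODEL_PATH": f"Qwen/Qwen1.5-MoE-{variant}"}
        params.update(base)
        runs.append(params)
    return runs


def to_env_lines(params):
    """Render a parameter dict as KEY=VALUE lines, in insertion order."""
    return [f"{key}={value}" for key, value in params.items()]


if __name__ == "__main__":
    for params in expand_params(VARIANTS, BASE_PARAMS):
        print("\n".join(to_env_lines(params)))
        print()
```

Each printed group corresponds to one evaluation run, so the two variants can be queued separately with otherwise identical settings.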

ToDos:

[ ] Run cot-eval pipeline
[ ] Merge pull requests for cot-eval results datasets (> @ggbetz)
[ ] Create eval request record to update metadata on leaderboard (> @ggbetz)

Evaluate: Qwen/Qwen1.5-MoE-XX #45