Closed by the-crypt-keeper 3 months ago
vLLM issue: https://github.com/vllm-project/vllm/issues/6689
Gathered some early results, which only confirmed my fears: there are likely bugs. 8B q6k did very poorly, and 70B nf4 also looks suspect. Note that the 70B NF4 did not fit into either 2x24GB or 40GB, only an 80GB.
https://github.com/ggerganov/llama.cpp/commit/b5e95468b1676e1e5c9d80d1eeeb26f542a38f42
GGUF metadata has been extended to support precalculated RoPEs. New GGUFs need to get made.
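For context on why old GGUFs go stale: the new RoPE parameters land as extra key/value pairs in the GGUF header, so files written before the change simply lack those keys. Below is a minimal sketch of how such a key/value pair is laid out on disk per the GGUF spec, using only the standard library. The key name `llama.rope.freq_base` is just an illustrative RoPE-related key; the exact keys added by that commit may differ.

```python
import struct

# GGUF value-type code for FLOAT32, per the GGUF spec.
GGUF_TYPE_FLOAT32 = 6

def build_minimal_gguf(kv_key: str, kv_value: float) -> bytes:
    """Build a toy GGUF blob: header plus one float32 metadata KV pair.

    Layout (all little-endian): magic 'GGUF', uint32 version (3),
    uint64 tensor count, uint64 metadata KV count, then the KV pairs.
    """
    buf = struct.pack('<4sIQQ', b'GGUF', 3, 0, 1)  # 0 tensors, 1 KV pair
    key = kv_key.encode('utf-8')
    buf += struct.pack('<Q', len(key)) + key        # length-prefixed key
    buf += struct.pack('<I', GGUF_TYPE_FLOAT32)     # value type tag
    buf += struct.pack('<f', kv_value)              # the value itself
    return buf

def read_first_kv(blob: bytes) -> tuple[str, int, float]:
    """Parse the header and return (key, value_type, value) of the first KV."""
    magic, version, n_tensors, n_kv = struct.unpack_from('<4sIQQ', blob, 0)
    assert magic == b'GGUF' and n_kv >= 1
    off = struct.calcsize('<4sIQQ')
    (klen,) = struct.unpack_from('<Q', blob, off); off += 8
    key = blob[off:off + klen].decode('utf-8');    off += klen
    (vtype,) = struct.unpack_from('<I', blob, off); off += 4
    (value,) = struct.unpack_from('<f', blob, off)
    return key, vtype, value
```

A loader that looks up a RoPE key and finds it missing has to fall back to defaults, which is why GGUFs quantized before the metadata extension need to be regenerated rather than patched in flight.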
8B works with llama.cpp 705b7ecf and kobold.cpp e47477fd4d; the 70B still looks suspicious.
Going to give this a week to settle; there are always bugs when quants first land.