the-crypt-keeper / can-ai-code

Self-evaluating interview for AI coders
https://huggingface.co/spaces/mike-ravkine/can-ai-code-results
MIT License

Evaluate mistralai/Codestral-22B-v0.1 (FIM and quants) #202

Open · the-crypt-keeper opened this issue 5 months ago

the-crypt-keeper commented 5 months ago

Despite being hosted on HF, this model has no config.json and doesn't support inference with the transformers library (or any other library, it seems), only Mistral's own custom mistral-inference runtime.
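
For anyone hitting the same wall, here's a minimal sketch of chat inference through mistral-inference, adapted from the model card. The local path and tokenizer filename are assumptions, and on newer mistral-inference versions the Transformer import moved to mistral_inference.transformer:

    # Minimal chat-inference sketch via mistral-inference (pip install mistral-inference),
    # adapted from the Codestral model card. Path and tokenizer filename are assumptions.
    import os

    from mistral_inference.model import Transformer  # mistral_inference.transformer on newer versions
    from mistral_inference.generate import generate
    from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
    from mistral_common.protocol.instruct.messages import UserMessage
    from mistral_common.protocol.instruct.request import ChatCompletionRequest

    model_path = os.path.expanduser("~/models/codestral-22B-v0.1")  # local directory, not an HF repo id
    tokenizer = MistralTokenizer.from_file(f"{model_path}/tokenizer.model.v3")
    model = Transformer.from_folder(model_path)

    request = ChatCompletionRequest(messages=[UserMessage(content="Write a JavaScript function that reverses a string.")])
    tokens = tokenizer.encode_chat_completion(request).tokens

    # greedy decode, matching the greedy params used for the eval
    out_tokens, _ = generate([tokens], model, max_tokens=512, temperature=0.0,
                             eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
    print(tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0]))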

the-crypt-keeper commented 5 months ago

Completed the initial instruction eval at FP16; this is an excellent model, especially at JavaScript. It used about 45GB of VRAM for inference during my test runs, so it should work on 2x24GB setups.

This model also supports FIM, so I'll keep this issue open for that as well as for any quants as they pop up.
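
For the FIM side, a minimal sketch adapted from the model card's mistral-inference FIM example; the prefix/suffix strings and local path are illustrative only:

    # Minimal FIM sketch, adapted from the Codestral model card's mistral-inference example.
    import os

    from mistral_inference.model import Transformer
    from mistral_inference.generate import generate
    from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
    from mistral_common.tokens.instruct.request import FIMRequest

    model_path = os.path.expanduser("~/models/codestral-22B-v0.1")  # assumed local layout
    tokenizer = MistralTokenizer.v3()
    model = Transformer.from_folder(model_path)

    # fill-in-the-middle: the model generates the code between prefix and suffix
    prefix = "def add(a, b):\n"
    suffix = "    return result\n"
    tokens = tokenizer.encode_fim(FIMRequest(prompt=prefix, suffix=suffix)).tokens

    out_tokens, _ = generate([tokens], model, max_tokens=256, temperature=0.0,
                             eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
    result = tokenizer.decode(out_tokens[0])
    middle = result.split(suffix)[0].strip()  # keep only the generated middle
    print(middle)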

The latest interview_cuda supports torchrun and the mistral-inference runtime in an MVP capacity:

torchrun --nproc-per-node 4 ./interview_cuda.py --runtime mistral --model_name ~/models/codestral-22B-v0.1 --params params/greedy-hf.json --input results/prepare_senior_python-javascript_chat-simple.ndjson,results/prepare_junior-v2_python-javascript_chat-simple.ndjson

Adjust 4 to the number of GPUs; --model_name in this case is a local directory path, not an HF repo id. An example variant follows.
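
For example, the same invocation on a 2x24GB setup would presumably look like:

torchrun --nproc-per-node 2 ./interview_cuda.py --runtime mistral --model_name ~/models/codestral-22B-v0.1 --params params/greedy-hf.json --input results/prepare_senior_python-javascript_chat-simple.ndjson,results/prepare_junior-v2_python-javascript_chat-simple.ndjson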