Open the-crypt-keeper opened 5 months ago
Completed the initial instruction eval at FP16; this model is excellent, especially at JavaScript. Inference used about 45GB of VRAM during my test runs, so it should work on 2x24GB setups.
This model also supports FIM, so I will keep this issue open for that, as well as for any quants as they pop up.
Latest interview_cuda supports the torchrun and mistral-inference runtimes in an MVP capacity:

```
torchrun --nproc-per-node 4 ./interview_cuda.py --runtime mistral --model_name ~/models/codestral-22B-v0.1 --params params/greedy-hf.json --input results/prepare_senior_python-javascript_chat-simple.ndjson,results/prepare_junior-v2_python-javascript_chat-simple.ndjson
```
Adjust `4` to match the number of GPUs. Note that `--model_name` in this case is a local directory path, not an HF repo ID.
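One way to avoid hard-coding the process count is to derive it from the visible GPUs; a minimal sketch, assuming `nvidia-smi` is on PATH (the `NGPU` variable is my own, not part of interview_cuda):

```shell
# Derive the torchrun process count from the visible GPU count instead of
# hard-coding 4. Set CUDA_VISIBLE_DEVICES first to restrict to a subset.
NGPU=$(nvidia-smi -L | wc -l)
torchrun --nproc-per-node "$NGPU" ./interview_cuda.py --runtime mistral --model_name ~/models/codestral-22B-v0.1 --params params/greedy-hf.json --input results/prepare_senior_python-javascript_chat-simple.ndjson,results/prepare_junior-v2_python-javascript_chat-simple.ndjson
```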
Despite being hosted on HF, this model ships without a config.json, so it does not support inference with the transformers library (or, it seems, any other library); only Mistral's own custom mistral-inference runtime works.
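Since the missing config.json is what breaks transformers loading, a runtime selector can probe for it and fall back accordingly; a minimal sketch (the `pick_runtime` helper is hypothetical, not part of interview_cuda):

```python
import os
import tempfile

def pick_runtime(model_dir: str) -> str:
    """Choose an inference runtime for a local checkpoint directory.

    transformers requires a config.json at the model root; checkpoints
    that ship without one (like this model) need the mistral runtime.
    """
    if os.path.exists(os.path.join(model_dir, "config.json")):
        return "transformers"
    return "mistral"

# Demo with throwaway directories (illustrative layout, not the real checkpoint):
with tempfile.TemporaryDirectory() as d:
    print(pick_runtime(d))                           # no config.json -> mistral
    open(os.path.join(d, "config.json"), "w").close()
    print(pick_runtime(d))                           # config.json -> transformers
```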