Here are the steps I'm running to produce the model eval.
```
$ source venv/bin/activate
$ pip3 install -r requirements_dev.txt
```
Next, update `models.yaml` with the new model:

```yaml
- model_id: mistral-nemo
  domain: ollama
  description: A state-of-the-art 12B model with 128k context length, built by Mistral AI in collaboration with NVIDIA.
  urls:
    - https://mistral.ai/news/mistral-nemo/
    - https://ollama.com/library/mistral-nemo
  config_entry_data:
    url: !secret ollama_url
    model: mistral-nemo:12b
  config_entry_options:
    llm_hass_api: assist
    num_ctx: 8192  # Note: Model has 128k context length
```
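As a quick sanity check that the new entry parses, here is a minimal sketch, assuming `models.yaml` is a top-level list of entries (if the file nests them under a key, adjust accordingly). Plain `safe_load` rejects the `!secret` tag, so a pass-through constructor is registered first:

```python
import yaml

# models.yaml uses Home Assistant style !secret tags; plain safe_load
# rejects unknown tags, so register a pass-through constructor first.
yaml.SafeLoader.add_constructor("!secret", lambda loader, node: node.value)

with open("models.yaml") as f:
    models = yaml.safe_load(f)

# Assumes the file is a top-level list of model entries.
entry = next(m for m in models if m["model_id"] == "mistral-nemo")
print(entry["config_entry_options"])  # expect num_ctx: 8192
```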
Pull the model so it is available to Ollama:

```
$ ollama pull mistral-nemo
```
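To confirm the pull succeeded before kicking off the eval, you can query Ollama's `/api/tags` endpoint. A sketch assuming the default local endpoint; substitute whatever your `ollama_url` secret points at:

```python
import requests

# Default local Ollama endpoint; adjust if your ollama_url secret differs.
resp = requests.get("http://localhost:11434/api/tags", timeout=10)
resp.raise_for_status()

# /api/tags lists locally pulled models, e.g. "mistral-nemo:12b".
names = [m["name"] for m in resp.json().get("models", [])]
print("mistral-nemo present:", any(n.startswith("mistral-nemo") for n in names))
```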
```
$ export PYTHONPATH="${PYTHONPATH}:${PWD}"
$ DATASET="datasets/assist-mini/"
$ pip3 freeze | grep homeassistant  # check current Home Assistant version
$ MODEL_OUTPUT_DIR="reports/assist-mini/2024.9.2"
$ MODEL=mistral-nemo
$ home-assistant-datasets assist collect --model_output_dir=${MODEL_OUTPUT_DIR} --dataset=${DATASET} --models=${MODEL}
$ home-assistant-datasets assist eval --model_output_dir=${MODEL_OUTPUT_DIR} --output_type=csv > ${MODEL_OUTPUT_DIR}/report.csv
```
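The eval step writes a CSV report. A small sketch for eyeballing the output before rebuilding the leaderboard; since I don't want to assume the exact column schema, it just inspects the header and row count:

```python
import csv

# Path matches the MODEL_OUTPUT_DIR used above; the column layout is
# whatever the eval tool emits, so inspect it rather than assume a schema.
with open("reports/assist-mini/2024.9.2/report.csv") as f:
    reader = csv.DictReader(f)
    rows = list(reader)

print(f"{len(rows)} rows; columns: {reader.fieldnames}")
```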
Finally, rebuild the leaderboard:

```
$ home-assistant-datasets leaderboard prebuild
$ home-assistant-datasets leaderboard build
```
Leaderboard updated: mistral-nemo scores 81% on assist-mini.
This new, relatively small `mistral-nemo` local model is shockingly capable at Home Assistant tasks. If HA uses these tests in the future to report on the current state of LLM conversation agents, I think it would be interesting to have this model in the lineup too!