Add evaluation qwen2.5 LLM

allenporter / home-assistant-datasets

This package is a collection of datasets for evaluating AI Models in the context of Home Assistant.

https://allenporter.github.io/home-assistant-datasets


Add evaluation qwen2.5 LLM #43

Closed: xiasi0 closed 2 months ago

xiasi0 commented 2 months ago

It performed very well in a Chinese-language environment (my testing was not rigorous), and I hope you can add a complete evaluation for the English environment. Thank you.

tannisroot commented 2 months ago

I've run my own evaluation here: https://github.com/tannisroot/home-assistant-datasets/blob/qwen2.5/reports/README.md

xiasi0 commented 2 months ago

@tannisroot I was really surprised to see your evaluation results. I didn't expect it to perform so well in an English environment, just like Llama3.1, whose Chinese support is not user-friendly and is almost unusable. This saved me a lot of OpenAI tokens. Thank you!