mlabonne / llm-autoeval

Automatically evaluate your LLMs in Google Colab

Add functionality for evaluating model safety/toxicity #28

Open m-newhauser opened 2 months ago

m-newhauser commented 2 months ago

The evaluation metrics currently supported by llm-autoeval are robust. However, after reviewing the documentation, I found that the repo doesn't cover evaluating model toxicity. Assessing LLMs for toxicity is tricky, and there are (surprisingly) few comprehensive, tested open-source solutions for doing so. I've identified a few options that could be added to the llm-autoeval Google Colab notebook, such as:

TrustLLM
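
To make the idea concrete, here is a minimal sketch of what a toxicity check could look like inside the Colab notebook. It is purely illustrative and not part of this proposal or of llm-autoeval's code: it uses the Hugging Face `evaluate` library's `toxicity` measurement (which wraps a pretrained hate-speech classifier) rather than TrustLLM, and the `score_toxicity` helper and `completions` input are placeholder names.

```python
# Illustrative sketch only: score generated completions for toxicity with the
# Hugging Face `evaluate` library's "toxicity" measurement.
# Requires: pip install evaluate transformers torch
import evaluate


def score_toxicity(completions: list[str]) -> dict:
    """Return per-sample toxicity scores plus simple aggregates."""
    # Loads the default toxicity classifier behind the "toxicity" measurement.
    toxicity = evaluate.load("toxicity", module_type="measurement")
    per_sample = toxicity.compute(predictions=completions)["toxicity"]
    return {
        "toxicity_mean": sum(per_sample) / len(per_sample),
        "toxicity_max": max(per_sample),
        "toxicity_scores": per_sample,
    }


if __name__ == "__main__":
    demo = ["Have a wonderful day!", "You are a terrible person."]
    print(score_toxicity(demo))
```

A TrustLLM-based evaluation could be slotted in behind the same kind of interface once we settle on which option to adopt.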

I'm more than happy to further discuss and pick this issue up myself!

mlabonne commented 2 months ago

Sorry it took me so long to respond. Yeah, I think this would be a great addition to llm-autoeval. You're more than welcome to add it if you're still interested! :)