Text Quality
readability score
complexity and grade scores
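As a rough illustration of what readability and grade scores measure, here is a minimal sketch of the standard Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. The syllable counter is a naive vowel-group heuristic and the function names are made up for this example; this is not LangKit's implementation.

```python
import re


def count_syllables(word: str) -> int:
    """Naive heuristic: one syllable per run of consecutive vowels, minimum one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def _text_stats(text: str):
    """Return (words per sentence, syllables per word), or None for empty text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return None
    syllables = sum(count_syllables(w) for w in words)
    return len(words) / len(sentences), syllables / len(words)


def flesch_reading_ease(text: str) -> float:
    """Higher means easier to read."""
    stats = _text_stats(text)
    if stats is None:
        return 0.0
    words_per_sentence, syllables_per_word = stats
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


def flesch_kincaid_grade(text: str) -> float:
    """Approximate US school grade level; higher means harder to read."""
    stats = _text_stats(text)
    if stats is None:
        return 0.0
    words_per_sentence, syllables_per_word = stats
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
```

Short sentences built from short words score as easier (higher reading ease, lower grade) than long sentences built from polysyllabic words.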
Text Relevance
Similarity scores between prompts and responses
Similarity scores against user-defined themes
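To make the idea of a similarity score concrete, the sketch below computes cosine similarity over simple bag-of-words counts, both between a prompt and a response and against user-defined themes. Bag-of-words vectors are a stand-in chosen to keep the example dependency-free; real tools usually compare sentence embeddings instead. All function names and the theme dictionary layout are assumptions made for this illustration.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity of two texts' word-count vectors, in [0, 1]."""
    va, vb = bag_of_words(a), bag_of_words(b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def theme_similarity(text: str, themes: Dict[str, List[str]]) -> Dict[str, float]:
    """For each theme, report the highest similarity to any of its example sentences."""
    return {
        name: max((cosine_similarity(text, ex) for ex in examples), default=0.0)
        for name, examples in themes.items()
    }
```

A response that shares no vocabulary with the prompt scores 0.0 here, which is exactly the weakness that embedding-based similarity avoids: paraphrases with different wording would still score as related.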
Security and Privacy
patterns - count of strings matching a user-defined regex pattern group
jailbreaks - similarity scores with respect to known jailbreak attempts
prompt injection - similarity scores with respect to known prompt injection attacks
hallucinations - consistency check between responses
refusals - similarity scores with respect to known LLM refusal-of-service responses
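The patterns metric above can be sketched as a count of regex matches per named group. The group names and regular expressions below are hypothetical examples, not the pattern set shipped with any library, and the expressions are deliberately loose.

```python
import re
from typing import Dict, List

# Hypothetical pattern groups: each group name maps to one or more regexes.
PATTERN_GROUPS: Dict[str, List[str]] = {
    "email": [r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"],
    "us_phone": [r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"],
}


def count_pattern_matches(text: str, groups: Dict[str, List[str]]) -> Dict[str, int]:
    """Count non-overlapping matches in `text` for every pattern in each group."""
    return {
        name: sum(len(re.findall(pattern, text)) for pattern in patterns)
        for name, patterns in groups.items()
    }
```

Reporting a count per group, rather than the matched strings themselves, lets a monitoring system track how often sensitive-looking data appears without storing that data.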
Sentiment and Toxicity
sentiment analysis
toxicity analysis
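As a toy illustration of what a sentiment score is, the sketch below scores text against two tiny hand-written word lists. The word lists and the function name are invented for this example; practical sentiment and toxicity analysis relies on trained models or far larger lexicons, so treat this only as a picture of the output shape (a number from -1 to 1).

```python
import re

# Tiny illustrative lexicons; real lexicons contain thousands of weighted entries.
POSITIVE = {"good", "great", "helpful", "love", "excellent", "thanks"}
NEGATIVE = {"bad", "terrible", "useless", "hate", "awful", "wrong"}


def sentiment_score(text: str) -> float:
    """Return (positive - negative) / (positive + negative), or 0.0 if no lexicon word occurs."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    total = pos + neg
    return (pos - neg) / total if total else 0.0
```

A word-list approach ignores negation and context ("not good" still counts as positive), which is one reason model-based scorers are preferred in practice.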
https://github.com/whylabs/langkit
https://github.com/whylabs/whylogs