This PR adds new tooling for evaluating a guardrail configuration. NOTE: documentation is minimal and still a work in progress.

Below is a quick overview of the `nemoguardrails eval` CLI.
## Run Evaluations

To run a new evaluation with a guardrail configuration:

```
nemoguardrails eval run -g <GUARDRAIL_CONFIG_PATH> -o <OUTPUT_PATH>
```
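For example, assuming a guardrail configuration directory at `./config` and writing results to `./eval_output` (both paths are illustrative):

```
nemoguardrails eval run -g ./config -o ./eval_output
```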
## Check Compliance

To check compliance with the policies, you can use the LLM-as-a-judge method:

```
nemoguardrails eval check-compliance --llm-judge=<LLM_MODEL_NAME> -o <OUTPUT_PATH>
```
You can use any LLM supported by NeMo Guardrails by defining `llm-judge` models in the configuration, e.g.:

```yaml
models:
  - type: llm-judge
    engine: openai
    model: gpt-4
  - type: llm-judge
    engine: nvidia_ai_endpoints
    model: meta/llama3-70b-instruct
```
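With a configuration like the above, the judge can then be referenced on the command line. A sketch, assuming `--llm-judge` takes the model name from the `models` section and that `./eval_output` is the output path from a previous run:

```
nemoguardrails eval check-compliance --llm-judge=gpt-4 -o ./eval_output
```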
## Review and Analyze

To review and analyze the results, launch the NeMo Guardrails Eval UI:

```
nemoguardrails eval ui
```