guardrails-ai / guardrails

Adding guardrails to large language models.
https://www.guardrailsai.com/docs
Apache License 2.0

`RegexMatch` not reporting failures [bug] #709

Open jsoma opened 3 months ago

jsoma commented 3 months ago

Describe the bug

When a validation fails with RegexMatch it is listed as successfully validating.

To Reproduce

import openai
from pydantic import BaseModel, Field
from guardrails.hub import RegexMatch
from guardrails import Guard

prompt = """
    Generate a fake user.

    ${gr.complete_json_suffix_v2}
"""

class User(BaseModel):
    name: str = Field(description="Name that is NOT potato", validators=[
        RegexMatch(regex='potato', match_type='search', on_fail='reask')
    ])

guard = Guard.from_pydantic(output_class=User, prompt=prompt)

result = guard(
    llm_api=openai.chat.completions.create,
    num_reasks=0
)

The LLM gives us "John Doe" as the name, which fails the potato check. Despite this, it's listed as passing validation, and a "fixed" value is provided.

ValidationOutcome(
    raw_llm_output='{"name":"John Doe"}', 
    validated_output={'name': 'uhdazcppotatouhdazcp'}, 
    reask=None, 
    validation_passed=True,
    error=None
)

Expected behavior

If you replace the RegexMatch validator with ValidChoices(choices=['potato'], on_fail='reask'), you get the following (Alice fails the potato check).

ValidationOutcome(
    raw_llm_output='{"name":"Alice"}',
    validated_output=None,
    reask=None,
    validation_passed=False,
    error=None
)

Library version:

0.4.3

zsimjee commented 3 months ago

Hi! Can you please try with v0.4.2 and see if that yields the same issue? I did mess with related logic in 0.4.3, which came out on Tuesday.

zsimjee commented 3 months ago

I believe this is applying fix values even when the on fail action is reask

jsoma commented 3 months ago

Nope, no luck even back to 0.4.0 (although my env is a little wonky and guardrails doesn't have __version__, so there's like a 3% chance it's just my setup not downgrading successfully)

CalebCourier commented 3 months ago

@zsimjee and @jsoma Wanted to add some context here:

This is the expected behaviour and has been since 0.3.x, likely even before. After all validation and reasks are complete, if any reasks remain where the ValidationResult has a fix_value, that fix_value is applied. The key difference between RegexMatch and ValidChoices is that ValidChoices does not include a fix_value in its ValidationResult, hence a failure instead of a substituted result.
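To make the rule above concrete, here is a minimal sketch of it in plain Python. It does not use guardrails itself: `Result`, `regex_like`, `choices_like`, and `finalize` are hypothetical stand-ins invented for illustration, and the real library's internals may well differ. It only models the described outcome: once no reasks remain, a failure that carries a fix_value is substituted and counted as passing, while a failure without one is surfaced.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


# Hypothetical stand-in for illustration only; not the guardrails ValidationResult.
@dataclass
class Result:
    passed: bool
    fix_value: Optional[Any] = None  # set only by validators that can propose a fix


def regex_like(value: str) -> Result:
    # Mimics a validator that attaches a fix_value when it fails.
    if "potato" in value:
        return Result(passed=True)
    return Result(passed=False, fix_value="potato")


def choices_like(value: str) -> Result:
    # Mimics a validator that fails without proposing a fix.
    return Result(passed=(value == "potato"))


def finalize(value: str, validator: Callable[[str], Result]) -> Dict[str, Any]:
    """Resolve a field once no reasks remain (num_reasks=0 in the report)."""
    result = validator(value)
    if result.passed:
        return {"validated_output": value, "validation_passed": True}
    if result.fix_value is not None:
        # A leftover failure with a fix_value is substituted and counted as passing.
        return {"validated_output": result.fix_value, "validation_passed": True}
    # No fix is available, so the failure is surfaced.
    return {"validated_output": None, "validation_passed": False}


print(finalize("John Doe", regex_like))
print(finalize("Alice", choices_like))
```

Under these assumptions, the first call mirrors the RegexMatch report (a substituted value with validation_passed=True) and the second mirrors the ValidChoices report (no output with validation_passed=False).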

I agree that this behaviour may not be intuitive and should be examined for revision, but the library is acting as it is intended to.

Edit: typo

jsoma commented 3 months ago

Ah, got it. Guess it gives me a good opportunity to get a little more comfortable with custom validators.

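import re
from typing import Callable, Dict, Optional

# Import path as of guardrails 0.4.x; adjust if your version exposes these elsewhere.
from guardrails.validators import FailResult, PassResult, ValidationResult, Validator, register_validator


@register_validator(name="regex-validator", data_type="string")
class RegexValidator(Validator):
    def __init__(self, regex: str, on_fail: Optional[Callable] = None):
        super().__init__(on_fail=on_fail, regex=regex)
        self._regex = regex

    def validate(self, value: str, metadata: Dict) -> ValidationResult:
        regex = re.compile(self._regex)
        # The whole value must match. No fix_value is set, so a failure stays a failure.
        if not regex.fullmatch(value):
            return FailResult(
                error_message=f"Result must match regular expression /{self._regex}/",
            )
        return PassResult()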

Thanks for the update!
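One detail worth noting for anyone copying the validator above: it uses `fullmatch`, whereas the original report used `match_type='search'`, and the two are not equivalent. The sketch below uses only the standard `re` module to show the difference; the helper names `search_ok` and `fullmatch_ok` are made up for illustration.

```python
import re

pattern = re.compile("potato")


def search_ok(value: str) -> bool:
    # True if the pattern occurs anywhere in the value.
    return pattern.search(value) is not None


def fullmatch_ok(value: str) -> bool:
    # True only if the entire value matches the pattern.
    return pattern.fullmatch(value) is not None


for candidate in ["potato", "sweet potato", "John Doe"]:
    print(candidate, search_ok(candidate), fullmatch_ok(candidate))
```

If the intent is "the name contains potato", swapping `fullmatch` for `search` in the custom validator would be the closer match to the original RegexMatch configuration.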