guardrails-ai / guardrails

Adding guardrails to large language models.
https://www.guardrailsai.com/docs
Apache License 2.0
3.59k stars 258 forks

[feat] Support offline ML based Validators #800

Open wylansford opened 3 weeks ago

wylansford commented 3 weeks ago

Description
When a Guard is instantiated with validators that use Hugging Face pipelines, such as Toxic Language, the transformers pipeline makes requests to the Hugging Face Hub to check for a model update. This causes failures on systems with no network access. Even with caching (pre-downloading the model), the requests are still made.

from transformers import pipeline
model_name = "unitary/unbiased-toxic-roberta"
detoxify_pipeline = pipeline(
    "text-classification",
    model=model_name,
    function_to_apply="sigmoid",
    top_k=None,
    padding="max_length",
    truncation=True
)
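For reference, the cache that the offline host would rely on can be pre-populated on a connected machine. A minimal sketch, assuming huggingface_hub is installed (its snapshot_download helper pulls every file of a repo into the local Hugging Face cache); the helper name prefetch_model is hypothetical:

```python
def prefetch_model(repo_id: str = "unitary/unbiased-toxic-roberta") -> str:
    # Run once while online: downloads all files in the repo into the
    # local Hugging Face cache and returns the cache directory path.
    # Import is deferred so this module loads even without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id)
```

The offline machine then only needs the cache directory copied over (or shared), not network access.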

Why is this needed
Causes failures on systems with no network access.

Implementation details
Not sure at the moment. We could just catch these errors, but there should be a cleaner way.

End result
Any validator that uses a local model should be able to run offline.

wylansford commented 3 weeks ago

Given the docs, https://huggingface.co/docs/transformers/main/en/installation#offline-mode

it seems a potential fix is setting the environment variable HF_HUB_OFFLINE=1

or

passing local_files_only=True in the model init. Given that this would affect a large number of validators, I am hoping the first option works well enough so we don't need to add and support this argument moving forward.
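The environment-variable approach can be sketched as follows, assuming HF_HUB_OFFLINE is honored by the installed transformers/huggingface_hub versions and the model is already in the local cache from a prior online run:

```python
import os

# Set the variable BEFORE transformers is imported, so the hub client
# skips its update-check requests and resolves the model from the local
# cache alone.
os.environ["HF_HUB_OFFLINE"] = "1"


def build_detoxify_pipeline():
    # Deferred import: importing transformers first can let the hub
    # client capture the old (online) setting.
    from transformers import pipeline

    return pipeline(
        "text-classification",
        model="unitary/unbiased-toxic-roberta",
        function_to_apply="sigmoid",
        top_k=None,
        padding="max_length",
        truncation=True,
    )
```

The appeal of this route is that no per-validator code changes are needed; each validator's pipeline call stays exactly as it is today.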

Will see if iDeveloper on Discord can reproduce this behavior before updating the issue.