
NVIDIA / NeMo-Guardrails

NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.


Feat/content safety #674

Closed Pouyanpi closed 3 months ago

Pouyanpi commented 3 months ago

This PR introduces the content safety module, which lets users run different safety models through the "content safety check input" and "content safety check output" flows.

Key enhancements include:

- Parameter support in input/output rails flow definitions.
- The addition of the is_content_safe output parser, which is shared between the content_safety and self_check modules. This new implementation maintains backward compatibility.
- The introduction of a max_tokens field in TaskPrompt.
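
The first two enhancements can be illustrated with a minimal, standalone Python sketch. It is not the code from this PR: the helper names split_flow_parameters and parse_safety_verdict are hypothetical, the "$model=..." parameter syntax is assumed, and the model is assumed to reply with "safe" or "unsafe" on the first line, followed optionally by comma-separated category codes.

```python
from typing import Dict, List, Tuple


def split_flow_parameters(flow_id: str) -> Tuple[str, Dict[str, str]]:
    """Split 'flow name $key=value ...' into the flow name and its parameters.

    Hypothetical helper; the '$key=value' syntax is an assumption.
    """
    name_parts: List[str] = []
    params: Dict[str, str] = {}
    for token in flow_id.split():
        if token.startswith("$") and "=" in token:
            key, value = token[1:].split("=", 1)
            params[key] = value
        else:
            name_parts.append(token)
    return " ".join(name_parts), params


def parse_safety_verdict(response: str) -> Tuple[bool, List[str]]:
    """Turn a raw model reply into (is_safe, violated_categories).

    Hypothetical stand-in for an output parser such as is_content_safe.
    Assumed reply format: 'safe' or 'unsafe' on the first non-empty line,
    then optional comma-separated category codes on the following lines.
    """
    lines = [line.strip() for line in response.splitlines() if line.strip()]
    if not lines:
        # Fail closed: an empty reply is treated as unsafe.
        return False, []
    if lines[0].lower() == "safe":
        return True, []
    categories = [
        code.strip()
        for line in lines[1:]
        for code in line.split(",")
        if code.strip()
    ]
    return False, categories


if __name__ == "__main__":
    print(split_flow_parameters("content safety check input $model=shieldgemma"))
    print(parse_safety_verdict("unsafe\nS1, S10"))
```

Treating an empty or unrecognized reply as unsafe is a design choice of this sketch, not something the PR states. A short max_tokens value on the prompt would fit this kind of parser, since only the verdict and a few category codes need to be generated.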

TODO and Remarks:

- [x] Make the shieldgemma example work; shieldgemma is currently not behaving correctly.
- [ ] _MAX_TOKENS or default value of max_tokens in TaskPrompt
- [ ] Check functionality within Colang 2
- [x] Documentation

Pouyanpi commented 3 months ago

Also resolves #552.