[Feature] Add Cleanlab's Trustworthiness Score

NVIDIA / NeMo-Guardrails

NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.

[Feature] Add Cleanlab's Trustworthiness Score #572

Open AshishSardana opened 1 week ago

AshishSardana commented 1 week ago

Cleanlab provides a trustworthiness score that can be used for guardrailing tasks. Read more about how this score is calculated here.

This PR adds support for a third-party API, Cleanlab Studio's API, so that users can apply an output guardrail based on the trustworthiness score.
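To make the idea concrete, here is a minimal sketch of how a threshold-based output check could look. It is not the code from this PR: `check_output_trust`, `RailDecision`, `fake_score`, and the default threshold of 0.6 are hypothetical names and values chosen for illustration, and the scoring call is passed in as a plain function rather than calling Cleanlab Studio's API directly. The sketch also assumes the score lies between 0 and 1.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RailDecision:
    """Result of the (hypothetical) trustworthiness check."""
    allowed: bool
    score: float


def check_output_trust(
    prompt: str,
    response: str,
    score_fn: Callable[[str, str], float],
    threshold: float = 0.6,  # assumed default, not taken from the PR
) -> RailDecision:
    """Allow the bot response only if its score meets the threshold."""
    score = score_fn(prompt, response)
    # Assumption: scores are normalized to the range [0, 1].
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    return RailDecision(allowed=score >= threshold, score=score)


def fake_score(prompt: str, response: str) -> float:
    """Stand-in for the real scoring API, used only for this example."""
    return 0.9 if "Paris" in response else 0.2


if __name__ == "__main__":
    question = "What is the capital of France?"
    print(check_output_trust(question, "Paris", fake_score))
    print(check_output_trust(question, "Lyon", fake_score))
```

Passing the scorer in as a function keeps the threshold logic separate from the network call, so the check can be exercised with a stub when no API key is available.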

AshishSardana commented 9 hours ago

Hi @drazvan, I'd like to add brief documentation about the trustworthiness score and the configurable parameter in the rail (a threshold on least_trustworthiness_score). Would it go here?

Also, how do I add tests? Do you store the keys on the CI server to be able to run the tests?

drazvan commented 3 hours ago

@AshishSardana: Yes, you can add it to the library documentation. We are in the process of refactoring the structure a bit, but we'll change this as well once that's finalized.