Azure / PyRIT

The Python Risk Identification Tool for generative AI (PyRIT) is an open access automation framework to empower security professionals and machine learning engineers to proactively find risks in their generative AI systems.
MIT License

FEAT add DecodingTrust dataset #291

**Open** · romanlutz opened 2 months ago

romanlutz commented 2 months ago

Is your feature request related to a problem? Please describe.

DecodingTrust data (https://github.com/AI-secure/DecodingTrust) should be available via PyRIT.

Describe the solution you'd like

There should be a fetch function similar to #267 in pyrit.datasets.
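A hypothetical sketch of what such a fetch helper might look like. The function name, the file path, and the record schema (JSON lines with a `prompt` key) are all assumptions for illustration; the real PyRIT fetch API and the DecodingTrust file layout may differ:

```python
import json
import urllib.request
from typing import List

# Base URL for raw files in the DecodingTrust repo's data/ folder.
DECODING_TRUST_BASE = "https://raw.githubusercontent.com/AI-secure/DecodingTrust/main/data"


def parse_prompts(jsonl_text: str) -> List[str]:
    """Extract prompt strings from JSON-lines records (assumed 'prompt' key)."""
    prompts = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if line:  # skip blank lines between records
            prompts.append(json.loads(line)["prompt"])
    return prompts


def fetch_decoding_trust_stereotypes(
    url: str = f"{DECODING_TRUST_BASE}/stereotype/dataset.jsonl",  # illustrative path, not verified
) -> List[str]:
    """Download one DecodingTrust data file and return its prompts."""
    with urllib.request.urlopen(url) as resp:
        return parse_prompts(resp.read().decode("utf-8"))
```

Splitting parsing from downloading keeps the parser unit-testable without network access, which matters for a dataset helper that will live in `pyrit.datasets`.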

Describe alternatives you've considered, if relevant

Additional context

### Tasks
- [ ] https://github.com/Azure/PyRIT/pull/385
- [ ] Toxicity
- [ ] Adversarial Robustness
- [ ] OOD Robustness
- [ ] Robustness on Adversarial Demonstrations
- [ ] Privacy
- [ ] Machine Ethics
- [ ] Fairness
jsong468 commented 2 weeks ago

Will try to take on this one, thanks!!

jsong468 commented 1 week ago

Added the ability to fetch the 'Stereotypes' prompts data, but there are still 7 other trustworthiness perspectives with data, some of which may be useful in PyRIT! Take a look here: https://github.com/AI-secure/DecodingTrust/tree/main/data
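For the remaining perspectives, one option is a single helper parameterized by perspective name. A minimal sketch, where the perspective-to-subdirectory map is an illustrative guess at the layout of the repo's `data/` folder, not a set of verified paths:

```python
# Base URL for raw files in the DecodingTrust repo's data/ folder.
DECODING_TRUST_BASE = "https://raw.githubusercontent.com/AI-secure/DecodingTrust/main/data"

# Perspective -> subdirectory map; names are assumptions for illustration.
PERSPECTIVE_SUBDIRS = {
    "stereotype": "stereotype",
    "toxicity": "toxicity",
    "privacy": "privacy",
    "machine_ethics": "machine_ethics",
    "fairness": "fairness",
}


def dataset_url(perspective: str, filename: str) -> str:
    """Build the raw-content URL for one data file of a given perspective."""
    try:
        subdir = PERSPECTIVE_SUBDIRS[perspective]
    except KeyError:
        raise ValueError(f"unknown perspective: {perspective!r}") from None
    return f"{DECODING_TRUST_BASE}/{subdir}/{filename}"
```

Failing fast on an unknown perspective name gives a clearer error than a 404 from the raw-content server would.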

nina-msft commented 1 week ago

Before picking up one of these datasets, keep in mind that their contents include profanity and topics you may not want to deal with.