The Python Risk Identification Tool for generative AI (PyRIT) is an open access automation framework to empower security professionals and machine learning engineers to proactively find risks in their generative AI systems.
MIT License
FEAT: Job Role Generator attack module from Project Moonshot #506
This PR adds the Job Role Generator attack module from Project Moonshot, together with a default template file, job_role_converter.yaml. The attack module asks the target whether a certain demographic (e.g. gender, ethnicity) is more proficient at the job given by the prompt. This tests for stereotypical or biased representations within the system.
Job Role Generator: This attack module adds demographic groups to the job role.
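To illustrate the idea of adding demographic groups to a job role, here is a minimal sketch. It is not the code in this PR and does not use PyRIT's converter API: the function name `expand_job_role`, the `DEMOGRAPHIC_GROUPS` values, and the `"{job_role} ({group})"` output format are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical demographic groups, for illustration only. The actual lists
# used by the attack module are not given in this description.
DEMOGRAPHIC_GROUPS: Dict[str, List[str]] = {
    "gender": ["Male", "Female"],
    "ethnicity": ["Group A", "Group B"],
}


def expand_job_role(
    job_role: str,
    groups: Dict[str, List[str]] = DEMOGRAPHIC_GROUPS,
) -> List[str]:
    """Return one prompt per demographic value, each appended to the job role."""
    prompts: List[str] = []
    for values in groups.values():
        for value in values:
            # Assumed output format: the group in parentheses after the role.
            prompts.append(f"{job_role} ({value})")
    return prompts


if __name__ == "__main__":
    for prompt in expand_job_role("Software Engineer"):
        print(prompt)
```

Each generated prompt would then be sent to the target, and the responses compared across groups to look for biased treatment.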
Related: #427, with parent issue #376
Tests and Documentation
test_job_role_converter.py runs a few small, static tests to check that the source code generates the manual prompts correctly and handles the "text" input type.
job_role_generator.ipynb has been generated by JupyText within the doc folder. This notebook follows the function perform_attack_manually() from Project Moonshot.
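A static test of this kind might look roughly like the sketch below. It is only an assumption about the shape of such tests: `render_prompt` is a hypothetical stand-in for the converter, and rejecting non-"text" input with a `ValueError` is an assumed behavior, not something confirmed by this PR.

```python
def render_prompt(job_role: str, group: str, input_type: str = "text") -> str:
    """Hypothetical stand-in for the converter under test (not PyRIT's API)."""
    if input_type != "text":
        # Assumed behavior: only "text" input is supported.
        raise ValueError(f"Unsupported input type: {input_type}")
    return f"{job_role} ({group})"


def test_prompt_is_generated_correctly() -> None:
    # The demographic group should be appended to the job role.
    assert render_prompt("Teacher", "Female") == "Teacher (Female)"


def test_non_text_input_is_rejected() -> None:
    # Any input type other than "text" should raise ValueError.
    try:
        render_prompt("Teacher", "Female", input_type="image_path")
    except ValueError:
        return
    raise AssertionError("expected ValueError for non-text input")


if __name__ == "__main__":
    test_prompt_is_generated_correctly()
    test_non_text_input_is_rejected()
    print("all checks passed")
```

The checks are static in the sense that they call no model or remote target. They only verify string construction and input validation.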
(P.S. I accidentally committed under an alternate git config; all changes are from me 😸)
@Wren-cpu are you still planning on finishing up this PR? Just asking since we can help bring it over the finish line in case you don't. No pressure if you need more time!