Azure / PyRIT

The Python Risk Identification Tool for generative AI (PyRIT) is an open access automation framework to empower security professionals and machine learning engineers to proactively find risks in their generative AI systems.
MIT License
1.72k stars 315 forks source link

FEAT Local Hugging Face model support #347

Open EricXQiu opened 3 weeks ago

EricXQiu commented 3 weeks ago

Is your feature request related to a problem? Please describe.

We have a few downloaded Hugging Face models and we would like to use PyRIT for AI red teaming. I didn;t find any Prompt Target fits this case. Any support for local HF model would be great.

Describe the solution you'd like

We would like to have a prompt target and Score engine for local Hugging Face model for a chatbot setup. The model is already downloaded and stored, it can be loaded by Hugging Face transformers and the dataset can also be loaded from Hugging Face.

An example model can be https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3

Describe alternatives you've considered, if relevant

Additional context

romanlutz commented 2 weeks ago

@EricXQiu that's a great idea, thanks for the suggestion! We have lots of examples of how to write targets so anyone should be able to pick this up. If you want you could try it yourself (please comment here if you intend to do so, otherwise there may be duplication). We'll obviously provide feedback and guidance along the way.

FWIW this is something we hear regularly and want to have support for it at some point although I don't have a specific timeline at this point.

KutalVolkan commented 2 weeks ago

Hello @romanlutz and @EricXQiu,

If nobody else wants to take it, I’d be happy to handle the issue. I’ll wait until the end of the weekend, and then I’ll start on it.

romanlutz commented 2 weeks ago

No need to wait. It's yours!

romanlutz commented 2 weeks ago

Update: We've had this before and removed it, but want it back with modifications. There is some context that I should share and I will, but probably only tomorrow. We will need to hash out what this should look like... obviously happy to receive a proposal from you @KutalVolkan if it's of interest to you.

KutalVolkan commented 2 weeks ago

Update: We've had this before and removed it, but want it back with modifications. There is some context that I should share and I will, but probably only tomorrow. We will need to hash out what this should look like... obviously happy to receive a proposal from you @KutalVolkan if it's of interest to you.

Hello Roman,

Once I have a better understanding of the requirements, I'll be happy to propose a detailed plan and start working on it.

Looking forward to your input!

romanlutz commented 2 weeks ago

This is the PR that removed the old target: https://github.com/Azure/PyRIT/pull/120

We removed it because we couldn't run the notebook which used the HuggingFaceChat target because the model took a long time to import.

IMO This probably boils down to two cases:

The former seems reasonable to support. The old target code may be a good starting point, although it's worth noting that targets have evolved a little since then (async, retries, to name a few) and adjustments will be required.

The latter is already supported via our Azure ML Target (where one can deploy HF models fairly seamlessly) BUT one could certainly add a HF target for talking to the endpoint in their cloud.

Work Required:

Adding @rdheekonda to add to this since he's looked into it before, similarly @rlundeen2.

@KutalVolkan feel free to play around with this and let us know what you think. We're pretty open to suggestions here.