Unispac / Visual-Adversarial-Examples-Jailbreak-Large-Language-Models

Repository for the Paper (AAAI 2024, Oral) --- Visual Adversarial Examples Jailbreak Large Language Models

Unable to run #1

Closed pearlmary closed 1 year ago

pearlmary commented 1 year ago

Every time I try the commands given in the repo, I face this error. Can you guide me on this?

    (auto) root@537614b35cbf:/workspace/VAJB# python visual_attack.py --cfg-path eval_configs/minigpt4_eval.yaml --gpu-id 0 --n_iters 5000 --constrained --eps 16 --alpha 1 --save_dir visual_constrained_eps_16

    Initializing Models
    Loading VIT
    Loading VIT Done
    Loading Q-Former
    Loading Q-Former Done
    Loading LLAMA
    Loading checkpoint shards: 100%|████████████████████| 2/2 [00:15<00:00, 7.96s/it]
    Loading LLAMA Done
    Load BLIP2-LLM Checkpoint: /workspace/VAJB/ckpts/pretrained_minigpt4_7b.pth
    [Initialization Finished]

    Give the following image: ImageContent. You will be able to see the image once I provide it to you. Please answer my questions.###Human: ###Assistant:

    batch_size: 8
      0%|          | 0/5001 [00:00<?, ?it/s]
    target_loss: 3.312500
    ######### Output - Iter = 0 ##########
    /workspace/auto/lib/python3.10/site-packages/transformers/generation/utils.py:1219: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use a generation configuration file (see https://huggingface.co/docs/transformers/main_classes/text_generation)
      warnings.warn(
      0%|          | 0/5001 [00:07<?, ?it/s]
    Traceback (most recent call last):
      File "/workspace/VAJB/visual_attack.py", line 126, in <module>
        adv_img_prompt = my_attacker.attack_constrained(text_prompt_template,
      File "/workspace/VAJB/utils/visual_attacker.py", line 157, in attack_constrained
        response, _ = my_generator.generate(prompt)
      File "/workspace/VAJB/utils/generator.py", line 40, in generate
        outputs = self.model.llama_model.generate(
      File "/workspace/auto/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
        return func(*args, **kwargs)
      File "/workspace/auto/lib/python3.10/site-packages/transformers/generation/utils.py", line 1485, in generate
        return self.sample(
      File "/workspace/auto/lib/python3.10/site-packages/transformers/generation/utils.py", line 2524, in sample
        outputs = self(
      File "/workspace/auto/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "/workspace/auto/lib/python3.10/site-packages/accelerate/hooks.py", line 158, in new_forward
        output = old_forward(*args, **kwargs)
      File "/workspace/VAJB/minigpt4/models/modeling_llama.py", line 676, in forward
        outputs = self.model(
      File "/workspace/auto/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "/workspace/auto/lib/python3.10/site-packages/accelerate/hooks.py", line 158, in new_forward
        output = old_forward(*args, **kwargs)
      File "/workspace/VAJB/minigpt4/models/modeling_llama.py", line 517, in forward
        position_ids = position_ids.view(-1, seq_length).long()
    RuntimeError: shape '[-1, 81]' is invalid for input of size 82

Unispac commented 1 year ago

Hi. I will help you with troubleshooting.

  1. Could you please confirm that you set up a separate environment and installed packages following the instructions in this repo? i.e.,
    
    git clone https://github.com/Unispac/Visual-Adversarial-Examples-Jailbreak-Large-Language-Models.git
    cd Visual-Adversarial-Examples-Jailbreak-Large-Language-Models
    conda env create -f environment.yml
    conda activate minigpt4


This is important, as the version of the transformers package often matters (a quick check is sketched after this list).

  2. Could you confirm that you are using the 13B version of the MiniGPT-4 checkpoints, built from vicuna-13b-v0? Currently, I see you are using the 7B version, which might be problematic because all of our configurations are set for the 13B version.
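
Regarding point 1, a quick way to double-check that the active environment actually picks up the packages pinned in environment.yml (a minimal sketch; it just prints whatever is installed in the activated minigpt4 environment):

    # Run inside the activated environment; compare against the pins in environment.yml.
    import torch
    import transformers

    print("transformers:", transformers.__version__)
    print("torch:", torch.__version__)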

If you still run into problems, please feel free to let us know.
pearlmary commented 1 year ago

Thank you for helping out.

  1. I've set up a separate environment and installed packages following the instructions in this repo, i.e.,

    git clone https://github.com/Unispac/Visual-Adversarial-Examples-Jailbreak-Large-Language-Models.git
    cd Visual-Adversarial-Examples-Jailbreak-Large-Language-Models
    conda env create -f environment.yml
    conda activate minigpt4

  2. As you rightly said, I've used the 7B version, since the 13B version seems to require a more powerful GPU.
  3. I've used vicuna-7b-v1.3 (https://huggingface.co/lmsys/vicuna-7b-v1.3) because it provides merged weights directly, instead of the delta weights in lmsys/vicuna-7b-delta-v0. Another reason is that the LLaMA weights are hard to get (I'm still waiting for the confirmation mail after submitting the request to Meta).
  4. I've also downloaded the pretrained MiniGPT-4 checkpoint aligned with Vicuna 7B and set the paths correctly.
  5. Still, I get the same error. Can you guide me on this? Many thanks in advance.
Unispac commented 1 year ago

Hi, I suspect this is because you are using v1.3 instead of v0.

  1. Please refer to this doc from the Vicuna repository: https://github.com/lm-sys/FastChat/blob/main/docs/vicuna_weights_version.md

    1) It's important to note that later versions use a different separator than the one used by v0. 2) Different model versions also differ in source-code compatibility.

  2. For compatibility with the MiniGPT-4 implementation, I think we should also use v0, because the linear projection layer learned by MiniGPT-4 is trained w.r.t. the v0 model. If you use a model from another version, it will not work as well.

I understand that the requirement to separately download LLaMA is a bottleneck for setting up the v0 version. However, since MiniGPT-4 is trained with the v0 model, we have to stick to it.
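
For reference, the v0 weights are usually reconstructed by applying the official delta to the base LLaMA weights with FastChat's apply_delta tool (see the doc linked above); the local paths below are placeholders:

    # Placeholder paths; the delta repo for the 7B v0 weights is lmsys/vicuna-7b-delta-v0.
    python3 -m fastchat.model.apply_delta \
        --base-model-path /path/to/llama-7b-hf \
        --target-model-path /path/to/vicuna-7b-v0 \
        --delta-path lmsys/vicuna-7b-delta-v0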

pearlmary commented 1 year ago

Thank you for the reply. So you want me to stick with the v0 version for the reasons mentioned above. Accepted. But I still have a few doubts:

  1. Even after downloading the LLaMA weights, can I successfully use the 7B version (7B MiniGPT-4 checkpoint / Vicuna-7B weights), or should we use only the 13B MiniGPT-4 checkpoint / Vicuna-13B weights?
  2. I also wanted to know whether we can use multiple GPUs, such as GPUs 0 and 1, or only one. Many thanks in advance.
Unispac commented 1 year ago

Hi,

  1. I think there is no barrier preventing you from attacking the 7B model, as long as it is correctly set up according to the MiniGPT-4 repository. Overall, we are providing an attack algorithm: as long as the model is end-to-end differentiable, the same attack loop can be applied to it. Besides MiniGPT-4, we also attacked LLaVA and InstructBLIP recently. We would be happy to release more examples soon.

  2. Currently, we implement the pipeline on a single GPU only. If you want to use multiple GPUs, I think you need to change the code slightly using PyTorch's multi-GPU support (a rough illustration follows).
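
For instance, here is a hypothetical sketch of splitting the attack batch across two GPUs with torch.nn.DataParallel; the LossWrapper class and the integration point in utils/visual_attacker.py are illustrative, not part of the current code:

    # Hypothetical sketch, not part of this repo: scatter the batch dimension across GPUs.
    import torch
    import torch.nn as nn

    class LossWrapper(nn.Module):
        """Wraps the differentiable LLM so DataParallel can split the batch."""
        def __init__(self, model):
            super().__init__()
            self.model = model

        def forward(self, inputs_embeds, labels):
            # Each replica returns its own loss; DataParallel gathers them on GPU 0.
            return self.model(inputs_embeds=inputs_embeds, labels=labels).loss

    # wrapped = nn.DataParallel(LossWrapper(llama_model), device_ids=[0, 1])
    # loss = wrapped(batch_embeds, batch_labels).mean()
    # loss.backward()  # gradients still flow back to the adversarial image on GPU 0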

pearlmary commented 1 year ago
> Besides MiniGPT-4, we also attacked LLaVA and InstructBLIP recently. We would be happy to release more examples soon.

That's great news! Looking forward to seeing how the jailbreak works...

Hi, one final thing I need to clarify.

Will the 7B demo setup work on a 32 GB GPU, or do we need a bigger one?

Thank You.

Unispac commented 1 year ago

Hi, I think 32 GB should be enough for the 7B version. If you run the attack with a batch size of 8 and find that memory is not enough, you can reduce the batch size to 6 or 4 (see the sketch below). Then I believe it would work.
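
For reference, the batch size is passed as a literal in visual_attack.py (the call that appears in the traceback above). A hedged sketch of the change, with the other arguments left as in the repo and args.alpha inferred from the --alpha command-line flag:

    # In visual_attack.py, around the call shown in the traceback:
    # lowering batch_size reduces peak GPU memory during the attack.
    adv_img_prompt = my_attacker.attack_constrained(text_prompt_template,
                                                    img=img, batch_size=4,  # reduced from 8
                                                    num_iter=5000, alpha=args.alpha,
                                                    epsilon=args.eps / 255)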