Open SLZ0106 opened 1 month ago
Never mind, but I encountered a new problem when I run any attack:
Traceback (most recent call last):
File "/local/home/luzsun/JailTrickBench/main.py", line 407, in <module>
Also, may I ask how I can use the latest attack methods, such as MultiJail?
Hi, Thank you for your interest in our work and for raising this issue!
I believe the problems above share the same cause: a mismatched version of the FastChat package. If you installed FastChat via pip, please note that the package has not had a new release in over 8 months, so the released version is missing many updates, which could be causing the issue. I strongly recommend cloning the FastChat repository locally using the following commands to avoid these errors:
git clone git@github.com:lm-sys/FastChat.git
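Before switching installs, it can help to confirm which FastChat build Python actually picks up. The snippet below is a minimal sketch, not part of JailTrickBench: the helper name `installed_version` is made up for illustration, and it assumes the pip package in question is the one published on PyPI as `fschat`.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: FastChat is distributed on PyPI under the name "fschat".
    print("fschat:", installed_version("fschat"))
```

If this prints `None`, no FastChat distribution is visible to the interpreter you are running `main.py` with, which usually points at a different virtual environment.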
Regarding your question about how to use the multijail attack method, it is tested in a zero-shot setting, so you can directly use the corresponding dataset in this path: baseline/MultiJail/multijail_data. We have also integrated vllm to accelerate the testing process, and you can run the following command to execute it:
python -u main.py \
--target_model_path lmsys/vicuna-13b-v1.5 \
--defense_type None_defense \
--attack MultiJail \
--instructions_path ./baseline/MultiJail/multijail_data/1_MultiJail_en.csv \
--save_result_path ./exp_results/main_vicuna/ \
--agent_evaluation \
--resume_exp \
--agent_recheck \
--exp_name main_vicuna_none_defense
I hope this helps! Please feel free to reach out if you have any further questions.
I have also noted the difficulty in configuring the environment, and we are preparing an updated version of the benchmark release. In the new version, we will add more baselines and new tricks, as well as provide a Dockerfile for easier setup and usage.
Feel free to give us a star 🌟 to stay updated, and you'll be notified as soon as the new version is released!
Traceback (most recent call last):
File "/local/home/luzsun/JailTrickBench/main.py", line 407, in <module>
main(args)
File "/local/home/luzsun/JailTrickBench/main.py", line 367, in main
all_output = run(goals, targets, target_model_path, device, args, all_output)
File "/local/home/luzsun/JailTrickBench/main.py", line 312, in run
all_output = test(goals, targets, models, device, args, all_output=all_output)
File "/local/home/luzsun/JailTrickBench/main.py", line 255, in test
curr_output = generate_attack_result(
File "/local/home/luzsun/JailTrickBench/main.py", line 59, in generate_attack_result
adv_prompt, model_output, iteration, is_JB = GCG(
File "/local/home/luzsun/JailTrickBench/baseline/GCG/GCG_single_main.py", line 71, in GCG
input_ids = suffix_manager.get_input_ids(adv_string=adv_suffix)
File "/local/home/luzsun/JailTrickBench/baseline/GCG/minimal_gcg/string_utils.py", line 129, in get_input_ids
prompt = self.get_prompt(adv_string=adv_string)
File "/local/home/luzsun/JailTrickBench/baseline/GCG/minimal_gcg/string_utils.py", line 97, in get_prompt
encoding.char_to_token(len(self.conv_template.system))
AttributeError: 'Conversation' object has no attribute 'system'
Hi, I have encountered this problem.
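For anyone debugging this AttributeError: it is the kind of failure a FastChat version mismatch produces, because FastChat releases differ in how the `Conversation` template stores its system prompt. Some expose a single `system` attribute, while others split it into `system_template` and `system_message`. The sketch below is only an illustration of that difference, not the maintainers' fix. The helper `system_text` is hypothetical, and whether the token offsets computed in `string_utils.py` stay correct with it has not been verified. Aligning the FastChat version as recommended above remains the safer route.

```python
from types import SimpleNamespace


def system_text(conv):
    """Return the system prompt text of a FastChat-style conversation template."""
    # Layout with a single attribute holding the full system prompt.
    if hasattr(conv, "system"):
        return conv.system
    # Layout that splits the prompt into a template and a message.
    if hasattr(conv, "system_message"):
        template = getattr(conv, "system_template", "{system_message}")
        return template.format(system_message=conv.system_message)
    raise AttributeError(
        "conversation template exposes neither `system` nor `system_message`"
    )


if __name__ == "__main__":
    # Stand-in objects (not real FastChat classes) mimicking the two layouts.
    single_attr = SimpleNamespace(system="A chat between a user and an assistant.")
    split_attrs = SimpleNamespace(
        system_template="{system_message}",
        system_message="A chat between a user and an assistant.",
    )
    print(system_text(single_attr) == system_text(split_attrs))
```

Running `hasattr(conv_template, "system")` on the template your install returns is a quick way to tell which layout you have.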