Open matengxiaotiancai opened 2 months ago
Hello @matengxiaotiancai,
We mainly used the PixArt XL 2 and LLaVA-1.5 models for these two steps. You can refer to their respective repositories for instructions on setting up their environments.
Thank you for your reply! I have another question. What does 'flagged' mean when it is True or False in the file 'generate_init_image.py'? I didn't understand the following code:
processed_data = {}
for entry in raw_data:
    entry_id = entry['id']
    print(f"Processing entry {entry_id}")
    if entry_id in processed_data and entry['flagged']:
        if int(entry['step']) < int(processed_data[entry_id]['step']):
            processed_data[entry_id] = entry
    elif entry_id not in processed_data:
        processed_data[entry_id] = entry
    elif not entry['flagged'] and processed_data[entry_id]['flagged']:
        continue
    elif entry['step'] == "3":
        processed_data[entry_id] = entry
Hi @matengxiaotiancai,
This part of the code filters for the samples that successfully jailbreak the target MLLM during the black-box optimization process. In our experiments, we run five optimization steps against the target MLLM. If a successful attack sample appears at any of these five steps, we save it and stop the optimization early; i.e., if the jailbreak succeeds at the second step, we retain the results from the second step. If none of the five steps succeeds, we retain the results from the fifth step.
processed_data = {}
for entry in raw_data:
    entry_id = entry['id']
    # If the id is in processed_data and the current entry is flagged, check the step
    if entry_id in processed_data and entry['flagged']:
        # If the current entry's step is smaller, replace it
        if int(entry['step']) < int(processed_data[entry_id]['step']):
            processed_data[entry_id] = entry
    elif entry_id not in processed_data:
        # If the id is not in processed_data, then add it
        processed_data[entry_id] = entry
    # If the current entry is not flagged, but the same id in processed_data is flagged, skip it
    elif not entry['flagged'] and processed_data[entry_id]['flagged']:
        continue
    # If the current entry's step is 5, replace it (only executes if there are no flagged entries)
    elif entry['step'] == "5":
        processed_data[entry_id] = entry
# Ensure the retained samples are only those flagged as true, and have a step of 5 (if no flagged samples exist)
output_list = list(processed_data.values())
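For readers who want to check the selection rule on a small example, here is a minimal sketch of the behaviour described in this reply: keep the earliest flagged step per id, otherwise fall back to the last step. It is not the code from generate_init_image.py. The names `select_final_samples` and `_rank` are hypothetical, and it assumes that each entry has 'id', 'step' (a numeric string), and 'flagged' (True when the attack succeeded at that step). Unlike the snippet above, it keeps the latest unflagged step seen rather than checking for step "5" specifically, and it does not depend on the order of entries in raw_data.

```python
def _rank(entry):
    """Sort key (hypothetical helper): lower is better.

    Flagged entries come first, earliest step first.
    Unflagged entries come after, latest step first.
    """
    step = int(entry["step"])
    return (0, step) if entry["flagged"] else (1, -step)


def select_final_samples(raw_data):
    """Keep one entry per id according to _rank (assumed selection rule)."""
    best = {}
    for entry in raw_data:
        entry_id = entry["id"]
        current = best.get(entry_id)
        # Replace the stored entry only if the new one ranks strictly better
        if current is None or _rank(entry) < _rank(current):
            best[entry_id] = entry
    return list(best.values())


# Toy data: id "a" succeeds at step 2 (early stop), id "b" never succeeds
raw_data = [
    {"id": "a", "step": "1", "flagged": False},
    {"id": "a", "step": "2", "flagged": True},
    {"id": "b", "step": "1", "flagged": False},
    {"id": "b", "step": "3", "flagged": False},
    {"id": "b", "step": "5", "flagged": False},
]

for sample in select_final_samples(raw_data):
    print(sample["id"], sample["step"], sample["flagged"])
```

With this toy input, the entry kept for "a" is its flagged step 2 and the entry kept for "b" is its unflagged step 5, which matches the rule stated above.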
Hello, I am very interested in your outstanding work and would like to reproduce your project. Therefore, I would like to inquire about the specific environment for executing the steps of STEP 2 Amplifying Image Harmfulness with LLMs, and STEP 3 Amplifying Image Harmfulness with Grade Update, in order to ensure that the reproduced results are consistent with yours.