GodXuxilie / PromptAttack

An LLM can Fool Itself: A Prompt-Based Adversarial Attack (ICLR 2024)

Query regarding the output files #2

Open SachinVashisth opened 1 month ago

SachinVashisth commented 1 month ago

Hi

After running the script, I get two database files, attack.db and check.db. I converted both to CSV and got attack.csv and check.csv.

The CSV files contain columns Q and V. In attack.csv, column Q holds the prompt and column V holds the generated sentence. However, I can't tell which of the sentences in column V are successful attacks. Could you please help me with this?

luxinyayaya commented 1 month ago

Hi,

Unfortunately, these two files do not store information about successful attacks. The purpose of storing Q and V in our code is to reduce API usage and costs by avoiding repeated queries for the same prompt.
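To illustrate the caching idea, here is a minimal sketch of a prompt-to-response cache. It is not the repository's actual code: the `PromptCache` class, the `cache` table name, and the `fake_llm` stand-in are all hypothetical, and only the column names Q and V come from the files discussed above. Note that nothing in such a cache records whether an attack succeeded.

```python
import sqlite3


class PromptCache:
    """Hypothetical cache: Q is the prompt sent, V is the response received."""

    def __init__(self, path=":memory:"):
        # Assumed schema; the real attack.db / check.db layout may differ.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS cache (Q TEXT PRIMARY KEY, V TEXT)"
        )

    def get_or_query(self, prompt, query_fn):
        # Return the stored response if this exact prompt was seen before.
        row = self.conn.execute(
            "SELECT V FROM cache WHERE Q = ?", (prompt,)
        ).fetchone()
        if row is not None:
            return row[0]
        # Otherwise make the (costly) call once and store the result.
        value = query_fn(prompt)
        self.conn.execute(
            "INSERT INTO cache (Q, V) VALUES (?, ?)", (prompt, value)
        )
        self.conn.commit()
        return value


# Stand-in for a paid API call; records how often it is actually invoked.
calls = []


def fake_llm(prompt):
    calls.append(prompt)
    return prompt.upper()


cache = PromptCache()
first = cache.get_or_query("hello", fake_llm)
second = cache.get_or_query("hello", fake_llm)  # served from the cache
```

In this sketch the second lookup for the same prompt never reaches `fake_llm`, which is how a cache of this kind saves API calls.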

Let me know if you have further questions.

Best regards, Keyi