microsoft / promptbench

A unified evaluation framework for large language models
http://aka.ms/promptbench
MIT License

bertattack #88

Open zl-comment opened 2 weeks ago

zl-comment commented 2 weeks ago

After bertattack finished on the previous dataset, the bertattack on the next dataset started, but it has been running with no progress and no scores. I suspect there is a problem loading the universal-sentence-encoder model, but no error is reported; it just keeps running without producing results.
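To rule the encoder in or out, a standalone load test might help. Below is a minimal sketch, assuming the attack's semantic-similarity constraint pulls the Universal Sentence Encoder from TF Hub (as TextAttack's BERT-Attack recipe does); the exact URL may differ in your setup.

```python
# Standalone smoke test for the Universal Sentence Encoder load.
# Assumption: the encoder comes from TF Hub; swap in whatever URL
# your cache/logs show actually being downloaded.
import tensorflow_hub as hub

print("loading universal-sentence-encoder ...")
encoder = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")
embeddings = encoder(["a quick smoke test"])
print("loaded OK, embedding shape:", embeddings.shape)
```

If this also stalls (for example on a blocked TF Hub download), the hang is in the encoder load rather than in the attack itself.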

zl-comment commented 2 weeks ago

Attacks on the other datasets run fine; the problem appears only with bertattack on the rte and wnli datasets.

Immortalise commented 2 weeks ago

Hi, could you share the relevant problematic code snippets with me?

zl-comment commented 2 weeks ago

The difficult part is that I don't know in which code fragment it occurs. It happens on the wnli and rte datasets, but the other datasets never show this behavior of running without producing results.
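One way to locate a silent hang is Python's built-in faulthandler, which can periodically dump every thread's stack; this is a generic debugging sketch, not specific to promptbench.

```python
# Dump every thread's stack to stderr every 60 seconds, so the
# repeated tracebacks show exactly where the process is stuck.
import faulthandler

faulthandler.dump_traceback_later(60, repeat=True)

# ... then run the attack as usual and watch stderr.
```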

zl-comment commented 2 weeks ago

```python
# imports (not shown in the original snippet; paths assumed from the promptbench examples)
import logging
import os

from tqdm import tqdm

import promptbench as pb
from promptbench.models import LLMModel
from promptbench.prompt_attack import Attack

import localpathconfig  # local path configuration for this script

try:
    model_t5 = LLMModel(model='google/flan-t5-large', temperature=0.5)
    print("Statement executed successfully; the model loaded.")
except Exception as e:
    print("Statement failed; error message below:")
    print(str(e))

# create dataset
dataset = pb.DatasetLoader.load_dataset("rte", local_path=localpathconfig.GLUE_PATH)  # also running mrpc experiments
logging.info("dataset: rte")

# make sure the dataset contains enough data
if len(dataset) >= 1000:
    # select the first 1000 records
    validation_dataset = dataset[:1000]
else:
    validation_dataset = dataset

prompts = [
    "Are the following two sentences entailment or not_entailment? Answer me with 'entailment' or 'not_entailment', just one word",
    "Analyze the two provided sentences and decide if their relationship is 'entailment' or 'not_entailment'",
    "Indicate if the connection between the following sentences is 'entailment' or 'not_entailment'",
    "As an instrument for entailment evaluation, consider the two sentences and determine if their relationship is 'entailment' or 'not_entailment'",
    "While performing entailment analysis, classify the relationship between the provided sentences as 'entailment' or 'not_entailment':",
    "In the capacity of an entailment assessment system, indicate if the link between the following sentences is 'entailment' or 'not_entailment':",
]

question = "\nQuestion: {content}\nAnswer:"

# define the projection function required by the output process
def proj_func(pred):
    mapping = {
        "entailment": 0,
        "not_entailment": 1,
    }
    pred_lower = pred.lower()  # convert the prediction to lowercase
    if pred_lower in mapping:
        return mapping[pred_lower]
    else:
        logging.info(f"ERROR OUT: {pred}")  # record unmapped outputs in the log file
        return -1

# define the evaluation function required by the attack
def eval_func(prompt, validation_dataset, model):
    logging.info(f"Prompt: {prompt}")  # record the prompt in the log file
    preds = []
    labels = []
    for d in tqdm(validation_dataset, desc="process"):
        input_text = pb.InputProcess.basic_format(prompt + question, d)
        raw_output = model(input_text)  # the model does return answers
        output = pb.OutputProcess.cls(raw_output, proj_func)  # map the output to 1, 0, or -1
        preds.append(output)
        labels.append(d["label"])
    return pb.Eval.compute_cls_accuracy(preds, labels)

unmodifiable_words = ["entailment'", "not_entailment'", "content"]

# print all supported attacks
print(Attack.attack_list())

def writetofile(key):
    # append the final result to the results file
    file_path = localpathconfig.ATTACK_RESULT
    if not os.path.exists(file_path):
        # if the file does not exist, create it and write the data
        with open(file_path, 'w') as file:
            file.write(f"## {key}\n")
            file.write("\n")  # add a blank line
    else:
        # if the file already exists, append the data
        with open(file_path, 'a') as file:
            file.write(f"## {key}\n")
            file.write("\n")

writetofile("bertattack")
logging.info("attack: bertattack")

# attack with each prompt in turn
for prompt in prompts:
    print(f"Using prompt: {prompt}")
    attack = Attack(model_t5, "bertattack", validation_dataset, prompt, eval_func,
                    unmodifiable_words, verbose=True)
    print(attack.attack())
```

Immortalise commented 2 weeks ago

I think you could first create a minimal reproducible example by removing any non-essential code. Once you have the simplest version that still shows the problem, we can work together to solve this issue.
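For reference, a stripped-down version might look like the sketch below. This is only an illustration: the import paths follow the promptbench examples and may need adjusting, and the five-example subset and single prompt are arbitrary choices to make the hang show up quickly.

```python
# Hypothetical minimal repro: one prompt, a handful of rte examples.
# Import paths are assumed from the promptbench examples.
import promptbench as pb
from promptbench.models import LLMModel
from promptbench.prompt_attack import Attack

model = LLMModel(model='google/flan-t5-large', temperature=0.5)
dataset = pb.DatasetLoader.load_dataset("rte")[:5]  # tiny subset

prompt = ("Are the following two sentences entailment or not_entailment? "
          "Answer me with 'entailment' or 'not_entailment', just one word")
question = "\nQuestion: {content}\nAnswer:"

def proj_func(pred):
    # map the model's answer onto class ids; -1 marks an unparseable answer
    return {"entailment": 0, "not_entailment": 1}.get(pred.lower(), -1)

def eval_func(prompt, dataset, model):
    preds, labels = [], []
    for d in dataset:
        raw = model(pb.InputProcess.basic_format(prompt + question, d))
        preds.append(pb.OutputProcess.cls(raw, proj_func))
        labels.append(d["label"])
    return pb.Eval.compute_cls_accuracy(preds, labels)

attack = Attack(model, "bertattack", dataset, prompt, eval_func,
                ["entailment'", "not_entailment'", "content"], verbose=True)
print(attack.attack())
```

If this still hangs on rte, the logging and file-writing code is ruled out and the problem is isolated to the attack itself.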