Open zl-comment opened 2 weeks ago
Attacks on other datasets work without problems, but bertattack fails on the rte and wnli datasets.
Hi, could you share the relevant problematic code snippets with me?
The difficulty is that I don't know which code fragment causes it. It happens on the wnli and rte datasets: the attack runs but never produces results. This does not happen on the other datasets.
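# Imports assumed by this snippet; localpathconfig is the poster's own module (not shown).
import logging
import os
import promptbench as pb
from promptbench.models import LLMModel
from promptbench.prompt_attack import Attack
from tqdm import tqdm

try:
    model_t5 = LLMModel(model='google/flan-t5-large', temperature=0.5)
    print("Statement executed successfully; the model has been loaded.")
except Exception as e:
    print("Statement failed; the error message follows:")
    print(str(e))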
dataset = pb.DatasetLoader.load_dataset("rte", local_path=localpathconfig.GLUE_PATH)  # there is also an mrpc experiment
logging.info("Dataset: rte")
if len(dataset) >= 1000:
    validation_dataset = dataset[:1000]
else:
    validation_dataset = dataset
prompts = [
    "Are the following two sentences entailment or not_entailment? Answer me with 'entailment' or 'not_entailment', just one word",
    "Analyze the two provided sentences and decide if their relationship is 'entailment' or 'not_entailment'",
    "Indicate if the connection between the following sentences is 'entailment' or 'not_entailment'",
    "As an instrument for entailment evaluation, consider the two sentences and determine if their relationship is 'entailment' or 'not_entailment'",
    "While performing entailment analysis, classify the relationship between the provided sentences as 'entailment' or 'not_entailment':",
    "In the capacity of an entailment assessment system, indicate if the link between the following sentences is 'entailment' or 'not_entailment':",
]
question = "\nQuestion: {content}\nAnswer:"
try:
    model_t5 = LLMModel(model='google/flan-t5-large', temperature=0.5)
    print("Statement executed successfully; the model has been loaded.")
except Exception as e:
    print("Statement failed; the error message follows:")
    print(str(e))
def proj_func(pred):
    mapping = {
        "entailment": 0,
        "not_entailment": 1
    }
    pred_lower = pred.lower()  # convert the input to lowercase
    if pred_lower in mapping:
        return mapping[pred_lower]
    else:
        logging.info(f"ERROR OUT: {pred}")  # record in the log file
        return -1
def eval_func(prompt, validation_dataset, model):
    logging.info(f"Prompt: {prompt}")  # record in the log file
    preds = []
    labels = []
    for d in tqdm(validation_dataset, desc="process"):
        input_text = pb.InputProcess.basic_format(prompt + question, d)
        raw_output = model(input_text)  # the model does return an answer here
        output = pb.OutputProcess.cls(raw_output, proj_func)  # map the output to 1, 0, or -1
        preds.append(output)
        labels.append(d["label"])
    return pb.Eval.compute_cls_accuracy(preds, labels)
unmodifiable_words = [ "entailment\'", "not_entailment\'","content"]
print(Attack.attack_list())
def writetofile(key):
    file_path = localpathconfig.ATTACK_RESULT
    if not os.path.exists(file_path):
        # If the file does not exist, create it and write the data
        with open(file_path, 'w') as file:
            file.write(f"## {key}\n")
            file.write("\n")  # add a blank line
    else:
        # If the file already exists, append the data
        with open(file_path, 'a') as file:
            file.write(f"## {key}\n")
            file.write("\n")
writetofile("bertattack")
logging.info(f"attack: bertattack")
for prompt in prompts:
    print(f"Using prompt: {prompt}")
    attack = Attack(model_t5, "bertattack", validation_dataset, prompt, eval_func, unmodifiable_words, verbose=True)
    print(attack.attack())
I think you could first create a minimal reproducible example by removing any non-essential code. Once you have the simplest version that still shows the problem, we can work together to solve this issue.
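As a possible first step toward such a minimal example, the evaluation function could be timed on a handful of examples before any attack is started. The sketch below is only an illustration: `smoke_test` is a hypothetical helper, not part of promptbench, and it assumes the dataset supports slicing (as `dataset[:1000]` in the snippet above suggests) and that `eval_func` has the `(prompt, validation_dataset, model)` signature shown above.

```python
import time


def smoke_test(eval_func, prompt, dataset, model, n=5):
    """Run eval_func on the first n examples and report score and elapsed time.

    Hypothetical helper for narrowing down a hang; not a promptbench API.
    """
    subset = dataset[:n]  # assumes the dataset supports slicing
    start = time.perf_counter()
    score = eval_func(prompt, subset, model)
    elapsed = time.perf_counter() - start
    print(f"{len(subset)} examples, score={score}, {elapsed:.2f}s")
    return score, elapsed
```

If `smoke_test(eval_func, prompts[0], dataset, model_t5)` returns quickly on rte, the evaluation path is probably fine and the stall is more likely inside the attack itself.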
After I ran bertattack on the previous dataset, the bertattack on the next dataset started. It keeps running, but no progress bar or scores ever appear. I suspect the cause is the loading of a universal-sentence-encoder model, but no error is reported there; it just keeps running without results.
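Since the run stalls without raising an error, one way to see where it is stuck is Python's standard `faulthandler` module, which can dump the stack of every thread after a timeout. The sketch below is a minimal illustration: `hang_watchdog` is a hypothetical helper name, and the timeout value is an arbitrary choice, not something taken from this thread.

```python
import faulthandler
import sys
from contextlib import contextmanager


@contextmanager
def hang_watchdog(seconds, file=None):
    """Dump all thread stacks to `file` every `seconds` while the block runs.

    Hypothetical helper; `file` must be a real file with a file descriptor
    and defaults to sys.stderr.
    """
    target = file if file is not None else sys.stderr
    faulthandler.dump_traceback_later(seconds, repeat=True, file=target)
    try:
        yield
    finally:
        # Stop the watchdog once the block finishes (or raises).
        faulthandler.cancel_dump_traceback_later()
```

Wrapping the attack call, for example `with hang_watchdog(300): print(attack.attack())`, would print a traceback every five minutes while it is still running, which should show whether the process is waiting inside the sentence-encoder loading code or somewhere else.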