Chongyu-hub opened 5 months ago
```python
if model == "gpt" and "turbo" in self.args.llm_type:
    if "turbo" in self.args.llm_type:
        data_with_prompt = test_src_sample
```
Why does the `create_dataset` function in evaloator.py directly set `data_with_prompt = test_src_sample` when the turbo model is used, without any further processing?
Because turbo is accessed differently from davinci. When interacting with turbo, the required instruction can be placed in the initial message list, e.g. `messages_list = [{"role": "system", "content": "Classify the text."}]`. You can specify it here: https://github.com/beeevita/EvoPrompt/blob/c1b50865ee7247be9f209decc483e0beff139a25/llm_client.py#L37
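As a minimal sketch of the distinction described above (these helper names are illustrative, not part of EvoPrompt): chat-style turbo models take the instruction as a separate system message, while completion-style davinci models need the instruction prepended to the prompt string itself.

```python
# Illustrative sketch only; these helpers are not from the repository.

def build_turbo_messages(instruction: str, text: str) -> list:
    """Chat (turbo) models: the instruction goes into a system message."""
    return [
        {"role": "system", "content": instruction},
        {"role": "user", "content": text},
    ]


def build_davinci_prompt(instruction: str, text: str) -> str:
    """Completion (davinci) models: the instruction is prepended to the prompt."""
    return instruction + "\n\n" + text


messages = build_turbo_messages("Classify the text.", "A great movie!")
prompt = build_davinci_prompt("Classify the text.", "A great movie!")
```

This is why, for turbo, the data samples themselves can be passed through unchanged in principle, as long as the instruction is supplied somewhere in the message list.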
But in the `get_generations` function in evaloator.py, the `prompt_pre` argument is never passed when `llm_query` is called:

```python
else:  # turbo
    for data in tqdm(dataset):
        pred = llm_query(
            data,
            client=self.client,
            type=args.llm_type,
            task=True,
            **self.llm_config,
        )
        hypos.append(pred)
```
This is the input:

```
[{'role': 'user', 'content': 'Note that two storms on record, Hurricane Alice from the 1954 season and Tropical Storm Zeta from the 2005 season have formed during December and lasted into January.'}]
```

And this is the output:

> Hurricane Alice, which formed in December 1954, is the only known hurricane on record to have crossed over from one calendar year to the next. It originated in the eastern Atlantic Ocean on December 30, 1954, and intensified into a hurricane on January 2, 1955. It continued to exist until January 6, 1955, before dissipating in the central Atlantic.
>
> Tropical Storm Zeta, which formed in December 2005, is the most recent known storm to have crossed over from December into January. It developed in the eastern Atlantic Ocean on December 30, 2005, and became a tropical storm on January 2, 2006. It remained as a tropical storm until January 6, 2006, when it dissipated in the eastern Atlantic.

It seems `prompt_pre` was never passed to the model.
I couldn't find where `prompt_pre` is passed to turbo 😢
These are the arguments I passed in:

```
--seed 5
--dataset asset
--task sim
--batch-size 20
--prompt-num 0
--sample_num 100
--language_model gpt
--budget 10
--popsize 10
--position pre
--evo_mode de
--llm_type turbo
--initial all
--initial_mode para_topk
--template v1
--cache_path data/sim/asset/seed5/prompts_gpt.json
--output outputs/sim/asset/gpt/all/de/bd10_top10_topk_para_init/v1/davinci/seed5
```
Hello, Chongyu! I just went through the code and hope this can solve your issue.
In the evaloator.py file, `prompt_pre` has been passed to the `dataset` object.
And you can see from the figure above that `data` is iteratively sampled from `dataset`, so each data item may already include the `prompt_pre` information. I have not checked carefully, but it should work.
I debugged the code: when the data is prepared for the turbo model, the `prompt_pre` that is passed in has no effect. I changed the code in the `create_dataset` function to this:
```python
if "turbo" in self.args.llm_type:
    # data_with_prompt = test_src_sample
    for test_src_line in test_src_sample:
        prompts = []
        example = format_template(
            src=test_src_line,
            src_name=src_name,
            tgt_name=tgt_name,
            template=self.template,
        )
        instruction_part = self.instruction_placeholder.replace(
            "<prompt>", prompt_pre
        )
        if position in ["pre", "demon"]:  # demon includes instruction + demon
            if "alpaca" in self.args.language_model:
                prompts.append(instruction_part + "\n\n" + example)
            else:
                prompts.append(
                    instruction_part + "\n" + demonstrations + example
                )
        elif position == "icl":  # no instruction
            example = instruction_part + "\n" + demonstrations + example
            prompts.append(example)
        data_with_prompt.append("\n\n".join(prompts))
```
This gave the correct results.
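The core of this fix can be sketched as a standalone function. This is a simplification covering only the non-alpaca `pre` branch, and the placeholder/prompt strings in the usage example are hypothetical values for illustration, not the repository's actual templates:

```python
def assemble_prompt(instruction_placeholder: str, prompt_pre: str,
                    demonstrations: str, example: str) -> str:
    """Substitute the evolved prompt into the instruction placeholder,
    then join it with the demonstrations and the formatted example."""
    instruction_part = instruction_placeholder.replace("<prompt>", prompt_pre)
    return instruction_part + "\n" + demonstrations + example


# Hypothetical values, for illustration only:
data_with_prompt = assemble_prompt(
    "<prompt>",
    "Simplify the following sentence.",
    "",
    "Input: The feline reclined.\nOutput:",
)
```

In other words, the instruction must be spliced into the prompt text itself before it is handed to turbo, since `llm_query` is not given `prompt_pre` separately.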