Facico / Chinese-Vicuna

Chinese-Vicuna: A Chinese Instruction-following LLaMA-based Model (a low-resource Chinese llama+lora recipe, with a structure based on alpaca)
https://github.com/Facico/Chinese-Vicuna
Apache License 2.0

Possibly a parameter typo? #175

Closed · apachemycat closed this issue 1 year ago

apachemycat commented 1 year ago

class prompt:
    def __init__(self, tokenizer, max_len, add_eos=True):
        self.tokenizer = tokenizer
        self.max_len = max_len
        self.add_eos = add_eos

class instruct_prompt(prompt):
    prompt = (
        "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Response:"
    )
    prompt_input = (
        "Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n"
        "### Instruction:{instruction}\n\n### Input:{input}\n\n### Response:"
    )
    prompt_history = "User:{input}\n\nAssistant:{output}\n\n"
    prompt_post = "User:{input}\n\nAssistant:"
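For context, these templates are filled in with `str.format_map`; a quick illustration with hypothetical values (not from the repo):

data_point = {"instruction": "Translate to French.", "input": "Good morning"}
print(instruct_prompt.prompt_input.format_map(data_point))
# Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
#
# ### Instruction:Translate to French.
#
# ### Input:Good morning
#
# ### Response: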

def preprocess_gen(self, data_point):
    if 'history' not in data_point:
        # single-turn format: {'instruction': ..., 'input': ...}
        if 'input' in data_point:
            user_prompt = self.prompt_input.format_map(data_point)
        else:
            user_prompt = self.prompt.format_map(data_point)
    else:
        # multi-turn format: {'history': [...], 'input': ...}
        user_prompt = "\n".join(["User:" + i['input'] + "\n" + "Assistant:" + i['output'] for i in data_point['history']]) + "\nUser:" + data_point['input'] + "\nAssistant:"
        user_prompt = user_prompt[-maxlen:]  # `maxlen` is undefined here: the suspected typo
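Assuming `preprocess_gen` is a method of `instruct_prompt` as in the repo, a minimal reproduction sketch (hypothetical values; the tokenizer is unused on this code path). If no module-level `maxlen` happens to exist, the multi-turn branch fails immediately:

p = instruct_prompt(tokenizer=None, max_len=2048)
data_point = {
    "history": [{"input": "Hi", "output": "Hello!"}],
    "input": "How are you?",
}
p.preprocess_gen(data_point)  # NameError: name 'maxlen' is not defined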

Is `maxlen` perhaps a misspelling? Should it be `self.max_len`?
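If so, a minimal sketch of the fix, assuming the intent is to keep only the most recent `self.max_len` characters of the accumulated dialogue (note this truncates by characters, not tokens):

# Truncate from the left so the newest turns are kept.
user_prompt = user_prompt[-self.max_len:]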

Facico commented 1 year ago

Thanks for the correction!