yinyueqin / SAPO

Self-Augmented Preference Optimization: Off-Policy Paradigms for Language Model Alignment (SAPO)
Apache License 2.0

Chat template to use in Fastchat when evaluating MT-Bench #1

Closed by Meaquadddd 3 months ago

Meaquadddd commented 3 months ago

Hi, nice work and a well-written paper!

Since you used the ChatML template when training Llama 3 with ORPO, which chat template should I use when evaluating the model on MT-Bench with the FastChat library?

I have tried the following, but it does not seem to work for me:

register_conv_template(
    Conversation(
        name="llava-chatml",
        system_template="<|im_start|>system\n{system_message}",
        system_message="Answer the questions.",
        roles=("<|im_start|>user", "<|im_start|>assistant"),
        sep_style=SeparatorStyle.CHATML,
        sep="<|im_end|>",
        stop_str="<|im_end|>",
    )
)

Many thanks for the assistance.

yinyueqin commented 3 months ago

Hi, thanks for your interest in our work!

You can try this:


class Llama3SAPOAdapter(BaseModelAdapter):
    use_fast_tokenizer = False

    def match(self, model_path: str):
        return "llama3-8b" in model_path.lower()

    def load_model(self, model_path: str, from_pretrained_kwargs: dict):
        revision = from_pretrained_kwargs.get("revision", "main")
        tokenizer = AutoTokenizer.from_pretrained(
            model_path, revision=revision
        )
        model = AutoModelForCausalLM.from_pretrained(
            model_path,
            low_cpu_mem_usage=True,
            **from_pretrained_kwargs,
        ).eval()
        return model, tokenizer

    def get_default_conv_template(self, model_path: str) -> Conversation:
        return get_conv_template("llama3-8b-sapo")


# llama-3-8b-chatml conversation template
register_conv_template(
    Conversation(
        name="llama3-8b-sapo",
        roles=("<|im_start|>user", "<|im_start|>assistant"),
        sep_style=SeparatorStyle.CHATML,
        sep="<|im_end|>",
        stop_token_ids=[128256,128257],
        stop_str="<|im_end|>",
    )
)
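
To check what prompt a template like the one above produces, a small stand-alone sketch can help. It does not import FastChat. `render_chatml` is a hypothetical helper written for illustration, and it assumes that the CHATML separator style builds each turn as role, newline, message, separator, newline, and leaves the final assistant turn open.

```python
# Sketch of how a ChatML-style template turns a conversation into a prompt.
# render_chatml is a hypothetical helper, not part of FastChat.

def render_chatml(messages,
                  sep="<|im_end|>",
                  system_prompt=""):
    """Build a ChatML prompt from (role, message) pairs.

    A message of None marks the turn the model is expected to complete.
    """
    # Optional system block, terminated like any other turn.
    prompt = "" if system_prompt == "" else system_prompt + sep + "\n"
    for role, message in messages:
        if message:
            prompt += role + "\n" + message + sep + "\n"
        else:
            # Open turn: only the role header, left for the model to fill in.
            prompt += role + "\n"
    return prompt


# Role strings copied from the "llama3-8b-sapo" template above.
user, assistant = "<|im_start|>user", "<|im_start|>assistant"

prompt = render_chatml([(user, "Hello!"), (assistant, None)])
print(prompt)
```

Under that assumption, the printed prompt should match the ChatML layout the model saw during training. Note also that in FastChat a new adapter class normally has to be registered with `register_model_adapter(Llama3SAPOAdapter)`, and its `match` only fires when the checkpoint path contains `llama3-8b`.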

Meaquadddd commented 3 months ago

Thank you!!