uiuc-conversational-ai-lab / multiwoz-helper

A code repository for creating and evaluating MultiWOZ data, with support for multiple versions.
MIT License

how to generate input_file.json #1

Open emrecanacikgoz opened 5 days ago

emrecanacikgoz commented 5 days ago

Hey @SuvodipDey,

How should I generate the input_file.json to evaluate my model? I want to use a direct Hugging Face path like meta-llama/Llama-3.1-8B.

Maybe @yxc-cyber has the scripts for generative LLMs?

yxc-cyber commented 4 days ago

Hi @emrecanacikgoz, I have now updated my code, and it should support local models. You can take offline_evaluate.ipynb as a reference. The only change you need is that in the client_config, you don't provide client or model; instead, you provide local_model. This local model must have a chat_completion(messages) method, where the input looks like this:

messages = [
    {"role": "system", "content": "Some prompts."},
    {"role": "user", "content": "A user utterance."},
    {"role": "assistant", "content": "A response."},
    {"role": "user", "content": "A user utterance."},
]

The output is of type str: the response to the last user utterance. The overall idea, which you can take as a reference, is as follows. First, subclass an LM class to implement the chat_completion(messages) method:

from transformers import LlamaForCausalLM

class customLM(LlamaForCausalLM):
    def chat_completion(self, messages):
        # TODO: turn `messages` into model input, generate,
        # and return the assistant response as a str.
        pass

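If it helps, here is a minimal sketch of one way to fill in chat_completion, assuming the checkpoint's tokenizer ships a chat template. The lazily cached _chat_tokenizer attribute and the max_new_tokens value are illustrative choices, not part of this repo's API:

from transformers import AutoTokenizer, LlamaForCausalLM
import torch

class customLM(LlamaForCausalLM):
    def chat_completion(self, messages):
        # Lazily load the tokenizer from the same checkpoint as the model.
        if not hasattr(self, "_chat_tokenizer"):
            self._chat_tokenizer = AutoTokenizer.from_pretrained(self.name_or_path)
        # Render the message list with the model's chat template and
        # append the assistant generation prompt.
        input_ids = self._chat_tokenizer.apply_chat_template(
            messages, add_generation_prompt=True, return_tensors="pt"
        ).to(self.device)
        with torch.no_grad():
            output_ids = self.generate(
                input_ids,
                max_new_tokens=256,
                pad_token_id=self._chat_tokenizer.eos_token_id,
            )
        # Decode only the newly generated tokens, i.e. the response
        # to the last user utterance, and return it as a str.
        return self._chat_tokenizer.decode(
            output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True
        )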
Then, you can load your checkpoint from somewhere:

pretrained_model_name_or_path = "some model you want to use"
mycustomLM = customLM.from_pretrained(pretrained_model_name_or_path)
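
For example, with the Hugging Face path from your question (a gated repo, so you need to be authenticated with the Hub):

# Note: meta-llama/Llama-3.1-8B is a base model without a chat template;
# for chat-style generation, the -Instruct variant is usually the better fit.
mycustomLM = customLM.from_pretrained("meta-llama/Llama-3.1-8B")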

After that, you can change the config in offline_evaluate.ipynb and run your generation:

client_config = {
    "local_model": mycustomLM,
}
offline_evaluator = evaluator(
    data,
    old_data,
    total_samples,
    prompt=eval_prompt,
    client_config=client_config,
    online=False
)

After getting the generated results, you can use eval_postprocess.py to transform them into the format that Suvodip's eval code accepts. However, eval_postprocess.py currently has some issues with delexicalization, so don't use it directly for now.