GeneZC / MiniMA

Code for the paper "Towards the Law of Capacity Gap in Distilling Language Models"
Apache License 2.0

Inconsistent response from interactive MiniChat-3B #5

Closed: rsong0606 closed this issue 7 months ago

rsong0606 commented 10 months ago

Hi, happy new year!!

First of all, good work!! I am trying to use MiniChat-3B as an interactive chatbot in my application. However, the model's response is usually one of the following:

  1. "?" or nothing at all
  2. "I am a language model, I do not have feelings", and so on
  3. Irrelevant content (sometimes the first response is perfect, but after that it gets out of control)

It might be that I am misusing the model, but here is my use case in brief: I define a custom prompt upfront, treated as a scenario, which is used as the starting prompt and combined with MiniChat's own prompt for the model. The user can then interact with the bot and receive responses.

I have a few questions:

  1. Is there a mechanism to store the message history (for both user and assistant; currently only the last message is saved), so that the model outputs consistent responses?
  2. How do I send the message history as input to the model so that it returns consistent responses?
  3. Is it feasible to use MiniChat as a chatbot?
  4. How can we control the response flow to make sure the responses stay consistent?
  5. How can we reduce responses like "I am a bot", etc.?
  6. What are the best practices for prompt engineering with MiniChat?

Here's my sample usage, based on your sample code, which returns inconsistent responses:

```python
def generate_response(self, user_input, main_topic, subtopic):
    # Retrieve and print the system prompt
    system_prompt = self.get_prompt(main_topic, subtopic)
    print("System Prompt:", system_prompt)
    if system_prompt is None:
        return "Prompt not found for the given topic and subtopic.", None

    # Append user input to the conversation history and print it
    self.conv.append_message(self.conv.roles[0], user_input)
    print("Appended User Input:", user_input)

    # Generate and print the conversation history prompt
    conversation_prompt = self.conv.get_prompt()
    print("Conversation History Prompt:", conversation_prompt)

    # Combine the system prompt with the conversation history and print the combined prompt
    combined_prompt = system_prompt + "\n" + conversation_prompt
    print("Combined Prompt for Model:", combined_prompt)

    # Tokenize the combined prompt into model input IDs
    input_ids = self.tokenizer([combined_prompt]).input_ids

    # Generate output from the model
    output_ids = self.model.generate(
        torch.as_tensor(input_ids).cuda(),
        do_sample=True,
        temperature=0.7,
        max_new_tokens=50,
    )

    # Strip the prompt tokens, then decode and print the chatbot's response
    output_ids = output_ids[0][len(input_ids[0]):]
    response = self.tokenizer.decode(output_ids, skip_special_tokens=True).strip()
    print("Chatbot Response:", response)

    # Append the chatbot's response to the conversation history
    self.conv.append_message(self.conv.roles[1], response)

    return response
```

Please take a look when you have time.

Thanks a lot!

GeneZC commented 10 months ago

Thanks for your interest.

In your use case, the `conversation_prompt` already includes a system prompt defined by us.

Please consider appending your own system prompt after ours, or directly putting it into the user input.
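
For example, here is a minimal sketch of the first option. It assumes a FastChat-style conversation template that exposes its built-in system prompt as a mutable `system` string; the attribute name may differ, so adapt it to the actual template:

```python
# Minimal sketch (assumption: the conversation template exposes a mutable
# `system` string, as in FastChat-style templates; adapt to the actual API).
custom_scenario = "You are a tutor helping a student practice algebra."  # hypothetical scenario prompt

# Append the custom scenario after the built-in system prompt, instead of
# prepending a second, conflicting system prompt in front of it.
self.conv.system = self.conv.system + "\n" + custom_scenario

# get_prompt() now yields one prompt that already contains both system
# messages plus the accumulated conversation history.
combined_prompt = self.conv.get_prompt()
```

This way the model does not see two competing system prompts, which can be one source of the drifting replies you describe.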

Below are answers to your questions:

  1. Yes, please refer to the logic of `conv.get_prompt()`; see the sketch after this list.
  2. Same as 1.
  3. Yes. Reach out to me later if you have further detailed questions.
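
For reference on 1 and 2, here is an illustrative sketch of how a conversation template of this kind typically stores the full history and serializes it in `get_prompt()`. All names below are hypothetical, not this repo's exact API:

```python
# Illustrative sketch only: a FastChat-style conversation object that keeps
# the full message history rather than just the last turn. Names are
# hypothetical; see the conversation utilities in the repo for the real logic.
class Conversation:
    def __init__(self, system, roles=("USER", "ASSISTANT"), sep="</s>"):
        self.system = system      # built-in system prompt
        self.roles = roles        # (user role, assistant role)
        self.sep = sep            # separator between turns
        self.messages = []        # full history of (role, text) pairs

    def append_message(self, role, text):
        # Every turn is appended, so nothing is overwritten.
        self.messages.append((role, text))

    def get_prompt(self):
        # Serialize the system prompt plus every stored turn, so the model
        # receives the whole history on each call and can stay consistent.
        parts = [self.system]
        for role, text in self.messages:
            parts.append(f"{role}: {text}" if text is not None else f"{role}:")
        return self.sep.join(parts) + self.sep
```

Appending both the user message and the model's reply via `append_message` after each turn, then rebuilding the input with `get_prompt()`, is what keeps the whole history in the prompt.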
GeneZC commented 7 months ago

I am closing this issue due to inactivity. Feel free to reopen it.