microsoft / semantic-kernel

Integrate cutting-edge LLM technology quickly and easily into your apps
https://aka.ms/semantic-kernel

Python: Chat service will always put the first message as system message, how to handle it with text completion? #5547

Open HuskyDanny opened 3 months ago

HuskyDanny commented 3 months ago

Describe the bug
GPT-4 has no text completion capability, and in a previous version I used the chat service for single-shot text completion, which worked fine. In the new beta version, chat_history always assigns the system role to the first message instead of the user role.

To Reproduce

from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from semantic_kernel.connectors.ai.open_ai import (
    AzureChatCompletion,
    AzureChatPromptExecutionSettings,
)

# Register an Azure OpenAI chat service for a GPT-4 deployment.
kernel.add_service(
    AzureChatCompletion(
        service_id="azure_openai_chat4_service",
        deployment_name=chat4_api_config.deployment_model_id,
        ad_token_provider=get_bearer_token_provider(
            DefaultAzureCredential(),
            "https://cognitiveservices.azure.com/.default",
        ),
        endpoint=chat4_api_config.endpoint,
    )
)

execution_settings = AzureChatPromptExecutionSettings(
    service_id="azure_openai_chat4_service",
)

# Invoke a prompt function as a single-shot "text completion".
query_expansion = self.get_semantic_function(kernel, "skills", "QueryExpansion")
result = await query_expansion.invoke(
    kernel=kernel, input=condensed_question, settings=execution_settings
)

Expected behavior
The first message should have the user role. Alternatively, we should be able to configure the system message in config.json or skprompt.txt.


Platform
- OS: Windows
- IDE: VS Code
- Language: Python
- Source: semantic-kernel==0.9.2b1


HuskyDanny commented 3 months ago

For now, the workaround I have found for single-shot prompts is to add a dummy user message and place {{$chat_history}} between the system message and the rest of the prompt. Is this the only way to do it? It adds extra redundancy.

moonbox3 commented 3 months ago

Hi @HuskyDanny, I am working on responding to your other post in the Q&A section, but wanted to respond to this here. Can you help me understand something important: the title of this issue says you want to handle a system message for text completion, yet in your code snippet you are creating an AI service with AzureChatCompletion, not AzureTextCompletion. Is that intentional? Text completion does not deal with chat history; it takes a prompt only.
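For contrast, registering a text completion service looks roughly like this (a minimal sketch; the service_id, deployment, endpoint, and key values are placeholders, and the deployment must be a completions-capable model, which gpt-4 and gpt-3.5-turbo are not):

from semantic_kernel.connectors.ai.open_ai import AzureTextCompletion

# Text completion takes a raw prompt only: no roles, no chat history.
kernel.add_service(
    AzureTextCompletion(
        service_id="azure_openai_text_service",  # placeholder
        deployment_name="my-completions-deployment",  # placeholder
        endpoint="https://my-resource.openai.azure.com/",  # placeholder
        api_key="...",  # placeholder; AD token auth also works, as above
    )
)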

HuskyDanny commented 3 months ago

Hi @moonbox3, thanks for replying. I am using gpt-4 and gpt-3.5-turbo, which only support chat completion. But I have some functionality, such as summarization and query rewriting, that uses these models as if they were text completion. If I go through the text completion class, the API returns a not-supported error, so I can only use chat completion for these two models.

And since the first message is automatically given the system role, I intentionally add chat_history with one dummy user message, so that chat_history divides the prompt into a system message and a user message and the roles are assigned correctly.

To illustrate:

Instruction: You are a summarizer to summarize the below input   ---> system message
{{$chat_history}}                                                 ---> dummy user message inserted here
Input: {{$input}}                                                 ---> user message
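In code, the workaround might look like this (a minimal sketch; the dummy message text is my own placeholder):

from semantic_kernel.contents import ChatHistory

# The prompt template from above: everything before the rendered chat
# history becomes the system message, everything after it the user message.
prompt = (
    "Instruction: You are a summarizer to summarize the below input\n"
    "{{$chat_history}}\n"
    "Input: {{$input}}"
)

# One throwaway user turn, added only to force a role boundary.
chat_history = ChatHistory()
chat_history.add_user_message("(see input below)")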

moonbox3 commented 3 months ago

Thanks for your reply, @HuskyDanny. Can you please try 0.9.4b1? We made an update so that a simple prompt (without a chat history) gets the user role instead of the system role.
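If the fix works as described, a plain prompt function like the sketch below should now be sent as a single user-role message (function and plugin names are illustrative):

# Under 0.9.4b1, a prompt with no chat history renders as a user message.
summarize = kernel.create_function_from_prompt(
    prompt="You are a summarizer. Summarize: {{$input}}",
    function_name="summarize",
    plugin_name="skills",
)
result = await kernel.invoke(summarize, input="some long text")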

HuskyDanny commented 3 months ago

Hey @moonbox3, it is awesome that you reacted this quickly to fix it. I am trying the new version now. There are some breaking changes; previously I got through them using the blog post that summarized these breaking changes with examples. I am wondering if there is such a post for this change, and what the general recommendation is for dealing with breaking changes (a dedicated blog section for breaking changes, etc.)?

HuskyDanny commented 3 months ago

With the new version, I cannot intentionally add a system message. Both the system and user messages get combined into a single user message.

The root cause seems to be chat_history's from_rendered_prompt, which fails to parse the XML prompt. Should there be stripping of invalid XML or special tokens, so that retrieved content cannot interfere with the flow?


moonbox3 commented 3 months ago

> Hey @moonbox3, it is awesome that you reacted this quickly to fix it. I am trying the new version now. There are some breaking changes; previously I got through them using the blog post that summarized these breaking changes with examples. I am wondering if there is such a post for this change, and what the general recommendation is for dealing with breaking changes (a dedicated blog section for breaking changes, etc.)?

Since we're still in beta, there may be further breaking changes until we get to v1.0 in May. We're working fast to try and get the breaking changes out of the way ASAP; however, things are still in flux before we get to v1, when we don't want any breaking changes. We're putting more focus on code velocity right now, and we're doing our best to document larger changes as we go. We are working on updating the docs, and when we land at v1, the docs will reflect a healthy state.

moonbox3 commented 3 months ago

> With the new version, I cannot intentionally add a system message. Both the system and user messages get combined into a single user message.
>
> The root cause seems to be chat_history's from_rendered_prompt, which fails to parse the XML prompt. Should there be stripping of invalid XML or special tokens, so that retrieved content cannot interfere with the flow?

You can use XML, as you have seen, to generate a prompt with a system tag. You can also use a chat history whose system message is set explicitly, like chat_history = ChatHistory(system_message="<some message>") or chat_history.add_system_message("...").
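Both forms spelled out (a small self-contained sketch; the message text is illustrative):

from semantic_kernel.contents import ChatHistory

# Option 1: set the system message at construction time.
chat_history = ChatHistory(system_message="You are a summarizer.")

# Option 2: add it to an existing history.
chat_history_alt = ChatHistory()
chat_history_alt.add_system_message("You are a summarizer.")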

Could you help me understand why you couldn't structure your prompt in the following way (as an example)?

# Prompt template: the rendered chat history (which carries the system
# message) followed by the user's input.
chat_function = kernel.create_function_from_prompt(
    prompt="""{{$chat_history}}{{$user_input}}""",
    function_name="chat",
    plugin_name="chat",
    prompt_execution_settings=req_settings,
)

# Put the instruction into the chat history as an explicit system message.
history = ChatHistory()
history.add_system_message("You are a summarizer to summarize the below input.")

answer = await kernel.invoke(
    chat_function,
    user_input=user_input,
    chat_history=history,
)

This gets parsed into two distinct messages that will be sent to the model -- one system and one user.
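For reference, the rendered prompt at the intermediate step looks roughly like the string below before ChatHistory.from_rendered_prompt splits it back into messages (a sketch of the message XML, not captured output):

# Approximate shape of the rendered prompt: one <message> element per role.
rendered = (
    '<message role="system">You are a summarizer to summarize the below input.</message>'
    '<message role="user">Please summarize this paragraph ...</message>'
)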

HuskyDanny commented 3 months ago

@moonbox3 Yes, I did structure it that way. In my case there is one more piece, the contexts, which pulls data from the vector DB, so the prompt is like """{{$chat_history}}{{$contexts}}{{$user_input}}""". And I have confirmed that special characters in the contexts can break the XML parsing, for example an extra "<>" in the content.

So relying on an XML parsing library introduces a point of failure, especially for RAG systems, where the retrieved content may contain characters that are special in XML. Will Semantic Kernel consider handling this?
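One application-side mitigation (not an SK feature, just a sketch building on the snippet above; safe_contexts and retrieved_contexts are hypothetical names) is to escape XML-special characters in the retrieved content before it is rendered into the prompt:

from xml.sax.saxutils import escape

# Escape &, <, and > in the retrieved content so stray markup cannot
# break ChatHistory.from_rendered_prompt's XML parsing.
safe_contexts = escape(retrieved_contexts)

answer = await kernel.invoke(
    chat_function,
    user_input=user_input,
    chat_history=history,
    contexts=safe_contexts,
)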

moonbox3 commented 3 months ago

Thanks for your reply, @HuskyDanny. Tagging @eavanvalkenburg so we can chat about this.

moonbox3 commented 3 weeks ago

@HuskyDanny are you still having issues with this?