microsoft / semantic-kernel

Integrate cutting-edge LLM technology quickly and easily into your apps
https://aka.ms/semantic-kernel
MIT License

.Net: FunctionCallingStepwisePlanner failing with Error 400 #4556

Closed arajawat closed 8 months ago

arajawat commented 8 months ago

Describe the bug
Calling ExecuteAsync on a FunctionCallingStepwisePlanner throws an error. The error response looks like this: { "message": "Invalid value for 'content': expected a string, got null.", "type": "invalid_request_error", "param": "messages.[18].content", "code": null }

The ExecuteAsync call was working normally at least until 8 Jan 2024, but stopped working on the 9th. I'm still able to create and invoke a plan with HandlebarsPlanner correctly, and kernel.InvokeAsync works as well.

To Reproduce
Steps to reproduce the behavior:

  1. Use the following configuration: NuGet package Microsoft.SemanticKernel.Planners.OpenAI, version 1.0.1-preview; model name: gpt-4; model version: 1106-Preview
  2. Create the planner: var planner = new FunctionCallingStepwisePlanner();
  3. Execute the plan: var result = await planner.ExecuteAsync(kernel, planPrompt);
  4. The following error response is now returned:
      Status: 400 (model_error)

      Content:
      {
        "error": {
          "message": "Invalid value for 'content': expected a string, got null.",
          "type": "invalid_request_error",
          "param": "messages.[18].content",
          "code": null
        }
      }

      Headers:
      Access-Control-Allow-Origin: REDACTED
      X-Content-Type-Options: REDACTED
      x-ratelimit-remaining-requests: REDACTED
      apim-request-id: REDACTED
      x-ratelimit-remaining-tokens: REDACTED
      X-Request-ID: REDACTED
      ms-azureml-model-error-reason: REDACTED
      ms-azureml-model-error-statuscode: REDACTED
      x-ms-client-request-id: 0dbca26b-ed41-4e83-a4ef-05b86c642978
      x-ms-region: REDACTED
      azureml-model-session: REDACTED
      Strict-Transport-Security: REDACTED
      Date: Wed, 10 Jan 2024 15:16:23 GMT
      Content-Length: 189
      Content-Type: application/json

         at Azure.Core.HttpPipelineExtensions.ProcessMessageAsync(HttpPipeline pipeline, HttpMessage message, RequestContext requestContext, CancellationToken cancellationToken)
         at Azure.AI.OpenAI.OpenAIClient.GetChatCompletionsAsync(ChatCompletionsOptions chatCompletionsOptions, CancellationToken cancellationToken)
         at Microsoft.SemanticKernel.Connectors.OpenAI.ClientCore.RunRequestAsync[T](Func`1 request)
         --- End of inner exception stack trace ---
         at Microsoft.SemanticKernel.Connectors.OpenAI.ClientCore.RunRequestAsync[T](Func`1 request)
         at Microsoft.SemanticKernel.Connectors.OpenAI.ClientCore.GetChatMessageContentsAsync(ChatHistory chat, PromptExecutionSettings executionSettings, Kernel kernel, CancellationToken cancellationToken)
         at Microsoft.SemanticKernel.ChatCompletion.ChatCompletionServiceExtensions.GetChatMessageContentAsync(IChatCompletionService chatCompletionService, ChatHistory chatHistory, PromptExecutionSettings executionSettings, Kernel kernel, CancellationToken cancellationToken)
         at Microsoft.SemanticKernel.Planning.FunctionCallingStepwisePlanner.GetCompletionWithFunctionsAsync(ChatHistory chatHistory, Kernel kernel, IChatCompletionService chatCompletion, OpenAIPromptExecutionSettings openAIExecutionSettings, ILogger logger, CancellationToken cancellationToken)
         at Microsoft.SemanticKernel.Planning.FunctionCallingStepwisePlanner.ExecuteCoreAsync(Kernel kernel, String question, CancellationToken cancellationToken)
         at Microsoft.SemanticKernel.Planning.PlannerInstrumentation.InvokePlanAsync[TPlan,TPlanInput,TPlanResult](Func`5 InvokePlanAsync, TPlan plan, Kernel kernel, TPlanInput input, ILogger logger, CancellationToken cancellationToken)
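
The repro steps above boil down to a few lines. A minimal sketch, assuming a Kernel already configured with the Azure OpenAI gpt-4 (1106-Preview) chat deployment and whatever plugins the plan needs; planPrompt stands in for the actual goal string:

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Planning;

// kernel: a Kernel built with an Azure OpenAI chat completion service and
// the plugins the plan should call; planPrompt: the natural-language goal.
var planner = new FunctionCallingStepwisePlanner();

// The planner drives a chat loop against the chat completions endpoint,
// passing the kernel's functions as tools; the 400 surfaces from inside
// this call once the malformed message has been added to the history.
FunctionCallingStepwisePlannerResult result =
    await planner.ExecuteAsync(kernel, planPrompt);

Console.WriteLine(result.FinalAnswer);
```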

Expected behavior
A properly formed response JSON.


Additional context
The issue started suddenly; the planner had been working fine for the last few weeks.

matthewbolanos commented 8 months ago

@arajawat, is the model being served by OpenAI or Azure OpenAI? Also, are you suddenly getting 400 errors for other types of prompts that you use with your model, or is it specific to the planner?

@markwallace-microsoft or @alliscode, are either of y'all aware of a change in the endpoints of OpenAI or Azure OpenAI that would cause this to suddenly break?

arajawat commented 8 months ago

@matthewbolanos We're using Azure OpenAI. It is specific to FunctionCallingStepwisePlanner only. I am able to call HandlebarsPlanner and get the desired result, and I am also able to call the kernel directly, pass the prompt, and get the result.

arajawat commented 8 months ago

@matthewbolanos The issue seems to be with the Azure OpenAI deployments of the "gpt-4" and "gpt-4-32k" models. We are not getting any error with "gpt-35-turbo-16k", and everything works fine with that model.

The above-mentioned error is coming from model version 1106-Preview.

For the other model version of gpt-4 and gpt-4-32k, "0613", we get a different error, stated below: { "error": { "message": "Unrecognized request arguments supplied: tool_choice, tools", "type": "invalid_request_error", "param": null, "code": null } }

KSemenenko commented 8 months ago

I can confirm there is an issue in version 1.0.1, but I just downloaded the source code, and gpt-4-turbo now seems to work fine.

gitri-ms commented 8 months ago

There have been some changes to ClientCore.cs in the past few days, so I'm wondering if there's a connection there. I have not been able to reproduce this issue myself on the latest code.

@arajawat @KSemenenko I would love to get a repro of this and attach a breakpoint to see what the chat history looks like at the time/just before the exception is thrown. If either of you have a set of reliable repro steps that you can share, that would be really helpful!

Edit: I was using gpt-35-turbo-16k, which doesn't repro the issue. I'll give it a try with one of the other models.

gitri-ms commented 8 months ago

I am now able to reproduce the issue. Investigating....

gitri-ms commented 8 months ago

This appears to be an issue with the model on Azure OpenAI service, not in our code.

I can see that the model is returning a response with finish_reason equal to tool_calls, but the list of tool calls is empty and the content string is null (as it should be for a tool call). When we add this message back into the chat history and request another completion, the model identifies this entry in the chat history as invalid, since neither the tool calls nor the content is populated.
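
Concretely, the assistant message that ends up in the history would look roughly like this on the wire (an illustrative sketch based on the description above, not a captured payload):

```
{
  "role": "assistant",
  "content": null,
  "tool_calls": []
}
```

A subsequent request that echoes this entry back is rejected, since an assistant message needs either a string content or a non-empty tool_calls array.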

At this time, the only workaround I can suggest is to use a different model version -- this seems to be an issue with the 1106 models.

Here's a related thread on the OpenAI community forum: https://community.openai.com/t/function-call-response-is-empty-despite-completion-tokens-being-used/580888
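
Until there is a model-side fix, one conceivable stopgap is to prune such empty assistant messages from the chat history before the next completion request. A hypothetical sketch against SK's ChatHistory type — RemoveEmptyAssistantMessages is my own name, not part of the SK or planner API, and since the planner manages its history internally this would mean patching the planner source:

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

static void RemoveEmptyAssistantMessages(ChatHistory history)
{
    // Walk backwards so RemoveAt doesn't shift unvisited indices.
    for (int i = history.Count - 1; i >= 0; i--)
    {
        ChatMessageContent message = history[i];
        bool emptyContent = string.IsNullOrEmpty(message.Content);
        bool emptyItems = message.Items is null || message.Items.Count == 0;

        // Drop assistant turns that carry neither text nor tool-call items;
        // these are the entries the service rejects on the next request.
        if (message.Role == AuthorRole.Assistant && emptyContent && emptyItems)
        {
            history.RemoveAt(i);
        }
    }
}
```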

KSemenenko commented 8 months ago

Sorry, forgot to mention: for me it was Azure OpenAI.

arajawat commented 8 months ago

@gitri-ms Thanks for the analysis and for sharing the associated thread. Apparently there are two different issues, one per available model version:

  1. 1106-Preview (for which you provided your research insights above): Invalid value for 'content': expected a string, got null.
  2. 0613: Unrecognized request arguments supplied: tool_choice, tools

So even the other model version, "0613", is throwing an error. For now, the only workaround is stepping back to gpt-35-turbo-16k.

takeo-iw commented 8 months ago

Does this error depend on the Azure region? The error occurs in the East US 2 region, but it works in the Sweden Central region with 0613.

gitri-ms commented 8 months ago

Dupe of #4674, which has been resolved.