microsoft / semantic-kernel

Integrate cutting-edge LLM technology quickly and easily into your apps
https://aka.ms/semantic-kernel

.Net: Implementing a Resilient JSON Parser in the FunctionResult Class #4444

Open mehrandvd opened 6 months ago

mehrandvd commented 6 months ago

Problem Statement

Developers who work with OpenAI prompts often need to return a JSON object as the result, for example:

PROMPT:

Get the intent of the input and the food type.
Return the result as a valid JSON like:
{
    "intent": "BuyFood",
    "food": "Pizza"
}

RESULT:

However, sometimes OpenAI does not return valid JSON, but adds extra markdown elements such as code fences or syntax highlighting, for example:

{ "intent": "BuyFood", "reason": "Pizza" }

or

```json
{
    "intent": "BuyFood",
    "reason": "Pizza"
}
```

This can cause problems when parsing the JSON result using standard methods. Therefore, we need a more robust and flexible way to parse JSON strings that can handle these variations.
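
For illustration (this example is not from the original report), a minimal sketch of how a fenced reply trips up standard parsing with System.Text.Json:

```csharp
// Hypothetical fenced reply; JsonDocument.Parse rejects the backtick characters
// of the markdown code fence, so "standard" parsing fails outright.
using System;
using System.Text.Json;

var reply = "```json\n{ \"intent\": \"BuyFood\", \"food\": \"Pizza\" }\n```";

try
{
    using var doc = JsonDocument.Parse(reply);
}
catch (JsonException ex)
{
    Console.WriteLine($"Standard parsing failed: {ex.Message}");
}
```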

Proposed Solution

I have already developed a simple parser that is more tolerant of these types of results. It is called `PowerParseJson()` and it is part of the [SemanticValidation](https://github.com/mehrandvd/SemanticValidation/blob/main/src/SemanticValidation/Utils/SemanticUtils.cs) library.

However, I think the best place for this parser is in the `FunctionResult` class, so that it can be easily used by any developer who needs to parse JSON results from OpenAI prompts.

I propose the introduction of a new method, `FunctionResult.ParseJson<T>`, designed to parse JSON strings into objects of type `T` with enhanced tolerance for variations in input.

I am willing to implement this feature myself and add unit tests for it.
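
To make the idea concrete, here is a rough sketch of that kind of tolerant parsing. It is not the actual `PowerParseJson()` implementation from SemanticValidation, just an illustration that strips an optional markdown code fence before deserializing:

```csharp
// Hypothetical sketch of a tolerant JSON parse (not the SemanticValidation code):
// strip an optional ```json ... ``` wrapper, then hand the rest to System.Text.Json.
using System;
using System.Text.Json;

public static class TolerantJson
{
    public static T? Parse<T>(string text)
    {
        text = text.Trim();

        if (text.StartsWith("```", StringComparison.Ordinal))
        {
            // Drop the opening fence line (e.g. "```json").
            var firstNewline = text.IndexOf('\n');
            if (firstNewline >= 0)
            {
                text = text[(firstNewline + 1)..];
            }

            // Drop the closing fence, if present.
            var closingFence = text.LastIndexOf("```", StringComparison.Ordinal);
            if (closingFence >= 0)
            {
                text = text[..closingFence];
            }
        }

        return JsonSerializer.Deserialize<T>(text);
    }
}
```

A `FunctionResult.ParseJson<T>` method would presumably just run the result's string value (`GetValue<string>()`) through logic like this.
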
dmytrostruk commented 6 months ago

@mehrandvd Thanks for your proposal. Personally, I'm not sure this method should be part of the `FunctionResult` class; it sounds more like an extension to it. You can add this extension on your side and then use `FunctionResult.ParseJson<T>` without any problems.

I'm not sure it should be added to the SDK out of the box, since this problem can occur not only for the OpenAI connector but for other connectors as well, and other connectors' JSON responses may contain different extra symbols, so it could be hard to cover all cases. Also, in the OpenAI case, resiliency can be addressed in other ways, for example by setting the temperature to 0 or explicitly instructing the model in the prompt to avoid markdown or any symbols other than JSON.
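
For reference, a minimal sketch of that mitigation, assuming the OpenAI connector's `OpenAIPromptExecutionSettings` and an illustrative model id:

```csharp
// Sketch of the prompt-side mitigation: temperature 0 plus an explicit instruction
// to return raw JSON only. Model id and environment variable name are illustrative.
using System;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

var kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion("gpt-3.5-turbo", Environment.GetEnvironmentVariable("OPENAI_API_KEY")!)
    .Build();

var settings = new OpenAIPromptExecutionSettings { Temperature = 0 };

var prompt = """
    Get the intent of the input and the food type.
    Return ONLY a raw JSON object; do not wrap it in markdown code fences.
    Input: {{$input}}
    """;

var result = await kernel.InvokePromptAsync(prompt, new KernelArguments(settings)
{
    ["input"] = "I'd like a pizza."
});

Console.WriteLine(result.GetValue<string>());
```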

But we can keep this issue open for now to see if other developers have the same problem, and if it turns out to be common, we may add this method in the future. Thank you!

mehrandvd commented 6 months ago

I agree that FunctionResult may not be the best fit for this situation, given its potential for broader application. This functionality could be more appropriately introduced via extensions.

Regarding the proposed solution:

> In the context of OpenAI, resilience could be addressed differently, such as by setting the temperature to 0 or explicitly specifying in the prompt to exclude markdown or any symbols other than JSON.

Despite my best efforts to craft a prompt that consistently returns JSON, and with the temperature definitively set to 0, there are instances in high-load runs where it may still return an invalid response. Therefore, to ensure the resilience of the product, I had to manage this outside of the prompt.

dmytrostruk commented 6 months ago

> Despite my best efforts to craft a prompt that consistently returns JSON, and with the temperature definitively set to 0, there are instances in high-load runs where it may still return an invalid response. Therefore, to ensure the resilience of the product, I had to manage this outside of the prompt.

That makes sense, thank you. While you can add JSON parsing on your side as an extension, let's keep this issue open for now and see whether this is a common problem that should be resolved at the Semantic Kernel level. Thanks again!

markwallace-microsoft commented 6 months ago

@mehrandvd OpenAI supports JSON mode, see: https://platform.openai.com/docs/guides/text-generation/json-mode.

We have a PR open to add support for this to SK, see: https://github.com/microsoft/semantic-kernel/pull/4391

This should be included in our next release of the Semantic Kernel (within the next week).
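
For anyone following along, requesting JSON mode will presumably look something like the sketch below; the `ResponseFormat` property and the `"json_object"` value mirror the OpenAI API and are an assumption about what the released SK API exposes:

```csharp
// Hedged sketch: asking the OpenAI connector for JSON mode. The ResponseFormat
// property and its value are assumptions based on the OpenAI API, not confirmed here.
using Microsoft.SemanticKernel.Connectors.OpenAI;

var settings = new OpenAIPromptExecutionSettings
{
    ResponseFormat = "json_object", // request a syntactically valid JSON object
    Temperature = 0,
};
```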

matthewbolanos commented 6 months ago

It looks like #4391 helps guarantee that JSON is returned, but it might not guarantee that the schema is accurate. We'll keep this on our backlog until we can get something similar to TypeChat baked into Semantic Kernel.

mehrandvd commented 6 months ago

> @mehrandvd OpenAI supports JSON mode, see: https://platform.openai.com/docs/guides/text-generation/json-mode.
>
> We have a PR open to add support for this to SK, see: #4391
>
> This should be included in our next release of the Semantic Kernel (within the next week).

Hmmm... This is exactly what I needed, a way to make sure the result is valid and parsable JSON. Will wait for the next release to check it out :)

mehrandvd commented 6 months ago

Given that we now have a designated way to inform SK that the prompt will return JSON, it would be beneficial to have a corresponding method to retrieve it. This could be achieved by extending `GetValue<T>` to accommodate `JsonObject` or POCOs such as `Order`. This would allow usage in the following manner:

```csharp
var order = result.GetValue<Order>();
var json = result.GetValue<JsonObject>();
```

Alternatively, if `GetValue<T>()` is solely responsible for type casting and does not perform any parsing, introducing a separate method named `Deserialize<T>` could be a viable option.
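
Until something like that ships, a hedged sketch of a `Deserialize<T>` extension one could write today (the name mirrors the proposal; it simply reads the raw string result and runs it through `System.Text.Json`):

```csharp
// Hypothetical Deserialize<T> extension mirroring the proposal above; it reads the raw
// string from the FunctionResult and deserializes it with System.Text.Json.
using System;
using System.Text.Json;
using Microsoft.SemanticKernel;

public static class FunctionResultExtensions
{
    public static T? Deserialize<T>(this FunctionResult result, JsonSerializerOptions? options = null)
    {
        var json = result.GetValue<string>()
            ?? throw new InvalidOperationException("The function result does not contain a string value.");

        return JsonSerializer.Deserialize<T>(json, options);
    }
}

// Usage: var order = result.Deserialize<Order>();
```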

jessejiang0214 commented 3 months ago

Hi Team,

Any update on this? I tried to use `var order = result.GetValue<Order>();` but it throws an exception.

Thanks Jesse