microsoft / semantic-kernel

Integrate cutting-edge LLM technology quickly and easily into your apps
https://aka.ms/semantic-kernel
MIT License

.Net: Bug: [dotnet] Handlebars plan invocation will html-encode any "unsafe" string without the possibility to control this behavior #9782

Open kandraos opened 3 days ago

kandraos commented 3 days ago

Hello SK community!

I’m encountering an issue where JSON strings returned by trusted kernel functions are HTML-encoded while a Handlebars plan executes, even though the encoding is unnecessary.

Please find below a description of the issue I’m facing. This example has been simplified for illustration purposes but represents the real issue well.

I have a kernel that contains the following 3 Kernel Functions:

I also set up a function filter to examine the results of each step of the workflow. The results are shown below:

[Image: function filter output showing the result of each workflow step]

Looking at the results, we notice the following:
• The result of step 1 contains the JSON string as we would expect it.
• The result of step 2 contains the JSON string with the unsafe HTML characters escaped ( " -> &quot; ).
• The result of the workflow contains the JSON string with the unsafe HTML characters escaped a second time ( &quot; -> &amp;quot; ).
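To make the compounding effect concrete, here is a minimal, language-neutral sketch in Python. It does not use Semantic Kernel or Handlebars at all: `encode_step` is a hypothetical stand-in that applies the standard library's `html.escape`, assuming the encoding between plan steps behaves like ordinary HTML escaping.

```python
import html
import json

def encode_step(value: str) -> str:
    """Hypothetical stand-in for the HTML-encoding applied between plan steps."""
    return html.escape(value, quote=True)

# Step 1: a trusted function returns a plain JSON string.
original = json.dumps({"name": "value"})

# Step 2: the string is encoded once when it is passed along.
after_step_2 = encode_step(original)

# Final result: the already-encoded string is encoded again.
final_result = encode_step(after_step_2)

print(original)      # quotes intact
print(after_step_2)  # each quote has become &quot;
print(final_result)  # each &quot; has become &amp;quot;

# A consumer has to decode once per encoding pass to get valid JSON back.
restored = html.unescape(html.unescape(final_result))
assert json.loads(restored) == {"name": "value"}
```

The point of the sketch is that each pass escapes the `&` introduced by the previous one, so a consumer would need to know how many steps a value went through in order to decode it correctly.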

After digging a bit deeper, it seems that whenever a function’s output is added to the KernelArguments object associated with the workflow function invocation (which grows as the Handlebars plan executes), the output is automatically encoded.

I’ve come across the following blog post: https://devblogs.microsoft.com/semantic-kernel/protecting-against-prompt-injection-attacks-in-chat-prompts/. It describes a way to “Allow dangerously set content” for chat prompts. This post is the only documentation about unsafe characters that I could find online, and it does not address the specific issue I’m observing here.

While we understand the need to encode unsafe strings, we would like to “allow” certain content to contain unsafe characters. In this example, the JSON string is produced by a “trusted” function, and we do not want it to be encoded. Are there any parameters we can set to prevent this behavior?
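As an illustration of the kind of opt-out being asked for, here is a rough Python sketch. Everything in it is hypothetical: `PlanArguments`, `trusted_functions`, and the function names `GetJson` and `EchoUserInput` are invented for this example and are not part of Semantic Kernel. It only shows the desired behavior, assuming results are encoded at the moment they are stored.

```python
import html
from typing import Dict, Set

class PlanArguments:
    """Hypothetical stand-in for the argument store that grows as a plan runs."""

    def __init__(self, trusted_functions: Set[str]) -> None:
        self._trusted = set(trusted_functions)
        self._values: Dict[str, str] = {}

    def add_result(self, function_name: str, key: str, value: str) -> None:
        # Results from trusted functions are stored verbatim;
        # everything else is HTML-encoded before being stored.
        if function_name in self._trusted:
            self._values[key] = value
        else:
            self._values[key] = html.escape(value, quote=True)

    def get(self, key: str) -> str:
        return self._values[key]

# Usage: only "GetJson" (a made-up name) is marked as trusted.
args = PlanArguments(trusted_functions={"GetJson"})
args.add_result("GetJson", "step1", '{"id": 1}')
args.add_result("EchoUserInput", "step2", '<b>"hi"</b>')

print(args.get("step1"))  # stored unchanged
print(args.get("step2"))  # stored HTML-encoded
```

An allowlist keyed by function keeps encoding as the default and makes the exception explicit, which seems closer in spirit to the per-variable “Allow dangerously set content” option from the blog post than a global switch would be.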

I’ve attached a simple C# program that reproduces the issue. I had to convert the .cs file to .txt to be able to attach it, but this one file should be enough to demonstrate the problem. Please feel free to reach out if you need any more clarification.

Program.txt

Thanks!