microsoft / terminal

The new Windows Terminal and the original Windows console host, all in the same place!
MIT License

[Terminal Chat] The LLM receives confusing information about shell #18142

Open vbrozik opened 3 weeks ago

vbrozik commented 3 weeks ago

Windows Terminal version

1.23.3061.0

Windows build number

10.0.22631.0

Other Software

Steps to reproduce

  1. Use the default Ubuntu distribution in WSL.
  2. Open a terminal for it.
  3. Start a Terminal Chat.
  4. After a few interactions, it may mention that it has information that I am using a shell named ubuntu.exe.
  5. It seems that a sequence of these two user messages leads to responses like the one below: "Hello", followed by "Please explain the error."

Image

Expected Behavior

The chatbot should not behave as if it received wrong information about the shell being used. It should be told either that I am using the Ubuntu distribution of Linux (and maybe its version) or that I am using the Bash shell in Ubuntu Linux.

Users should probably be allowed to modify parts of the initial prompt, including the information about the shell and operating system. This setting should be per profile.

Actual Behavior

The Terminal Chat probably receives confusing information about the shell in the terminal window in its preconfigured initial prompt.

When I am using the default Ubuntu WSL distribution with the default Bash shell, GitHub Copilot responds as if it had been told that I am using a shell named ubuntu.exe:

Image

With Ubuntu 24.04, GitHub Copilot is even more confused:

Image

zcobol commented 3 weeks ago

@vbrozik LLM hallucination 😀

vbrozik commented 3 weeks ago

@zcobol we can see the hallucination in the second case: ubuntu2404.exe vs "Ubuntu 20.04"

...but I think we can be almost sure that the LLM receives the name of the command used to start the WSL2 session, such as ubuntu.exe or ubuntu2404.exe, and that this name is wrongly presented to it as the shell running inside the terminal.

DHowett commented 3 weeks ago

This is one of the annoying limitations we've run into - Terminal actually cannot know what is running "behind" the single executable that it started (in this case, ubuntu.exe or ubuntu2404.exe). There's no interface by which this data is communicated, so it makes a "best guess" and inserts the name of the root process of the tab.

It's not ideal.
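
To illustrate the "best guess" described above, here is a minimal Python sketch. It is not Terminal's actual code: the function names and the prompt wording are hypothetical, and it simply assumes the context is derived from the executable name in the profile's command line.

```python
import ntpath


def root_process_name(commandline: str) -> str:
    """Return the executable file name from a profile command line.

    Hypothetical helper: takes the first token (honoring a leading quoted
    path) and strips any Windows directory prefix.
    """
    cmd = commandline.strip()
    if cmd.startswith('"'):
        # Quoted path, possibly containing spaces.
        end = cmd.find('"', 1)
        exe = cmd[1:end] if end != -1 else cmd[1:]
    else:
        exe = cmd.split()[0] if cmd else ""
    return ntpath.basename(exe)


def default_shell_context(commandline: str) -> str:
    """Build the (assumed) shell hint that gets added to the system prompt."""
    # The wording is an assumption; only the use of the root process name
    # comes from the discussion above.
    return f"The user is using the shell {root_process_name(commandline)}."
```

With a WSL profile whose command line is `ubuntu.exe`, this kind of guess labels the launcher as the shell, even though Bash is what actually runs inside the session.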

vbrozik commented 3 weeks ago

Yes, I expected that Terminal lacking the information was the cause. I think that presenting ubuntu.exe to LLMs as a shell could cause a bad experience for users. I really do not know what is optimal to present to LLMs, but I would certainly consider one or more of the following solutions:

PankajBhojwani commented 1 week ago

Those are some solid suggestions, thank you! We actually do have a work item for allowing the user to modify/add to the system prompt that we send to the LLM. Once that is implemented, we can remove the 'default' addition that we do right now entirely and just use the user-provided one, which should fix this :)
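
A rough Python sketch of how such a per-profile override could be resolved. Everything here is an assumption for illustration: the `chatSystemPrompt` key is not an existing Terminal setting, and the default wording is made up.

```python
# Hypothetical default, used only when the profile supplies no prompt.
DEFAULT_TEMPLATE = (
    "The user is working in a terminal whose root process is {root_process}."
)


def resolve_system_prompt(profile: dict, root_process: str) -> str:
    """Prefer the user's per-profile prompt; otherwise fall back to the default."""
    # "chatSystemPrompt" is an invented key name, not a real setting.
    user_prompt = profile.get("chatSystemPrompt")
    if isinstance(user_prompt, str) and user_prompt.strip():
        return user_prompt.strip()
    return DEFAULT_TEMPLATE.format(root_process=root_process)
```

For example, an Ubuntu profile could set the prompt to "The user is running Bash on Ubuntu 24.04 under WSL.", and the launcher name ubuntu2404.exe would then never reach the LLM.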