continuedev / continue

⏩ Continue is the leading open-source AI code assistant. You can connect any models and any context to build custom autocomplete and chat experiences inside VS Code and JetBrains
Apache License 2.0
18.99k stars 1.62k forks

Non-ASCII response from Gemini is partially broken in VScode extension #1492

Closed reosablo closed 4 months ago

reosablo commented 4 months ago

Before submitting your bug report

Relevant environment info

- OS: Windows 11 23H2
- Continue: v0.8.40
- IDE: VSCode 1.91.0-insider
- Model: gemini-1.5-pro-latest


A description of the bug

I'm encountering an issue where responses from Gemini with non-ASCII characters are garbled. This doesn't seem to happen with responses from Groq.

What you expected to happen

Non-ASCII character responses should be displayed correctly, without any garbled characters or replacement characters (like "�").

What actually happened

Currently, non-ASCII characters are being replaced with the replacement character "�". This happens consistently.

For example, the following response is affected:

zh-cn.ts���ァイルの一部ですね。 これは中国語の簡体字で������れ��コードで、音声操作に関するUIのテキストのようです。

日本語のメッセージにする場合、どのような文脈で表示されるかを考慮する必要があります。 例えば、以下のように状況を想定して、より自然で適切な日本語���を検討できます。

Screenshots or videos


Possible solutions

I suspect this is because the buffer is treated as a string rather than an ArrayBuffer. A chunk of a Gemini response may end in the middle of a multi-byte UTF-8 sequence, so converting each chunk to a string on its own could be causing the corruption.
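To illustrate the suspected cause, here is a minimal sketch that is not taken from the Continue codebase. It assumes each network chunk is decoded on its own; the helper `decodeChunksIndependently` and the split position are hypothetical and chosen only to show the effect.

```typescript
// "日本語" is 9 bytes in UTF-8 (3 bytes per character).
const bytes = new TextEncoder().encode("日本語");

// Simulate a network boundary that falls inside the second character.
const chunkA = bytes.slice(0, 4); // "日" plus the first byte of "本"
const chunkB = bytes.slice(4); // the remaining two bytes of "本", then "語"

// Hypothetical helper: decode every chunk with a fresh decoder, so no state
// about a dangling partial character survives from one chunk to the next.
function decodeChunksIndependently(chunks: Uint8Array[]): string {
  return chunks.map((chunk) => new TextDecoder("utf-8").decode(chunk)).join("");
}

// The bytes of the split character cannot be decoded in isolation, so the
// result contains U+FFFD replacement characters instead of "本".
const garbled = decodeChunksIndependently([chunkA, chunkB]);
console.log(garbled);
```

Only the character that straddles the boundary is lost, which matches the partial corruption seen in the example response above.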

Solution 1: Change the buffer from a string to an ArrayBuffer in the streamChatGemini function.

Solution 2: Use TextDecoderStream instead of TextDecoder in the streamResponse function:

yield* response.body.pipeThrough(new TextDecoderStream());
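For reference, the stateful decoding that TextDecoderStream applies to a stream can be sketched synchronously with `TextDecoder` and its `stream: true` option. The helper name `decodeChunksStreaming` and the sample split are assumptions made for illustration, not code from the repository.

```typescript
// Hypothetical helper: reuse ONE decoder for all chunks. With `stream: true`
// the decoder holds back an incomplete trailing byte sequence and completes
// it when the next chunk arrives.
function decodeChunksStreaming(chunks: Uint8Array[]): string {
  const decoder = new TextDecoder("utf-8");
  let text = "";
  for (const chunk of chunks) {
    text += decoder.decode(chunk, { stream: true });
  }
  // A final call without arguments flushes any bytes still buffered.
  text += decoder.decode();
  return text;
}

// Split "日本語" in the middle of the second character, as a network
// boundary might.
const encoded = new TextEncoder().encode("日本語");
const restored = decodeChunksStreaming([encoded.slice(0, 4), encoded.slice(4)]);
console.log(restored === "日本語");
```

Piping through `new TextDecoderStream()` gives the same behavior without managing the decoder by hand, which is why solution 2 is the smaller change.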

To reproduce

Ask some questions in the chat panel in Japanese.

Log output

No output during chat response.

reosablo commented 4 months ago

I tried TextDecoderStream and it seems to work fine.

I'll create a PR.

// core/llm/stream.ts
export async function* streamResponse(
  response: Response,
): AsyncGenerator<string> {
  if (response.status !== 200) {
    throw new Error(await response.text());
  }

  if (!response.body) {
    throw new Error("No response body returned.");
  }

  // `response` doesn't seem to be an instance of globalThis.Response and
  // TypeScript doesn't seem to know ReadableStream has `from` method.
  const stream = (ReadableStream as any).from(response.body);

  // The type of stream is any, not ReadableStream.
  // So we don't need "DOM.AsyncIterable" lib for this line.
  yield* stream.pipeThrough(new TextDecoderStream());
}

sestinj commented 4 months ago

Thanks for the PR @reosablo !