continuedev / continue

⏩ Continue is the leading open-source AI code assistant. You can connect any models and any context to build custom autocomplete and chat experiences inside VS Code and JetBrains
https://docs.continue.dev/
Apache License 2.0

Continue adding spaces to code #1241

Open JamesAllerlei opened 3 months ago

JamesAllerlei commented 3 months ago

Before submitting your bug report

Relevant environment info

- OS: macOS 12.7.3
- Continue: v0.8.25
- IDE: VS Code

Description

In the Continue chat window, all code and text contains extra random spaces that I have to remove manually before the code will run. For example, it will suggest code with extra spaces, and it writes spaces into my code when paraphrasing, e.g. "start row = 34 # Enter the star ting row", where extra spaces have been inserted in both the variable name and the comment. They appear in variable names, comments, file names, operators, anywhere and everywhere, roughly one per ten lines, and they break the code. Continue is otherwise working great, so it is worth using despite this limitation and the slowdown.

As an aside, is it possible to add a system prompt at the start of a chat, e.g. so the model knows basic context and preferences such as the IDE, programming language, and formatting preferences for code? It looks like I could set up a slash command, but it would be great if this were automated.
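If I remember the config schema correctly, Continue's config.json supports a `systemMessage` field on each model entry, which would do exactly this without a slash command. A minimal sketch (the message text and key placement are my assumption; check the docs for the current schema):

```json
{
  "models": [
    {
      "title": "Gemini 1.5 Pro",
      "provider": "gemini",
      "model": "gemini-1.5-pro-latest",
      "apiKey": "[redacted]",
      "systemMessage": "You are assisting in VS Code. Prefer Python, follow PEP 8, and keep answers concise."
    }
  ]
}
```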

To reproduce

The problem does not occur when using the built-in selection of free-trial models, but it is ever-present with any of the models I have added to the config file and use via API. I am mostly using Gemini 1.5 and Gemini 1.0, but have tested with Mistral and Llama too, with the same error. Here is another example (extra space before "_"):

```python
def log(text):
    with open('output _log.txt', "a") as f:
        f.write(text)
```

Log output

No recent errors in log console.
sestinj commented 3 months ago

@JamesAllerlei this was recently solved in the latest pre-release (0.9.x). It will be published to a main release (0.8.26) today.

JamesAllerlei commented 3 months ago

Awesome! Thanks. Continue rocks :)

Zhangtiande commented 3 months ago

(screenshot attached) It seems that this problem is still present.

sestinj commented 3 months ago

@Zhangtiande Can you share what your config.json looks like? I'm thinking this might be a different problem, more related to tab autocomplete.

coder543 commented 3 months ago

I'm seeing the exact same extra-space issue some of the time in the few minutes I've spent testing the extension in PyCharm, and the extra space before the suggestion is inserted if I tab-accept it, which breaks Python code.

It also has a habit of duplicating suggestions.

Extra space + duplicated suggestion:

(screenshot)

Just a duplicated suggestion:

(screenshot)

If I tab-accept, it only inserts one copy instead of both, but the visual bug is unpleasant.

I'm using Ollama to provide the suggestions, and the extra space occurs in both PyCharm and VS Code. The duplicate suggestion only occurs in PyCharm.

I should probably open a separate issue for the duplicate suggestion, but I figured I would mention it here since people were actively discussing the extra-space issue.
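Since the ghost text is only rendered twice in PyCharm, the duplication is likely a rendering bug, but for anyone curious, here is a minimal sketch (a hypothetical helper, not Continue's actual code) of how a client could trim a completion that merely restates the text already sitting after the cursor, so tab-accepting never duplicates code:

```python
def trim_duplicate_suffix(completion: str, text_after_cursor: str) -> str:
    """Drop the tail of `completion` that repeats what already follows
    the cursor, so accepting the suggestion doesn't duplicate code."""
    after = text_after_cursor.lstrip()
    if not after:
        return completion
    # Find the longest tail of the completion that matches the start of
    # the upcoming text, and cut the completion there.
    for length in range(min(len(completion), len(after)), 0, -1):
        if completion.endswith(after[:length]):
            return completion[: len(completion) - length]
    return completion
```

For example, with `)` already to the right of the cursor, a completion of `x + 1)` would be trimmed to `x + 1` before being shown.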

Zhangtiande commented 3 months ago

> @Zhangtiande Can you share what your config.json looks like? I'm thinking this might be a different problem, more related to tab autocomplete

This is my config:

```json
{
  "models": [
    {
      "model": "Qwen",
      "title": "Qwen-32B-Chat",
      "apiBase": "http://ip:9997/v1",
      "contextLength": 32768,
      "completionOptions": {
        "temperature": 0.8
      },
      "provider": "openai",
      "apiKey": "1111"
    },
    {
      "model": "codeqwen",
      "title": "codeqwen:7B",
      "apiBase": "http://ip:11434",
      "contextLength": 65536,
      "provider": "ollama"
    },
    {
      "model": "llama3",
      "title": "llama3:8B",
      "apiBase": "http://ip:11434",
      "contextLength": 8096,
      "provider": "ollama"
    }
  ],
  "contextProviders": [
    {
      "name": "code"
    },
    {
      "name": "tree"
    },
    {
      "name": "search"
    },
    {
      "name": "outline"
    },
    {
      "name": "diff",
      "params": {}
    },
    {
      "name": "open",
      "params": {}
    },
    {
      "name": "terminal",
      "params": {}
    },
    {
      "name": "problems",
      "params": {}
    },
    {
      "name": "docs",
      "params": {}
    }
  ],
  "slashCommands": [
    {
      "name": "edit",
      "description": "Edit highlighted code"
    }
  ],
  "allowAnonymousTelemetry": false,
  "tabAutocompleteModel": {
    "title": "Tab Autocomplete Model",
    "provider": "ollama",
    "model": "codeqwen",
    "apiBase": "http://ip:11434"
  },
  "embeddingsProvider": {
    "provider": "ollama",
    "model": "codeqwen",
    "apiBase": "http://ip:11434"
  },
  "disableIndexing": true
}
```
JamesAllerlei commented 3 months ago

I'm also getting the duplicate suggestions issue. Here is my config file, if it helps:

```json
{
  "models": [
    { "title": "GPT-4 Vision (Free Trial)", "provider": "free-trial", "model": "gpt-4-vision-preview" },
    { "title": "GPT-3.5-Turbo (Free Trial)", "provider": "free-trial", "model": "gpt-3.5-turbo" },
    { "title": "Gemini Pro (Free Trial)", "provider": "free-trial", "model": "gemini-pro" },
    { "title": "Codellama 70b (Free Trial)", "provider": "free-trial", "model": "codellama-70b" },
    { "title": "Mixtral (Free Trial)", "provider": "free-trial", "model": "mistral-8x7b" },
    { "title": "Claude 3 Sonnet (Free Trial)", "provider": "free-trial", "model": "claude-3-sonnet-20240229" },
    { "title": "Gemini Pro 1.5 Beta latest", "model": "gemini-1.5-pro-latest", "contextLength": 32000, "apiKey": "[redacted]", "provider": "google-palm" },
    { "title": "Gemini Pro 1.0", "model": "gemini-pro", "contextLength": 1000000, "apiKey": "[redacted]", "provider": "google-palm" },
    { "title": "Meta Codellama-70b 16k-100k $0.90/1Mtokens", "model": "codellama-70b", "apiKey": "[redacted]", "completionOptions": {}, "provider": "together" },
    { "title": "Meta CodeLlama-70b-Python-hf 16k-100k $0.90/1Mtokens", "model": "codellama/CodeLlama-70b-Python-hf", "apiKey": "[redacted]", "completionOptions": {}, "provider": "together" },
    { "title": "databricks/dbrx-instruct 32k $1.20/1M", "model": "databricks/dbrx-instruct", "apiKey": "[redacted]", "completionOptions": {}, "provider": "together" },
    { "title": "deepseek-coder-33b-instruct 16k_$0.80/1M", "model": "deepseek-ai/deepseek-coder-33b-instruct", "apiKey": "[redacted]", "completionOptions": {}, "provider": "together" },
    { "title": "Gemini Pro 1.0", "model": "gemini-pro", "contextLength": 32000, "apiKey": "[redacted]", "provider": "gemini" },
    { "title": "Gemini 1.5 Pro", "model": "gemini-1.5-pro-latest", "contextLength": 125000, "apiKey": "[redacted]", "provider": "gemini" }
  ],
  "slashCommands": [
    { "name": "edit", "description": "Edit selected code" },
    { "name": "comment", "description": "Write comments for the selected code" },
    { "name": "share", "description": "Export this session as markdown" },
    { "name": "cmd", "description": "Generate a shell command" }
  ],
  "customCommands": [
    {
      "name": "test",
      "prompt": "Write a comprehensive set of unit tests for the selected code. It should setup, run tests that check for correctness including important edge cases, and teardown. Ensure that the tests are complete and sophisticated. Give the tests just as chat output, don't edit any file.",
      "description": "Write unit tests for highlighted code"
    }
  ],
  "contextProviders": [
    { "name": "diff", "params": {} },
    { "name": "open", "params": {} },
    { "name": "terminal", "params": {} },
    { "name": "problems", "params": {} },
    { "name": "codebase", "params": {} },
    { "name": "code", "params": {} },
    { "name": "docs", "params": {} }
  ],
  "tabAutocompleteModel": { "title": "Tab Autocomplete", "provider": "free-trial", "model": "starcoder-7b" },
  "allowAnonymousTelemetry": true,
  "embeddingsProvider": { "provider": "free-trial" },
  "reranker": { "name": "free-trial" }
}
```

evertjr commented 3 months ago

Hi, just adding a note to this issue: the codeqwen model, when used for autocomplete, likes to prefix its completions with an empty space even when it's not needed.

Would it be possible to truncate the completion if the previous character is a space or a newline? I think that would prevent issues like this. It's clearly a model issue, since it doesn't happen with codegemma, but codeqwen is so good that it would be nice if it worked properly.
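The truncation described above could be as simple as the following sketch, which strips a model-added leading space only when the editor buffer already ends in whitespace (a hypothetical post-processing step, not Continue's actual implementation):

```python
def strip_redundant_leading_space(completion: str, text_before_cursor: str) -> str:
    """Remove a leading space from the completion when the text before
    the cursor already ends in whitespace, so accepting the suggestion
    doesn't insert a double gap (the bug discussed in this thread)."""
    if text_before_cursor and text_before_cursor[-1] in " \t\n":
        # Only strip spaces; a leading newline may be intentional.
        return completion.lstrip(" ")
    return completion
```

For example, with the buffer ending in `x = ` and the model returning `" value"`, the suggestion shown would be `value`; a completion after a non-whitespace character is left untouched.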

coder543 commented 3 months ago

I’ve been encountering the issue with codegemma, for what it’s worth.

juanpabloxk commented 2 months ago

Same problem here with the codeqwen:code local model. Is there any way to manipulate the model's output before showing the suggestion?

olaf-2 commented 2 months ago

Having the same problem with some models from Ollama (codeqwen:latest, deepseek-v2:latest, phi3:latest), but not with codegemma:7b-code.

PowerfulGhost commented 2 months ago

Same problem with codeqwen:7b. It seems like it's a model problem: https://huggingface.co/Qwen/CodeQwen1.5-7B-Chat/discussions/24

sestinj commented 2 months ago

I've added a filter for these leading spaces, which is now in the latest VS Code version but hasn't yet made its way to JetBrains. I'll be releasing that this week.

I'll look into the double-completion issue for that release as well.