continuedev / continue

⏩ Continue is the leading open-source AI code assistant. You can connect any models and any context to build custom autocomplete and chat experiences inside VS Code and JetBrains
https://docs.continue.dev/
Apache License 2.0
18.65k stars 1.56k forks

Jetbrains Integration Codebase Context Bug with FTS Database #2089

Open mphilippnv opened 2 months ago

mphilippnv commented 2 months ago

Relevant environment info

- OS:Ubuntu 22.04 LTS
- Continue: 0.0.62
- IDE: Pycharm 2024.2
- Model: Claude 3.5 Sonnet
- config.json:

{
  "systemMessage": "You are a senior software developer who specializes in Python 3.11 with pydantic v2. You adhere to SOLID and DRY principles. You specialize in unit testing and refactors. Your unit tests should always be done with pytest and prefer using classes for tests. You always use static typing. You focus on performance, maintainability, readability and security. You always use existing docstrings and write new ones in the sphinx format, if needed. You favor pep 8 python formatting and you write method parameters on new lines.",
  "contextProviders": [
    {
      "name": "code"
    },
    {
      "name": "search"
    },
    {
      "name": "codebase"
    },
    {
      "name": "folder"
    },
    {
      "name": "url"
    },
    { "name": "diff" }
  ],
  "models": [
    {
      "model": "claude-3-5-sonnet-20240620",
      "contextLength": 200000,
      "title": "Claude 3.5 Sonnet",
      "apiKey": "REDACTED",
      "provider": "anthropic"
    }
  ],
  "customCommands": [
    {
      "name": "test",
      "prompt": "{{{ input }}}\n\nWrite a comprehensive set of unit tests for the selected code. It should setup, run tests that check for correctness including important edge cases, and teardown. Ensure that the tests are complete and sophisticated. Prefer creating classes for the tests. Give the tests just as chat output, don't edit any file. Use pytest with classes.",
      "description": "Write unit tests for highlighted code"
    },
    {
      "name": "integration_test",
      "prompt": "{{{ input }}}\n\nWrite a comprehensive set of integration tests for the selected code. It should setup, run tests that check for correctness including important edge cases, and teardown. Ensure that the tests are complete and sophisticated. Prefer creating classes for the tests. Give the tests just as chat output, don't edit any file. Use pytest with classes.",
      "description": "Write unit tests for highlighted code"
    },
    {
      "name": "doc",
      "prompt": "Write a docstring for the current code in sphinx format.  Do not add new code.",
      "description": "Write the docstring"
    },
    {
      "name": "review",
      "prompt": "Review this code and suggestion improvements if needed.",
      "description": "Review code"
    },
    {
      "name": "refactor",
      "prompt": "{{{ input }}}\n\nRefactor the selected code to improve its readability, maintainability, and performance.  Improving coupling, cohesion and efficiency.",
      "description": "Refactor the selected code"
    },
    {
      "name": "refactor_for_test",
      "prompt": "{{{ input }}}\n\nRefactor the selected code to improve its unit testability. SOLID principles should be followed and docstrings must be maintained. Prefer using Protocol instead of abstract classes for interfaces. If no refactor is needed, please write 'No refactor needed'.",
      "description": "Refactor the selected code"
    }
  ],
  "embeddingsProvider": {
    "provider": "ollama",
    "model": "nomic-embed-text"
  },
  "allowAnonymousTelemetry": true
}

Description

Whenever I try to use @codebase in a prompt, I get the error "Error retrieving from FTS: Error: SQLITE_ERROR: no such table: fts". The notification in PyCharm shows "Error getting context items from codebase: TypeError: db.search is not a function".

I have rebuilt the index and still have the same issue. Another user on Discord also reported this occurring in WebStorm. This leads me to think it's an issue with JetBrains IDEs in general, not just PyCharm.

To reproduce

This assumes you already have PyCharm installed with the Continue plugin enabled.

  1. Open the Continue panel
  2. Make sure the project is indexed
  3. Write any prompt and include @codebase in the prompt. I recommend a prompt that refers to a file that has dependencies on other files.
  4. Submit the prompt
  5. Observe the error. The prompt still runs, but the automatic context isn't found.

Log output

[2024-08-23T18:12:33] Error retrieving from FTS: Error: SQLITE_ERROR: no such table: fts
[2024-08-23T18:13:58] Error retrieving from FTS: Error: SQLITE_ERROR: no such table: fts

talShtark commented 2 months ago

I can confirm that I get the same behavior (after updating to EAP version).

I tried removing the index folder and forcing a re-index, which seems to run successfully, but I still get "Error retrieving from FTS: Error: SQLITE_ERROR: no such table: fts" when trying to use @codebase.

bitbottrap commented 2 months ago

Me too.

- OS: Ubuntu 22.04
- Continue: 0.9.199
- IDE: VSCode 1.92.2
- Model: ollama based

Other: chat works, several context providers don't

console.ts:137 [Extension Host] Error retrieving from FTS: Error: SQLITE_ERROR: no such table: fts
--> in Database#all('SELECT fts_metadata.chunkId, fts_metadata.path, fts.content, rank\n' +
  ' FROM fts\n' +
  ' JOIN fts_metadata ON fts.rowid = fts_metadata.id\n' +
  ' JOIN chunk_tags ON fts_metadata.chunkId = chunk_tags.chunkId\n' +
  WHERE fts MATCH '"What" OR "does" OR "this" OR "project" OR "do"' AND chunk_tags.tag IN (?)\n +
  ' \n' +
  ' ORDER BY rank\n' +
  ' LIMIT ?', [ '/home/vscode/src::NONE::chunks', 13 ], [Function (anonymous)])
    at new Promise ()
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
y @ console.ts:137
$logExtensionHostMessage @ mainThreadConsole.ts:39
S @ rpcProtocol.ts:458
Q @ rpcProtocol.ts:443
M @ rpcProtocol.ts:373
L @ rpcProtocol.ts:299
(anonymous) @ rpcProtocol.ts:161
B @ event.ts:1230
fire @ event.ts:1261
fire @ ipc.net.ts:652
K.onmessage @ localProcessExtensionHost.ts:378

VSCode info:

- Version: 1.92.2
- Commit: fee1edb8d6d72a0ddff41e5f71a671c23ed924b9
- Date: 2024-08-14T17:29:30.058Z
- Electron: 30.1.2
- ElectronBuildId: 9870757
- Chromium: 124.0.6367.243
- Node.js: 20.14.0
- V8: 12.4.254.20-electron.0
- OS: Linux x64 6.8.0-40-generic
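
For readers trying to make sense of the MATCH clause in the logged query: it looks like the prompt was split into words, each word wrapped in double quotes, and the words joined with OR. Below is a minimal Python sketch of that shape, assuming that is roughly how the expression is assembled. The function name `build_match_expression` is hypothetical and is not taken from Continue's code.

```python
import re


def build_match_expression(prompt: str) -> str:
    """Quote each word of the prompt and join the words with OR.

    Hypothetical helper: it only mirrors the shape of the MATCH string seen
    in the log, not Continue's actual implementation.
    """
    words = re.findall(r"\w+", prompt)
    # Double quotes make the search treat each word as a literal string
    # rather than as query syntax.
    return " OR ".join(f'"{word}"' for word in words)


print(build_match_expression("What does this project do"))
```

For the prompt above this yields the same expression that appears in the log. The expression itself is fine; the query fails earlier because the fts table it targets does not exist.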

adkr commented 2 months ago

- IDE: IntelliJ IDEA 2024.2.0.2 (Ultimate Edition), Build #IU-242.20224.419, built on August 19, 2024
- Continue: 0.0.64
- OS: Win 11

Using the "codebase" context provider with either claude-3-5-sonnet-20240620 or gpt-4o-mini-2024-07-18, and a prompt like "@codebase describe structure",

causes this message in IntelliJ: Error getting context items from codebase: TypeError: db.search is not a function

and the following in the core.log file: Error retrieving from FTS: Error: SQLITE_ERROR: no such table: fts

As a result, I can see in the chat that the AI doesn't know what I am asking about.

Other context providers I've tested, like "folder" or "open", work without issues.

Configured EmbeddingsProvider is text-embedding-3-small

Moreover, not sure if related, right after start I can see in core.log:

Setup 
Core started 
Error loading config.ts:  Error [ERR_UNSUPPORTED_ESM_URL_SCHEME]: Only URLs with a scheme in: file, data are supported by the default ESM loader. On Windows, absolute paths must be valid file:// URLs. Received protocol 'c:'
Indexing: 0.0% complete, elapsed time: 0s, NaN file/sec

I've also tried an earlier version of Continue, with the same issues.

Cheers, and looking forward to the fix! At first glance the IJ plugin looks promising and is worth spending hours to configure. Keep going!

DigiDr commented 2 months ago

Same issue. Reported on discord.

adkr commented 2 months ago

I can't wait to see codebase context in action in IntelliJ, so I've spent the last few hours trying to spot the root cause.

In my opinion, the problem is with FullTextSearchCodebaseIndex and its createTables method: https://github.com/continuedev/continue/blob/ed9bbba81b68263c7fd1ae54e0f65c981eec958b/core/indexing/FullTextSearch.ts#L22

I suppose moving the creation of the fts tables into the constructor of FullTextSearchCodebaseIndex should resolve a race condition with the other places that read the tables, of which there are a few:

https://github.com/continuedev/continue/blob/ed9bbba81b68263c7fd1ae54e0f65c981eec958b/core/indexing/CodebaseIndexer.ts#L91
https://github.com/continuedev/continue/blob/ed9bbba81b68263c7fd1ae54e0f65c981eec958b/core/autocomplete/retrieval.ts#L13
https://github.com/continuedev/continue/blob/ed9bbba81b68263c7fd1ae54e0f65c981eec958b/core/context/retrieval/fullTextSearch.ts#L15

I can see index.sqlite in the Database View, and there are no fts* tables in it. What do you think?
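
For anyone without a database viewer at hand, the same check for missing fts* tables can be done with a few lines of Python. This is a minimal sketch using only the standard `sqlite3` module. The helper names `list_tables` and `has_fts_tables` are made up for illustration, and the location of index.sqlite on your machine is not assumed here, so pass in your own path (ideally to a copy of the file).

```python
import sqlite3


def list_tables(db_path: str) -> list[str]:
    """Return the names of all tables in the SQLite database at db_path."""
    connection = sqlite3.connect(db_path)
    try:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        connection.close()
    return [name for (name,) in rows]


def has_fts_tables(db_path: str) -> bool:
    """True if any table name starts with 'fts' (for example fts, fts_metadata)."""
    return any(name.startswith("fts") for name in list_tables(db_path))


# ":memory:" is only a stand-in so the sketch runs anywhere;
# replace it with the path to a copy of your own index.sqlite.
print(has_fts_tables(":memory:"))
```

Note that `sqlite3.connect` creates an empty database file if the path does not exist, so a mistyped path will also report no fts tables.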

wensimin commented 2 months ago

Same problem, any help?

mainakchhari commented 2 months ago

Came here to check on this same issue. I am getting the same bug on VS Code with the recommended config for "Best overall experience": https://docs.continue.dev/setup/configuration#best-overall-experience

I suppose the issue is not specific to the embeddings model used.

adkr commented 2 months ago

You are right. I've tested various embeddings. I believe it's a race condition: the fts* tables are read before they are created. The more important question to me is: why aren't they created at all? After a clean plugin installation, even after some time of using it, they are simply never created. It looks to me like a scenario where some exception happens, breaks further execution, and is not logged.
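
To illustrate the two ideas in this comment (create the schema before anything can read it, and never let a setup failure pass silently), here is a minimal Python sketch. It is not Continue's code: the class `FullTextIndexSketch` is hypothetical, and a plain table with a `LIKE` query stands in for the real full-text table so the example runs on any SQLite build.

```python
import logging
import sqlite3

logger = logging.getLogger("fts_sketch")


class FullTextIndexSketch:
    """Hypothetical stand-in for an index class; not Continue's implementation."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self.connection = connection
        # Create the schema up front so no reader can run before it exists.
        self._create_tables()

    def _create_tables(self) -> None:
        try:
            # A plain table stands in for the real full-text virtual table.
            self.connection.execute(
                "CREATE TABLE IF NOT EXISTS fts (path TEXT, content TEXT)"
            )
            self.connection.commit()
        except sqlite3.Error:
            # Log and re-raise so a failed schema setup is never silent.
            logger.exception("Failed to create fts table")
            raise

    def search(self, term: str) -> list[tuple[str, str]]:
        """Return (path, content) rows whose content contains term."""
        return self.connection.execute(
            "SELECT path, content FROM fts WHERE content LIKE ?",
            (f"%{term}%",),
        ).fetchall()


connection = sqlite3.connect(":memory:")

# Querying before any schema setup reproduces the reported error text.
try:
    connection.execute("SELECT * FROM fts")
except sqlite3.OperationalError as error:
    print(error)

# Once the constructor has run, the same table can be queried safely.
index = FullTextIndexSketch(connection)
print(index.search("anything"))
```

The first query fails with "no such table: fts", the same message as in the logs above. After the constructor has run, the search returns an empty result instead of raising.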

AlexanderZhk commented 2 months ago

Same, tried with openai embeddings. Worked initially, broke on the second generation

bitbottrap commented 1 month ago

Codebase is working again for me in 0.9.202 with VSCode.

sestinj commented 1 month ago

This typo here was causing the fts table not to be created https://github.com/continuedev/continue/commit/16c795ee52aa4379045e6e06dd858b076c595484

This fix is already released in VS Code 0.9.202, as noted above, and will soon be released in JetBrains!

Will-So commented 1 month ago

I've installed the EAP in Jetbrains and this solved my problem. Thanks!

mphilippnv commented 1 month ago

@Will-So Which EAP release did you install? I installed 0.0.68 and it's not recognizing any of my files when I try using the @ -> files option. This is PyCharm 2024.2.1.

Will-So commented 1 month ago

I'm also on 0.0.68 and 2024.2.1. Before updating, @codebase didn't work at all, but I could send entire files (e.g., @tox.ini).

Are you sure you have the same bug? (Error retrieving from FTS: Error: SQLITE_ERROR: no such table: fts)

mphilippnv commented 1 month ago

@Will-So No, I don't see that error anymore. The logs look fine and codebase retrieval seems to work. But I cannot use specific file context anymore. See the screenshot: it's always "Loading".

[screenshot]

Here's my config:

{
  "systemMessage": "You are a senior software developer who specializes in Python 3.11 with pydantic v2. You adhere to SOLID and DRY principles. You specialize in unit testing and refactors. Your unit tests should always be done with pytest and prefer using classes for tests. You always use static typing. You focus on performance, maintainability, readability and security. You always use existing docstrings and write new ones in the sphinx format, if needed. You favor pep 8 python formatting and you write method parameters on new lines.",
  "contextProviders": [
    {
      "name": "file"
    },
    {
      "name": "code"
    },
    {
      "name": "search"
    },
    {
      "name": "repo-map"
    },
    {
      "name": "codebase"
    },
    {
      "name": "folder"
    },
    {
      "name": "url"
    },
    { "name": "diff" },
    {
      "name": "database",
      "params": {
        "connections": [
          {
            "name": "cc_local",
            "connection_type": "mysql",
            "connection": {
              "user": "root",
              "host": "localhost",
              "database": "codecritic",
              "password": "",
              "port": 3306
            }
          }
        ]
      }
    }
  ],
  "models": [
    {
      "title": "gpt4o - LLM Router",
      "provider": "openai",
      "contextLength": 128000,
      "model": "gpt-4o",
      "apiBase": "http://localhost:8051/v1",
      "apiKey": "api-key",
      "completionOptions": {
        "maxTokens": 12000
      }
    },
    {
      "title": "deepseekv2.5 - LLM Router",
      "provider": "openai",
      "contextLength": 32768,
      "model": "deepseekv25",
      "apiBase": "http://localhost:8051/v1",
      "apiKey": "api-key",
      "completionOptions": {
        "maxTokens": 12000
      }
    },
    {
      "title": "Deepseek Coder 2 0724 - LLM Router",
      "provider": "openai",
      "contextLength": 32768,
      "model": "deepseek_coder_2_0724",
      "apiBase": "http://localhost:8051/v1",
      "apiKey": "api-key",
      "completionOptions": {
        "maxTokens": 12000
      }
    },
    {
      "title": "deepseekv2 - LLM Router",
      "provider": "openai",
      "contextLength": 32768,
      "model": "deepseekv2",
      "apiBase": "http://localhost:8051/v1",
      "apiKey": "api-key",
      "completionOptions": {
        "maxTokens": 12000
      }
    },
    {
      "title": "Llama 3.1 405b - LLM Router",
      "provider": "openai",
      "contextLength": 128000,
      "model": "llama31_405b",
      "apiBase": "http://localhost:8051/v1",
      "apiKey": "api-key",
      "completionOptions": {
        "maxTokens": 12000
      }
    },
    {
      "title": "Llama 3.1 70b - LLM Router",
      "provider": "openai",
      "contextLength": 128000,
      "model": "llama31_70b",
      "apiBase": "http://localhost:8051/v1",
      "apiKey": "api-key",
      "completionOptions": {
        "maxTokens": 12000
      }
    }
  ],
  "customCommands": [
    {
      "name": "test",
      "prompt": "{{{ input }}}\n\nWrite a comprehensive set of unit tests for the selected code. It should setup, run tests that check for correctness including important edge cases, and teardown. Ensure that the tests are complete and sophisticated. Prefer creating classes for the tests. Give the tests just as chat output, don't edit any file. Use pytest with classes.",
      "description": "Write unit tests for highlighted code"
    },
    {
      "name": "integration_test",
      "prompt": "{{{ input }}}\n\nWrite a comprehensive set of integration tests for the selected code. It should setup, run tests that check for correctness including important edge cases, and teardown. Ensure that the tests are complete and sophisticated. Prefer creating classes for the tests. Give the tests just as chat output, don't edit any file. Use pytest with classes.",
      "description": "Write unit tests for highlighted code"
    },
    {
      "name": "doc",
      "prompt": "Write a docstring for the current code in sphinx format.  Do not add new code.",
      "description": "Write the docstring"
    },
    {
      "name": "review",
      "prompt": "Review this code and suggestion improvements if needed.",
      "description": "Review code"
    },
    {
      "name": "refactor",
      "prompt": "{{{ input }}}\n\nRefactor the selected code to improve its readability, maintainability, and performance.  Improving coupling, cohesion and efficiency.",
      "description": "Refactor the selected code"
    },
    {
      "name": "refactor_for_test",
      "prompt": "{{{ input }}}\n\nRefactor the selected code to improve its unit testability. SOLID principles should be followed and docstrings must be maintained. Prefer using Protocol instead of abstract classes for interfaces. If no refactor is needed, please write 'No refactor needed'.",
      "description": "Refactor the selected code"
    }
  ],
  "embeddingsProvider": {
    "provider": "ollama",
    "model": "nomic-embed-text"
  },
  "allowAnonymousTelemetry": true
}