continuedev / continue

⏩ Continue is the leading open-source AI code assistant. You can connect any models and any context to build custom autocomplete and chat experiences inside VS Code and JetBrains.
https://docs.continue.dev/
Apache License 2.0

Error thrown when using @codebase in the prompt: Cannot read properties of undefined (reading 'sort') #1848

Open CallMeLaNN opened 1 month ago

CallMeLaNN commented 1 month ago

Before submitting your bug report

Relevant environment info

- OS: macOS 12.6
- Continue: v0.8.43
- IDE: VSCode 1.91.1 (Universal)
- Model: Any
- config.json:

  {
    "models": [...],
    ...,
    "embeddingsProvider": {
      ... // Any provider I tried: transformers.js, ollama, and openai (voyage)
    },
    "reranker": {
      "name": "voyage",
      "params": {
        "model": "rerank-1",
        "contextLength": 8000,
        "apiKey": "abcd"
      }
    },
    "contextProviders": [
      ...,
      {
        "name": "codebase",
        "params": {
          "nRetrieve": 25,
          "nFinal": 5,
          "useReranking": true
        }
      }
    ]
  }

Description

I'm evaluating different embeddings models and configs. I didn't have any embeddings issues before, but since yesterday I've been getting this error. I tried changing settings, trying a different provider, and switching back to the old one, but it no longer works.

I haven't included a specific embeddings config above because I tried every provider and none of them work; it was working before. Let me know if you still need an example of mine.
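
For illustration, the ollama variant was shaped roughly like this (the exact model name here is just an example, not necessarily the one I used):

  {
    "embeddingsProvider": {
      "provider": "ollama",
      "model": "nomic-embed-text"
    }
  }

The error appears regardless of which provider fills this block.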

To reproduce

  1. Update the embeddings config.
  2. In the Continue chat, click the green dot beside the model dropdown to force a reindex.
  3. Clear the Continue log in the VSCode OUTPUT tab.
  4. Start a Continue chat with "@codebase ".
  5. A VSCode notification and the DevTools console show the error above.
  6. Observe the Continue log: the relevant context is not added for the LLM.
  7. The LLM gives a general answer, as if it doesn't know about the code.
  8. Restart VSCode and repeat, trying different embeddings models, providers, and configs. Same result.

Log output

==========================================================================
==========================================================================
Settings:
contextLength: 4096
model: meta-llama/llama-3.1-8b-instruct:free
maxTokens: 1024
log: undefined

############################################

<user>
where is the relevant code and files that set the theming for this website?

==========================================================================
==========================================================================
Completion:

Unfortunately, I don't have the capability to browse the internet or access any specific website's code or files. However, I can give you some general information about where to look for theming-related code on a website.

Most modern websites use a combination of HTML, CSS, and JavaScript to display their content and apply their theme. Here are some places you might look to find the relevant code and files:

...

Keep in mind that the location and naming conventions may vary depending on the website's architecture and technology stack. To find the relevant code and files, you can use a combination of search engines, developer tools, and DNS reconnaissance techniques.

EDIT: Before this happened, I know my embeddings were working fine; I saw the log produce relevant context from my codebase.

CallMeLaNN commented 1 month ago

(screenshot of the error notification)

This is the notification.

I tried removing ~/.continue/index in case it got messed up, but even using codebase retrieval with a fresh index cache, the error still appears.

I tried a few projects, web and Python, in case it was project-related; still not working. However, it is fine in a new project with one file.


So maybe there's a project-related cache. I don't know of anywhere other than ~/.continue/index/lancedb/{project path}..., but I already cleared that for a fresh index.

Hopefully someone can tell me where else I can clear the cache.

fry69 commented 1 month ago

What happens if you remove the re-ranker definition from your config.json file?

CallMeLaNN commented 1 month ago

What happens if you remove the re-ranker definition from your config.json file?

You are right. I removed it and it's working fine in my project.

Is there any way to get a log or to check further? So far the embedding results contain some irrelevant info, and some LLMs can't answer correctly.

fry69 commented 1 month ago

See here for logs etc -> https://docs.continue.dev/troubleshooting#llm-prompt-logs

CallMeLaNN commented 1 month ago

That log doesn't include anything reranker-related. Continue only logs the i/o for the LLM; it doesn't log the i/o for the embeddings and reranker models before the request is sent to the LLM, as in my log above.

I mean, I can't use a reranker for my projects, and I'm not sure how to check further without any log.

... but wait, I get it now: the reranker returned an unexpected result. I'll wait a while in case I hit the rate limit.

It turns out I hit the TPM (tokens per minute) rate limit, which scales with the embedding chunk size times the codebase nRetrieve param. I can avoid the error by lowering nRetrieve. I'm not sure if I can adjust the chunk size.
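
For example (numbers illustrative, since I don't know the exact chunk size): with "nRetrieve": 25 and chunks of ~500 tokens each, a single @codebase query sends about 25 × 500 = 12,500 tokens to the reranker in one burst, which can easily blow past a low tokens-per-minute quota.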

So maybe an improvement can be made to catch and log the unexpected response. I never expected it was due to an invalid response or rate limiting.
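
For what it's worth, here is a minimal sketch of what that could look like. This is not Continue's actual implementation; the endpoint and response shape are assumed from Voyage's public rerank API, and the function name is made up:

  // Hypothetical sketch: validate the rerank API response before sorting,
  // and surface a useful error message on failure.
  interface RerankResult {
    index: number;
    relevance_score: number;
  }

  async function rerankSafely(
    apiKey: string,
    query: string,
    documents: string[],
  ): Promise<RerankResult[]> {
    // Endpoint and body shape assumed from Voyage's public rerank API docs.
    const resp = await fetch("https://api.voyageai.com/v1/rerank", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ query, documents, model: "rerank-1" }),
    });
    const body = await resp.json();

    // A 429 (rate limit) or other error payload has no `data` array, so an
    // unguarded `body.data.sort(...)` is exactly the kind of code that throws
    // "Cannot read properties of undefined (reading 'sort')".
    if (!resp.ok || !Array.isArray(body.data)) {
      throw new Error(
        `Reranker returned unexpected response (HTTP ${resp.status}): ` +
          JSON.stringify(body).slice(0, 500),
      );
    }

    // Sort descending by relevance, as a reranker consumer typically would.
    return body.data.sort(
      (a: RerankResult, b: RerankResult) => b.relevance_score - a.relevance_score,
    );
  }

Surfacing the HTTP status and raw body in the notification like this would have made the rate-limit cause obvious immediately.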