continuedev / continue

⏩ Continue is the leading open-source AI code assistant. You can connect any models and any context to build custom autocomplete and chat experiences inside VS Code and JetBrains
https://continue.dev/docs
Apache License 2.0
13.13k stars 899 forks

Indexing a Unity project results in thousands of "too many open file" errors. #1535

Open JohnSmithToYou opened 1 week ago

JohnSmithToYou commented 1 week ago

Relevant environment info

- OS: Windows 10 w/wsl2 running Ollama
- Computer: 64GB ram w/two 4090s
- Continue: v0.8.40
- IDE: VSCode 1.90.2

Description

Once Continue starts to index my Unity project, tens of thousands of error messages like this one appear:

console.ts:137 [Extension Host] Error reading file Unknown (FileSystemError) (FileSystemError): Error: EMFILE: too many open files, open 'c:\src\UnityTestBed\Library\PackageCache\com.unity.textmeshpro@3.0.6\Documentation~\TextMeshPro.md.meta'
    at P.e (c:\Users\someone\AppData\Local\Programs\Microsoft VS Code\resources\app\out\vs\workbench\api\node\extensionHostProcess.js:152:6515)
    at Object.readFile (c:\Users\someone\AppData\Local\Programs\Microsoft VS Code\resources\app\out\vs\workbench\api\node\extensionHostProcess.js:152:4465)
    at async _VsCodeIdeUtils.readFile (c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:366778:25)
    at async VsCodeIde.readFile (c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:367318:16)
    at async c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:78836:30
    at async Promise.all (index 27677)
    at async getAddRemoveForTag (c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:78834:8)
    at async getComputeDeleteAddRemove (c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:78938:41)
    at async CodebaseIndexer.refresh (c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:358185:45)
    at async Core.refreshCodebaseIndex (c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:358992:26)

After about 20K-30K errors I start getting this instead:

console.ts:137 [Extension Host] Failed to load parser for file c:\src\UnityTestBed\Assets\VRTemplateAssets\Scripts\XRKnob.cs: 
log.ts:439   ERR [Extension Host] Unable to load language for file c:\src\UnityTestBed\Assets\VRTemplateAssets\Scripts\XRPokeFollowAffordanceFill.cs RuntimeError: table index is out of bounds
    at wasm://wasm/000b54aa:wasm-function[237]:0x29e6a
    at _Parser.initialize (c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:39356:19)
    at new _Parser (c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:38279:16)
    at getParserForFile (c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:40132:20)
    at async codeChunker (c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:79467:18)
    at async chunkDocumentWithoutId (c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:79510:24)
    at async chunkDocument (c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:79521:20)
    at async _ChunkCodebaseIndex.update (c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:114028:28)
    at async CodebaseIndexer.refresh (c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:358192:47)
    at async Core.refreshCodebaseIndex (c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:358992:26)
console.ts:137 [Extension Host] Unable to load language for file c:\src\UnityTestBed\Assets\VRTemplateAssets\Scripts\XRPokeFollowAffordanceFill.cs RuntimeError: table index is out of bounds
    at wasm://wasm/000b54aa:wasm-function[237]:0x29e6a
    at _Parser.initialize (c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:39356:19)
    at new _Parser (c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:38279:16)
    at getParserForFile (c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:40132:20)
    at async codeChunker (c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:79467:18)
    at async chunkDocumentWithoutId (c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:79510:24)
    at async chunkDocument (c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:79521:20)
    at async _ChunkCodebaseIndex.update (c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:114028:28)
    at async CodebaseIndexer.refresh (c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:358192:47)
    at async Core.refreshCodebaseIndex (c:\Users\someone\.vscode\extensions\continue.continue-0.8.40-win32-x64\out\extension.js:358992:26)

Shortly afterwards, indexing grinds to a halt at 25%. My SSD is at 80% load, and Continue stops responding to anything. I've tried a different embeddingsProvider and rebuilt the index five times, but that didn't help.
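
The `Promise.all (index 27677)` frame in the first trace suggests that tens of thousands of file reads are started at once, which would be consistent with the EMFILE errors. I have not read the Continue source, so this is only a minimal sketch of the usual remedy, capping how many reads are in flight. `mapWithLimit` and the limit of 64 are hypothetical names and values, not anything from Continue:

```typescript
// Hypothetical helper (not Continue's code): run `worker` over `items`
// with at most `limit` calls in flight at any time.
async function mapWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;

  // Each lane pulls the next unclaimed index until the list is exhausted.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const i = next++; // no await between the check and the increment
      results[i] = await worker(items[i], i);
    }
  }

  const lanes = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: lanes }, () => lane()));
  return results; // same order as `items`
}

// Usage sketch, assuming Node's fs/promises rather than the IDE wrapper:
//   const contents = await mapWithLimit(paths, 64, (p) => readFile(p, "utf8"));
```

With a cap like this, the number of open file handles stays bounded no matter how many files the project contains, at the cost of somewhat slower indexing.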

Note: I created a .continueignore file hoping to reduce the number of files it loads (I tested it with git check-ignore), but Continue seems to ignore it! Bear in mind that even a tiny Unity project like mine typically has no fewer than 50K files!

sample.continueignore.txt

My config:

{
  "models": [
    {
      "title": "Codestral 22b",
      "provider": "ollama",
      "model": "codestral-22b-4k:latest"
    },
    {
      "title": "Clive/Codestral 22b",
      "provider": "ollama",
      "model": "codestral-22b-4k-clive:latest"
    }
  ],
  "customCommands": [
    {
      "name": "test",
      "prompt": "{{{ input }}}\n\nWrite a comprehensive set of unit tests for the selected code. It should setup, run tests that check for correctness including important edge cases, and teardown. Ensure that the tests are complete and sophisticated. Give the tests just as chat output, don't edit any file.",
      "description": "Write unit tests for highlighted code"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Codestral 22b",
    "provider": "ollama",
    "model": "codestral-22b-4k:latest"
  },
  "allowAnonymousTelemetry": false,

  "embeddingsProvider": {
    "provider": "transformers.js"
  }
}

To reproduce

1) Download a random Unity project from GitHub, like this one: https://github.com/SikPang/Unity_VampireSurvivors_Copy
2) Load it into VS Code + Continue
3) Observe

Note: Normally you need to install Unity in order to generate all of the extra files. Developers are supposed to strip out the generated folders using .gitignore, but the project I linked above forgot to do this, so it is a good snapshot of a typical Unity project under development. This means you don't need to install Unity to reproduce the problem; the project just won't compile. It will take a while to download.

Log output

See above.

JohnSmithToYou commented 1 week ago

I tested my .continueignore using ripgrep/ignore (the tool you are using, I think) and it processes my file correctly, so I think the problem is on your end.

One thing to keep in mind: traversing folders without using ripgrep/ignore is tricky, because ripgrep/ignore (and git) have special logic for negation patterns. Also, I see you're ignoring many types of file extensions. I suggest not bothering; there is no way to ever get that list right, since you can't know what kinds of projects will use your tool. Why not provide a default .ignore file that people can either use as-is or customize? It would clean up your code too.
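
To illustrate why negation patterns make a hand-rolled traversal tricky, here is a minimal sketch of the "last matching pattern wins" rule. It is a deliberately simplified, hypothetical matcher (`toRegExp` and `isIgnored` are made-up names): it only understands `*` within one path segment, a trailing `/` for directories, and a leading `!`, and it does not anchor patterns to the root the way git does:

```typescript
// Simplified, hypothetical matcher; real gitignore semantics are richer.
function toRegExp(pattern: string): RegExp {
  const dirOnly = pattern.endsWith("/");
  const body = (dirOnly ? pattern.slice(0, -1) : pattern)
    .replace(/[.+^${}()|[\]\\?]/g, "\\$&") // escape regex metacharacters
    .replace(/\*/g, "[^/]*"); // `*` never crosses a path separator
  // Match at any depth; a directory pattern also covers everything below it.
  return new RegExp(`(^|/)${body}${dirOnly ? "/" : "(/|$)"}`);
}

function isIgnored(path: string, patterns: readonly string[]): boolean {
  let ignored = false;
  for (const raw of patterns) {
    const negated = raw.startsWith("!");
    const pattern = negated ? raw.slice(1) : raw;
    // Later patterns override earlier ones, so order matters.
    if (toRegExp(pattern).test(path)) ignored = !negated;
  }
  return ignored;
}

const rules = ["Library/", "*.meta", "!Assets/Keep.meta"];
console.log(isIgnored("Library/PackageCache/readme.md", rules)); // true
console.log(isIgnored("Assets/Other.meta", rules)); // true
console.log(isIgnored("Assets/Keep.meta", rules)); // false
```

Swapping the last two rules makes `Assets/Keep.meta` ignored again. Git adds a further wrinkle: a file cannot be re-included if one of its parent directories is excluded, which is part of why reusing an existing implementation is safer than filtering by extension.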