nomic-ai / gpt4all

GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
https://nomic.ai/gpt4all
MIT License

Embedding process becomes slow when editing files or while embedding several LocalDocs folders at one time #3183

Open lauhub opened 1 week ago

lauhub commented 1 week ago

Bug Report

Embedding becomes slow after one of the files contained in a LocalDocs folder is modified.

Steps to Reproduce

  1. Embed a first local folder in LocalDocs (e.g. /home/user/a) that contains Markdown files
  2. Wait for the process to finish (this is actually fast)
  3. Embed a second local folder in LocalDocs (e.g. /home/user/b)
  4. Without waiting for step 3 to finish, edit one of the Markdown files in the /home/user/a subtree using an external editor
  5. The embedding process becomes very slow on both trees (/home/user/a and /home/user/b) and takes hours
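
To make the steps above easier to repeat, here is a minimal Python sketch that builds two throwaway folders of Markdown files and simulates the external edit from step 4. It is only a helper for preparing test data: the function names, the file names, and the default of 20 files per folder are assumptions made for illustration, and the folders still have to be added to LocalDocs by hand in the GPT4All window.

```python
from __future__ import annotations

import tempfile
import time
from pathlib import Path


def make_markdown_tree(root: Path, count: int = 20) -> list[Path]:
    """Create `count` small Markdown files under `root` and return their paths."""
    root.mkdir(parents=True, exist_ok=True)
    files = []
    for i in range(count):
        path = root / f"note_{i:03d}.md"
        path.write_text(
            f"# Note {i}\n\nSample text for embedding test {i}.\n",
            encoding="utf-8",
        )
        files.append(path)
    return files


def create_repro_folders(base: Path | None = None) -> tuple[Path, Path]:
    """Create folders 'a' and 'b' (stand-ins for /home/user/a and /home/user/b)."""
    if base is None:
        # Hypothetical location: a fresh temporary directory.
        base = Path(tempfile.mkdtemp(prefix="localdocs_repro_"))
    folder_a, folder_b = base / "a", base / "b"
    make_markdown_tree(folder_a)
    make_markdown_tree(folder_b)
    return folder_a, folder_b


def edit_file(path: Path) -> None:
    """Append a line to the file, similar to saving a change in an editor (step 4)."""
    with path.open("a", encoding="utf-8") as handle:
        handle.write(f"\nEdited at {time.time()}\n")
```

A possible way to use it: call `create_repro_folders()`, add the first returned folder to LocalDocs and wait for embedding to finish, add the second folder, then immediately call `edit_file()` on any file from the first folder.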

Expected Behavior

The embedding process after step 4 should be as fast as in step 2.

Your Environment

Additional notes

The same problem occurs when trying to embed 3 entries at the same time.

It seems that the chat process uses 100% of every CPU core during the embedding process. Quitting the application takes a long time (the "Force quit" option is shown for several minutes).
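
For anyone who wants to put numbers on the CPU observation, here is a rough Python sketch that computes per-core usage from two snapshots of `/proc/stat`. It assumes a Linux system (as the /home/user paths suggest) and reports system-wide load per core, not the load of the chat process alone. The function names are made up for this example.

```python
from __future__ import annotations


def parse_proc_stat(text: str) -> dict[str, tuple[int, int]]:
    """Map each per-core 'cpuN' line to (busy_ticks, total_ticks)."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip non-CPU lines and the aggregate 'cpu' line.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # First 8 counters: user nice system idle iowait irq softirq steal.
        values = [int(v) for v in fields[1:9]]
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        total = sum(values)
        result[fields[0]] = (total - idle, total)
    return result


def core_usage(before: str, after: str) -> dict[str, float]:
    """Return the busy percentage of each core between two /proc/stat snapshots."""
    first, second = parse_proc_stat(before), parse_proc_stat(after)
    usage = {}
    for core, (busy1, total1) in second.items():
        busy0, total0 = first.get(core, (0, 0))
        delta_total = total1 - total0
        usage[core] = 100.0 * (busy1 - busy0) / delta_total if delta_total > 0 else 0.0
    return usage
```

To use it, read `/proc/stat` once, wait a few seconds while embedding is running, read it again, and pass both strings to `core_usage()`. Values close to 100 on every core would match the behaviour described above.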

Removing one of the LocalDocs entries and then quitting the application (after waiting) seems to solve the problem the next time GPT4All is started, as long as a second LocalDocs entry is not added.

It seems that processing several LocalDocs entries at the same time has the same effect.