Closed: 3Simplex closed this 2 months ago
I second this.
The program starts searching the selected collections...
(I tried this with 4 collections, just to catch the fraction-of-a-second status message "searching in localdocs:...")
...but it immediately switches to the default "generating response..." and "processing..." messages without parsing the collections. They are mentioned at the beginning but never actually used.
I am able to reproduce this issue using a copy of some of 3Simplex's collections. It seems like the embeddings are missing for certain documents, due to the process getting interrupted somehow. These documents would have been re-indexed on every launch in previous versions of GPT4All because their modification timestamp did not match the database. Now they are only re-indexed the first time GPT4All v2.7.4 is started, and if that did not succeed then the collections will be broken until they are once again re-indexed (e.g. by changing the document snippet size) and it completes successfully.
We need to implement a way to know whether embeddings have been generated for a chunk so the program can continue where it left off.
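One way this could look, as a minimal sketch only: store each chunk's embedding in a nullable column, so a NULL value marks a chunk that still needs work and an interrupted run can resume from there. This is not GPT4All's actual schema or code; the `chunks` table, the function names, and `fake_embed` are all hypothetical and exist purely for illustration.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the (hypothetical) chunk database and create the table if needed."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS chunks ("
        " id INTEGER PRIMARY KEY,"
        " text TEXT NOT NULL,"
        " embedding BLOB)"  # NULL means "not embedded yet"
    )
    return db


def add_chunks(db, texts):
    """Insert new chunks with no embedding."""
    db.executemany("INSERT INTO chunks (text) VALUES (?)", [(t,) for t in texts])
    db.commit()


def pending_chunks(db):
    """Return (id, text) for every chunk that still lacks an embedding."""
    return db.execute(
        "SELECT id, text FROM chunks WHERE embedding IS NULL ORDER BY id"
    ).fetchall()


def embed_pending(db, embed, limit=None):
    """Embed pending chunks, committing after each one so progress survives
    an interruption. `limit` only exists here to simulate an interrupted run."""
    done = 0
    for chunk_id, text in pending_chunks(db):
        if limit is not None and done >= limit:
            break
        db.execute(
            "UPDATE chunks SET embedding = ? WHERE id = ?", (embed(text), chunk_id)
        )
        db.commit()
        done += 1
    return done


def fake_embed(text):
    """Stand-in for a real embedding model (assumption for the example)."""
    return len(text).to_bytes(4, "little")
```

With something like this, a launch would only need to call `embed_pending` again: chunks that were already embedded are skipped, and only the ones left over from the interrupted run are processed.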
I also did as 3Simplex suggested and changed the contents of a folder that is used as a collection. Here's what I did:
I did this with 3 distinct files in 3 distinct folders/categories. The result was the same: those collections were reindexed.
However, the issue of existing collections being reindexed is still here. Immediately after program start, I see several collections being indexed again, even though they were created before 2.7.3 (I can't remember whether it was 2.6.1 or a 2.7.x) and have stayed unchanged since then...
Edit :) cebtenzzre's explanation clarifies why this would happen. Indeed, a flag or something similar would be handy, like how Windows knows that it didn't shut down properly :)
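To illustrate the "flag" idea, here is a rough sketch of a per-collection dirty flag: it is set before indexing starts and cleared only if indexing finishes, so a crash or interruption leaves it set for the next launch to notice. The `collections` table and the `indexing` / `needs_reindex` helpers are hypothetical names made up for this example, not anything from GPT4All.

```python
import sqlite3
from contextlib import contextmanager


def open_db(path=":memory:"):
    """Open a (hypothetical) database that tracks one dirty flag per collection."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS collections ("
        " name TEXT PRIMARY KEY,"
        " dirty INTEGER NOT NULL DEFAULT 0)"
    )
    return db


@contextmanager
def indexing(db, name):
    """Mark a collection dirty while it is being indexed.

    The flag is cleared only when the body completes without an exception;
    if indexing is interrupted, the flag stays set.
    """
    db.execute(
        "INSERT OR REPLACE INTO collections (name, dirty) VALUES (?, 1)", (name,)
    )
    db.commit()
    yield
    # Reached only if the indexing body finished normally.
    db.execute("UPDATE collections SET dirty = 0 WHERE name = ?", (name,))
    db.commit()


def needs_reindex(db):
    """Return the names of collections whose last indexing run did not finish."""
    rows = db.execute(
        "SELECT name FROM collections WHERE dirty = 1 ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]
```

On startup, the program could call `needs_reindex` and re-run indexing only for the collections it returns, instead of relying on modification timestamps alone.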
Should be fixed as of #2396 (aside from #2591, which is a related but distinct issue).
Bug Report
Collections that existed before the update to 2.7.4 do not work after the update. Only collections created in 2.7.4 work.
Steps to Reproduce
Expected Behavior
All collections were expected to function as usual.
Your Environment