brianpetro / obsidian-smart-connections

Chat with your notes & see links to related content with AI embeddings. Use local models or 100+ via APIs like Claude, Gemini, ChatGPT & Llama 3
https://smartconnections.app
GNU General Public License v3.0

[ERROR with new embeddings model] #430

Open quelsemme opened 9 months ago

quelsemme commented 9 months ago

It appears to be hanging at 50/1650 smart_blocks and not creating embeddings at all...

After waiting for the process for ages, I checked the '.smart-connections' folder in my system explorer and found

smart_blocks-SmartEmbedOpenAIText3LargeApi.ajson (3866kb) and smart_notes.ajson (657kb)

But no embeddings JSON file. Previous embeddings-3.json was 45MB

Here is the console:

[Violation] 'click' handler took 2270ms
plugin:smart-connections:830 Saving: smart_notes.ajson
plugin:smart-connections:847 Saved smart_notes.ajson in 15ms
plugin:smart-connections:6936 no blocks to prune
plugin:smart-connections:6455 Error: Request failed, status 400
    at new t (app.js:1:1762600)
    at gq (app.js:1:1762792)
    at app.js:1:1763469
    at app.js:1:237258
    at Object.next (app.js:1:237363)
    at a (app.js:1:236081)
plugin:smart-connections:6809 error importing notes
plugin:smart-connections:6810 TypeError: Cannot read properties of null (reading 'usage')
    at SmartEmbedOpenAIText3LargeApi.embed_batch (plugin:smart-connections:6407:39)
    at async SmartBlocks.ensure_embeddings (plugin:smart-connections:6736:11)
    at async SmartNotes.import (plugin:smart-connections:6806:13)
plugin:smart-connections:830 Saving: smart_notes.ajson
plugin:smart-connections:830 Saving: smart_blocks-SmartEmbedOpenAIText3LargeApi.ajson
plugin:smart-connections:6714 Invalid block, skipping save:  {key: 'Excalidraw/Drawing 2023-04-08 21.51.34.excalidraw.md#Text Elements', path: 'Excalidraw/Drawing 2023-04-08 21.51.34.excalidraw.md#Text Elements', embedding: {…}, text: '', hash: 'd41d8cd98f00b204e9800998ecf8427e', …}
plugin:smart-connections:6660 [Violation] 'setTimeout' handler took 184ms
plugin:smart-connections:847 Saved smart_notes.ajson in 201ms
plugin:smart-connections:847 Saved smart_blocks-SmartEmbedOpenAIText3LargeApi.ajson in 211ms
plugin:smart-connections:7673 View inactive, skipping render nearest
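
The trace above shows `embed_batch` reading `usage` from a response that was null after the request failed with status 400. A minimal sketch of a guard follows, assuming the embedding response has the OpenAI shape (`data[].embedding`, `usage.total_tokens`) and that a failed request surfaces as null. `parse_embedding_response` is a hypothetical helper, not a function from the plugin.

```javascript
// Hypothetical helper, not taken from the plugin source.
// Assumes an OpenAI-style embeddings response:
//   { data: [{ embedding: number[] }, ...], usage: { total_tokens: number } }
// and that a failed request (for example, status 400) surfaces as null.
function parse_embedding_response(response) {
  // Check the response before reading `usage`, which is where the trace failed.
  if (!response || !response.usage || !Array.isArray(response.data)) {
    return { ok: false, embeddings: [], tokens: 0 };
  }
  return {
    ok: true,
    embeddings: response.data.map((item) => item.embedding),
    tokens: response.usage.total_tokens,
  };
}

// A failed request no longer throws; the caller can skip or retry the batch.
const failed = parse_embedding_response(null);
console.log(failed.ok); // false

const succeeded = parse_embedding_response({
  data: [{ embedding: [0.1, 0.2] }],
  usage: { total_tokens: 3 },
});
console.log(succeeded.embeddings.length); // 1
```

Returning an explicit `ok` flag lets the caller decide whether to stop the import or continue with the next batch, instead of failing on a property access.
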
quelsemme commented 9 months ago

Okay, so not a problem anymore. I rebooted the plugin (strange that rebooting Obsidian didn't work but turning the plugin off and on did) and force_refreshed (though not for the first time either).

Got this and my new smart_blocks-SmartEmbedOpenAIText3LargeApi.json is 63597kb which is much more like it... I thought I'd leave this instead of deleting the post, in case the error points you to a bug and not an anomaly.

closing smart connections view
VM285 plugin:smart-connections:7172 unloading plugin
VM285 plugin:smart-connections:7660 closing smart connections view
plugin:smart-connections:7178 Loading Smart Connections v2...
plugin:smart-connections:7203 e {appMenuBarManager: e, hotkeyManager: e, customCss: t, embedRegistry: t, viewRegistry: t, …}
plugin:smart-connections:787 Loading: .smart-connections/smart_notes.ajson
plugin:smart-connections:787 Loading: .smart-connections/smart_blocks-SmartEmbedOpenAIText3LargeApi.ajson
plugin:smart-connections:793 Loaded: smart_notes.ajson
plugin:smart-connections:793 Loaded: smart_blocks-SmartEmbedOpenAIText3LargeApi.ajson
plugin:smart-connections:6617 1649
app.js:1 [Violation] 'setTimeout' handler took 52ms
plugin:smart-connections:7230 v2 updater
plugin:smart-connections:7242 2.0.71 2.0.71
plugin:smart-connections:7244 Already up to date
app.js:1 [Violation] 'click' handler took 2035ms
plugin:smart-connections:830 Saving: smart_notes.ajson
plugin:smart-connections:847 Saved smart_notes.ajson in 28ms
plugin:smart-connections:6936 no blocks to prune
plugin:smart-connections:830 Saving: smart_blocks-SmartEmbedOpenAIText3LargeApi.ajson
plugin:smart-connections:847 Saved smart_blocks-SmartEmbedOpenAIText3LargeApi.ajson in 1278ms
plugin:smart-connections:830 Saving: smart_blocks-SmartEmbedOpenAIText3LargeApi.ajson
plugin:smart-connections:847 Saved smart_blocks-SmartEmbedOpenAIText3LargeApi.ajson in 2548ms
plugin:smart-connections:830 Saving: smart_blocks-SmartEmbedOpenAIText3LargeApi.ajson
plugin:smart-connections:830 Saving: smart_notes.ajson
plugin:smart-connections:824 Already saving: smart_blocks-SmartEmbedOpenAIText3LargeApi.ajson
plugin:smart-connections:847 Saved smart_blocks-SmartEmbedOpenAIText3LargeApi.ajson in 3928ms
plugin:smart-connections:847 Saved smart_notes.ajson in 541ms
quelsemme commented 9 months ago

Whoops. False solution...

I asked the chat a question and it didn't know anything about the topic...

Context was as follows:

Anticipate the type of answer desired by the user.Imagine the following 20 notes were written by the user and contain all the necessary information to answer the user's question.Begin responses with "Based on your notes..."
---BEGIN #1---

---END #1---
---BEGIN #2---

---END #2---
---BEGIN #3---

---END #3---
---BEGIN #4---

---END #4---
---BEGIN #5---

---END #5---
---BEGIN #6---

---END #6---
---BEGIN #7---

---END #7---
---BEGIN #8---

---END #8---
---BEGIN #9---

---END #9---
---BEGIN #10---

---END #10---
---BEGIN #11---

---END #11---
---BEGIN #12---

---END #12---
---BEGIN #13---

---END #13---
---BEGIN #14---

---END #14---
---BEGIN #15---

---END #15---
---BEGIN #16---

---END #16---
---BEGIN #17---

---END #17---
---BEGIN #18---

---END #18---
---BEGIN #19---

---END #19---
---BEGIN #20---

---END #20---

quelsemme commented 9 months ago

Additional response:

Based on your notes, which are currently empty, I cannot provide specific insights into your research into theatre adaptation. If you could provide details or findings from your research, I would be able to offer a more tailored response regarding themes, methodologies, key findings, or any particular challenges or successes you've encountered in your study of theatre adaptation.

brianpetro commented 9 months ago

@quelsemme thanks for the report, I'll be looking into this shortly!

🌴

brianpetro commented 9 months ago

@quelsemme

This is a weird issue.

I observed it happening once, but I haven't been able to reproduce it reliably.

If you have any recommendations on reproducing it, let me know. Otherwise, I'll watch for what might be causing it as I work through the code.

🌴

quelsemme commented 9 months ago

First attempt this morning got this error in the console

plugin:smart-connections:6419
Uncaught (in promise) TypeError: Cannot read properties of undefined (reading 'length')
    at SmartEmbedOpenAIText3LargeApi.request_embedding (plugin:smart-connections:6419:25)
    at SmartEmbedOpenAIText3LargeApi.embed (plugin:smart-connections:6400:37)
    at SmartSearchApi.search (plugin:smart-connections:7846:57)
    at SmartConnectionsChatView.get_context_hyde (plugin:smart-connections:8525:41)
    at async SmartConnectionsChatView.initialize_response (plugin:smart-connections:8213:23)

Attempt 1: Changed the chat bot to GPT-4 Turbo and got the same response as above: 'no information in your notes about...'

Attempt 2: Changing to GPT-3.5 was a little more fruitful, but only one of the notes it picked was related to the topic I mentioned.

Attempt 3: I put quotation marks around the term (someone's name) and it pulled more relevant information...

Attempt 4: Changed back to GPT-4 Turbo and asked a very specific question again (name removed for privacy):

Based on my notes, tell me everything you know about playwright "________________"

I have a note named after this playwright, plus at least 3 other notes which refer to him as a playwright. Got no response. Here is the prompt context.

```prompt-context
Anticipate the type of answer desired by the user.Imagine the following 20 notes were written by the user and contain all the necessary information to answer the user's question. Begin responses with "Based on your notes..."
---BEGIN #1---

---END #1---
---BEGIN #2---

---END #2---
---BEGIN #3---

---END #3---
---BEGIN #4---

---END #4---
---BEGIN #5---

---END #5---
---BEGIN #6---

---END #6---
---BEGIN #7---

---END #7---
---BEGIN #8---

---END #8---
---BEGIN #9---

---END #9---
---BEGIN #10---

---END #10---
---BEGIN #11---

---END #11---
---BEGIN #12---

---END #12---
---BEGIN #13---

---END #13---
---BEGIN #14---

---END #14---
---BEGIN #15---

---END #15---
---BEGIN #16---

---END #16---
---BEGIN #17---

---END #17---
---BEGIN #18---

---END #18---
---BEGIN #19---

---END #19---
---BEGIN #20---

---END #20---
```

Attempt 5: Back to GPT-4 (8k). I get the results I would expect. The notes being fed to GPT aren't only about the specific person, but I assume that's because it's a precise topic and the search is being overzealous, which is not a bad thing.

Attempt 6: GPT-3.5 Turbo (16k). Same results as above.

Attempt 7: GPT-4 Turbo (128k). No available information to respond to.

Attempt 8: GPT-4 Turbo (128k)

Prompt:

Based on my notes, tell me about my area of study

Response:

Based on your notes, it appears there is no information provided about your area of study. To assist you accurately, could you please provide more details or clarify your area of interest?

The reason is the same as in the trials above: Smart Connections hasn't sent any notes to the GPT.

Attempt 9: GPT-4 (8k). Same prompt as above. Results as expected: Smart Connections sent appropriate notes and received an appropriate reply.

At this point, my assumption is twofold: 1) something is going wrong where it sends 0 notes to the GPT, and 2) GPT-4 Turbo (128k) exacerbates this issue.
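
One way to confirm whether zero notes were sent is to count the non-empty blocks in the copied prompt context. The following is a rough sketch that assumes the context uses the `---BEGIN #n---` / `---END #n---` delimiters shown in this thread; `count_context_notes` is a hypothetical name, not part of the plugin.

```javascript
// Hypothetical check, not part of the plugin. It assumes the context uses the
// "---BEGIN #n---" / "---END #n---" delimiters shown in this thread.
function count_context_notes(context) {
  // The backreference \1 pairs each BEGIN marker with the END marker
  // that carries the same number.
  const pattern = /---BEGIN #(\d+)---([\s\S]*?)---END #\1---/g;
  let total = 0;
  let non_empty = 0;
  for (const match of context.matchAll(pattern)) {
    total += 1;
    // A block counts as empty when it holds only whitespace.
    if (match[2].trim().length > 0) non_empty += 1;
  }
  return { total, non_empty };
}

const sample = [
  "---BEGIN #1---", "", "---END #1---",
  "---BEGIN #2---", "Some note text", "---END #2---",
].join("\n");
console.log(count_context_notes(sample)); // { total: 2, non_empty: 1 }
```

A result with `non_empty` equal to 0 would match the empty contexts pasted above, where all 20 blocks had no content.
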

Temporary solution

Over-The-Edge commented 9 months ago

@brianpetro from my other topic, as requested: "Sorry I have been out. I haven't done the force refresh, as I found it worked with 8k and 16k. Interesting with GPT4-128k: when it produces a response, the response says 'unable to find notes', but when I use the preview option to copy the context, the notes are in the context. Somehow the context is there, but it is not being used with the prompt on the 128k version, while it is with the others. (This is all with the large embedding model.)"

Over-The-Edge commented 9 months ago

@brianpetro Now Smart Chat no longer works with any embedding model or GPT model version. The process starts and then I get "error in embedding search", even after refreshing notes, closing and reopening Obsidian, etc. Do I need to do a force refresh? I did not have the issue others had of it hanging when processing embeddings.

alex-astronomer commented 9 months ago

I also cannot use Smart Chat any more. Same issue. Force refresh didn't do the trick.

EDIT: on v1.6 it works fine with GPT-4!

v2.0.79

Error from console:

Uncaught (in promise) TypeError: Reduce of empty array with no initial value
    at Array.reduce (<anonymous>)
    at SmartConnectionsChatView.get_nearest_until_next_dev_exceeds_std_dev (plugin:smart-connections:8663:23)
    at SmartConnectionsChatView.get_context_hyde (plugin:smart-connections:8644:20)
    at async SmartConnectionsChatView.initialize_response (plugin:smart-connections:8330:23)
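
For context, `Array.prototype.reduce` throws this TypeError when it is called on an empty array without an initial value, which is what happens if the search returns no results. Below is a minimal sketch of the usual guard. `mean_similarity` and the `{ key, sim }` result shape are assumptions for illustration, not the plugin's implementation of `get_nearest_until_next_dev_exceeds_std_dev`.

```javascript
// Hypothetical helper, not the plugin's implementation.
// `results` is assumed to be an array of { key, sim } search hits.
function mean_similarity(results) {
  // Handle "no results" explicitly instead of letting reduce throw.
  if (!Array.isArray(results) || results.length === 0) return null;
  // Passing 0 as the initial value also keeps reduce safe on its own.
  const total = results.reduce((sum, result) => sum + result.sim, 0);
  return total / results.length;
}

console.log(mean_similarity([])); // null
console.log(mean_similarity([{ key: "a", sim: 0.5 }, { key: "b", sim: 1 }])); // 0.75
```
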
blakejwc commented 9 months ago

Same, I added some log lines to double-check that search was returning no results.

image

quelsemme commented 9 months ago

Same issue here.

plugin:smart-connections:7952 TypeError: vector1.reduce is not a function
    at cos_sim (plugin:smart-connections:5890:34)
    at Object.entries.reduce.min (plugin:smart-connections:6972:23)
    at Array.reduce (<anonymous>)
    at SmartBlocks.nearest (plugin:smart-connections:6964:52)
    at SmartSearchApi.search (plugin:smart-connections:7949:43)
    at SmartConnectionsChatView.get_context_hyde (plugin:smart-connections:8642:41)
    at async SmartConnectionsChatView.initialize_response (plugin:smart-connections:8330:23)
plugin:smart-connections:8663 Uncaught (in promise) TypeError: Reduce of empty array with no initial value
    at Array.reduce (<anonymous>)
    at SmartConnectionsChatView.get_nearest_until_next_dev_exceeds_std_dev (plugin:smart-connections:8663:23)
    at SmartConnectionsChatView.get_context_hyde (plugin:smart-connections:8644:20)
    at async SmartConnectionsChatView.initialize_response (plugin:smart-connections:8330:23)
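
The first error, `vector1.reduce is not a function`, means the value passed to `cos_sim` was not an array. When every comparison fails that way, the result list ends up empty and the second error follows. This is a sketch of a defensive cosine similarity, assuming each item stores its vector under a `vec` property; that property name and the `nearest` signature are hypothetical, not copied from the plugin.

```javascript
// Hypothetical sketch, not the plugin's code.
// Returns null instead of throwing when either input is not a usable vector.
function cos_sim(vector1, vector2) {
  if (!Array.isArray(vector1) || !Array.isArray(vector2)) return null;
  if (vector1.length === 0 || vector1.length !== vector2.length) return null;
  let dot = 0;
  let norm1 = 0;
  let norm2 = 0;
  for (let i = 0; i < vector1.length; i++) {
    dot += vector1[i] * vector2[i];
    norm1 += vector1[i] * vector1[i];
    norm2 += vector2[i] * vector2[i];
  }
  // A zero vector has no direction, so similarity is undefined.
  if (norm1 === 0 || norm2 === 0) return null;
  return dot / (Math.sqrt(norm1) * Math.sqrt(norm2));
}

// `items` is assumed to be an array of { key, vec } entries.
function nearest(query_vec, items, limit = 20) {
  return items
    .map((item) => ({ key: item.key, sim: cos_sim(query_vec, item.vec) }))
    .filter((result) => result.sim !== null) // skip entries without a valid vector
    .sort((a, b) => b.sim - a.sim)
    .slice(0, limit);
}

console.log(cos_sim([1, 0], [1, 0])); // 1
console.log(cos_sim({}, [1, 0])); // null
```

Skipping invalid entries keeps one malformed embedding from breaking the whole search, at the cost of silently ignoring it, so logging the skipped keys may be worthwhile.
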
Over-The-Edge commented 9 months ago

@brianpetro I see that you just pushed an update, thanks. After the upgrade, it was stuck on embedding 1060/1063. Restarted Obsidian; still stuck. Then I went to the plugin settings and saw there are now two embedding options, one for text and one for blocks. Changed these from blank to the Large OpenAI model. Processed 1096 embeddings successfully. I then retried my prompt that worked a few days ago, before these upgrades, and it still hangs using ChatGPT 3.5 16K. It says "error in embedding search".

brianpetro commented 9 months ago

@quelsemme @Over-The-Edge thanks for being so on top of things!

I see the issue, and I'm working on a fix now!

🌴

brianpetro commented 9 months ago

@blakejwc @alex-astronomer @quelsemme @Over-The-Edge

This issue that was introduced today should be fixed in v2.0.83.

I'm going to leave this issue open because I'm still unsure about the "empty notes" issue, as shown by @quelsemme in https://github.com/brianpetro/obsidian-smart-connections/issues/430#issuecomment-1913396381

Sorry for the inconvenience, everyone!

🌴

brianpetro commented 9 months ago

And thank you for bringing this to my attention so quickly, it's a huge help 🙏

🌴

Over-The-Edge commented 9 months ago

Thanks Brian. I’m travelling for the afternoon but will test tonight and let you know. Thanks for great service.

Michael Boyens

Over-The-Edge commented 9 months ago

@brianpetro After the latest update, my prompt now works with the Large Embedding model for blocks and text, and with 3.5-16K and 4-8k. BUT 128k generates a response that says words to the effect of "can not find any notes", although when I copy the context after the generation, the notes are in the context??

Over-The-Edge commented 9 months ago

@brianpetro I have updated to your .90 release. All versions of GPT generate a response. GPT4-128k still does NOT use the notes context in the generation and says "there are no notes", but copying and viewing the prompt context shows it has context. BUT the bigger issue is that it is NOT respecting folder exclusions, and now all models are extracting notes from excluded folders: the response is "Elevated Word/Trusted AI/TWF" etc. But Elevated Word is the only folder it was previously retrieving from, and "Trusted AI" etc. is excluded. Any help appreciated.

Over-The-Edge commented 9 months ago

@brianpetro Also synced notes to Smart Connections GPT. Now it finds all?? notes: 1076 (versus 1096 when the embedding process ran). But now I'm having the same issue in that it can't find notes on, say, the theme of "boldness", when previously it could. It can find a specific named note though, so it looks like search is broken. Any help appreciated.

brianpetro commented 9 months ago

@Over-The-Edge version 2.0.92 incoming!

Re: file/folder exclusions

I know how important privacy is to the Obsidian community, so I immediately began reviewing what could have caused a possible slip in the exclusion system. The good news is that I only found something that would right itself after a little bit of time. In some circumstances, the exclusions weren't updated until the next restart. That's fixed now.

Now, after adding exclusions, you can use the "Prune Embeddings" (removes only) or "Refresh Notes" (removes and creates new embeddings if applicable) to immediately bring your embeddings up-to-date with exclusions. Similarly, syncing your notes to ChatGPT will also immediately work as expected.

Also, I ensured both the ChatGPT sync and the embedding process used the same method. Furthermore, I wrote some unit tests to ensure the method is working as expected.
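
To illustrate the idea of a single shared exclusion method, here is a rough sketch. It assumes exclusions are entered as a comma-separated string and matched as folder prefixes of the note path. All function names here are hypothetical, and the plugin's real matching rules may differ.

```javascript
// Hypothetical sketch, not the plugin's code. It assumes exclusions are entered
// as a comma-separated string and matched as folder prefixes of the note path.
function parse_exclusions(setting) {
  return setting
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
}

function is_excluded(path, exclusions) {
  return exclusions.some(
    (folder) => path === folder || path.startsWith(folder + "/")
  );
}

// Both consumers call the same predicate, so they cannot drift apart.
function paths_to_embed(paths, exclusions) {
  return paths.filter((path) => !is_excluded(path, exclusions));
}

function prune_embeddings(embeddings, exclusions) {
  // `embeddings` is assumed to be a plain object keyed by note path.
  return Object.fromEntries(
    Object.entries(embeddings).filter(([path]) => !is_excluded(path, exclusions))
  );
}

const exclusions = parse_exclusions("Trusted AI, TWF");
console.log(paths_to_embed(["Elevated Word/a.md", "Trusted AI/b.md"], exclusions));
// [ 'Elevated Word/a.md' ]
```

Re-reading the setting each time the predicate is used, rather than caching it at startup, is one way to avoid exclusions only taking effect after a restart.
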

Finally, one last thing on the exclusions: While I know the current text input isn't ideal, and I plan on improving it in the future, I added a bit of visibility to the exclusions on the settings page (see image).

Screenshot 2024-01-30 at 9 08 57 PM


Re: unexpected search results

Are you using the same Ada model as in v1? Different models will behave differently.

Also, while I know your query isn't explicitly a keyword query, you may want to see this comment from earlier today, as I think the reasoning and future outlook apply to your query as well.

🌴

Over-The-Edge commented 9 months ago

@brianpetro I have been running some more tests with Large Embedding Model for both blocks and text. Prompt is:

Boldness

“to boldly go where no man has gone before.”

Use the above or identify and use relevant themes to search my notes for content .

Format

Sources [Title of each note]

All GPT models work, including GPT4-8k, i.e. they generate 5 note titles with an extract that aligns with Boldness.

EXCEPT GPT4-128k, which says "I could find no notes on that topic etc".

When I run the same prompt on ChatGPT using Smart Connections, it returns one note and says it tried with other themes but couldn't find anything? I'll now retest with the original ADA that was working previously with GPT4-128k and Smart GPT and advise in my next comment.

Over-The-Edge commented 9 months ago

@brianpetro Well, that didn't go according to plan. I switched back to ADA for Text and then switched back to ADA for Blocks. On the second switch, a popup asked "do you want to delete 1062 embeddings", with OK or delete; I clicked delete. Then I clicked refresh notes, and it asked "do you want to create embeddings"; I clicked yes. Usually it then shows me the progress of creating embeddings, but nothing. Then I closed Obsidian and re-opened it, and the popup appeared again, "do you want to delete 1062 embeddings"; I said yes. Then refreshed notes, "do you want to create", yes, but again no progress. Then closed/reopened: nothing. So I did a force refresh, continue, yes; again no popup, and now no embeddings, it seems? Changed to OpenAI Small 8191,1536 and attempted to reprocess, but start embeddings does nothing. I'll keep testing and advise.

Over-The-Edge commented 9 months ago

@brianpetro Moving to other embedding models makes no difference. It says start embedding but doesn't actually seem to do anything. I'll leave it with you now.

oscaromsn commented 9 months ago

I'm facing the same problem as @Over-The-Edge with the pop-up asking whether to delete embeddings. However, here, the embedding process notification on the top right side appears, but after that, every time I open a note, the message 'No embeddings found for [my note name].' appears. After clicking the 'Start Embedding' button, nothing happens. I've already tried different models, but the problem doesn't seem to be related to the model.

brianpetro commented 9 months ago

@oscaromsn @Over-The-Edge that bug was fixed in 2.0.93.

Thanks for bringing this to my attention!

@Over-The-Edge I'll be following up about the 128K GPT-4 model in a little bit.

Thanks for the updates!

🌴

Over-The-Edge commented 9 months ago

@brianpetro Everything now works again with the large embedding model thanks. Look forward to your update on 128k and excited more for 2.1 as well.

brianpetro commented 9 months ago

@Over-The-Edge I think the 128K issue might be fixed in v2.0.98. Let me know if it works for you now, as I didn't properly reproduce the issue, but I did update some logic and the model seems to be working for me with no issue.

Over-The-Edge commented 9 months ago

@brianpetro I can only access .97 but 128k WORKS with that - fantastic thanks!

brianpetro commented 9 months ago

@Over-The-Edge sorry that was a typo, 2.0.97 is the latest version. Glad it works!