zefhemel opened 1 month ago
Ah wait, I see what you're doing there on the result page. You sub-divided pages (by paragraph?) and are grouping things, right? Now that I'm using this on my actual space it makes more sense.
OK, have you considered creating a heading per page, or does that get too large?
Something like
```markdown
# [[index]]
[[index@222]] Text fragment
[[index@404]] Text fragment 2
```
I don't think the similarity score is actually very useful to show. I assume you order them by it already?
> `editor_ui.tsx:146 Error running command Function searchCommand is not available in client`
At least this part is because I had `env: server` on searchCommand. I removed it and it seems to work as expected now. I'm not sure why, but switching between online/sync modes caused it to ignore that env setting?
> Then, when I switch to Sync mode, it runs but I get no result. But when, while on the result page, I switch back to Online mode, I do get the results...
This one I'm not sure on. I think I understand the problem though: when in sync mode, the client calls the embeddings API, gets the embeddings to search for, and actually does the search itself. The problem is that the client (in sync mode) has no indexed embeddings, because I limited indexing to the server only 😅
I'll change that to show "No results" or something more obvious, but I'm not sure what the fix is. I thought adding `env: server` to this would cause it to run on the server, but it doesn't:
```yaml
readPageSearchEmbeddings:
  path: src/embeddings.ts:readFileEmbeddings
  env: server
  pageNamespace:
    pattern: "🤖 .+"
    operation: readFile
```
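The "No results" fallback mentioned above could be sketched like this (a hedged sketch; `renderResults` and its parameters are hypothetical names, not the plug's actual API):

```typescript
// Hypothetical sketch (not the plug's actual code): render an explicit
// message when the local environment has no indexed embeddings, instead
// of silently showing an empty page.
type Result = { ref: string; score: number };

function renderResults(results: Result[], indexedCount: number): string {
  if (indexedCount === 0) {
    // e.g. a client in sync mode, where only the server holds the index
    return "No indexed embeddings available in this environment.";
  }
  if (results.length === 0) {
    return "No results.";
  }
  return results.map((r) => `* [[${r.ref}]]`).join("\n");
}
```

This at least distinguishes "the index is empty here" from "the search genuinely matched nothing".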
No, that's ignored in sync mode: `env` is then `undefined`, because in sync mode the client needs to run everything itself. I'm thinking about whether there's any technical way to force this to run things on the server even in sync mode; I think there isn't... 🤔
The best you can do is indeed detect the environment and give an error, but that's not really a proper solution. Maybe there should be a version of `invokeFunction` that forces the environment, whatever it takes.
Another thing I ran into is performance. With https://github.com/silverbulletmd/silverbullet/pull/965 we now put a 5s limit on e.g. page fetches. On my setup with 1000+ pages indexed, this just times out. Even if we somehow disable this time limit, getting no feedback for 5s+ isn't great. Not sure how, but is there a way to stream or batch results so you avoid timeouts, and get some sort of feedback?
I added a `system.invokeFunctionOnSystem` syscall that just always uses `proxySyscall`. I'll do some more testing with it and create a PR to the main silverbullet repo if you think that'd work. It does seem to work so far.
> Another thing I ran into is performance. With https://github.com/silverbulletmd/silverbullet/pull/965 we now put a 5s limit on e.g. page fetches. On my setup with 1000+ pages indexed, this just times out. Even if we somehow disable this time limit, getting no feedback for 5s+ isn't great. Not sure how, but is there a way to stream or batch results so you avoid timeouts, and get some sort of feedback?
Just to make sure, you are hitting that limit on the search page? I currently get all embeddings from the index before calculating the similarity to them. I can probably split that up and only process 100 or so at a time. I'm not sure yet how I'd show progress on the page, but that would make it nicer.
I changed the results page to look like this:
I'm also moving all of the search logic out of the pageNamespace `readFile` function and instead triggering it on an `editor:pageLoad` event. There might be a better way to do it, but this at least lets me put a "loading" message on screen while waiting.
Next step is to figure out how to handle paginating the indexed embeddings or at least processing them in chunks.
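The chunked scoring could look roughly like this (a sketch under assumptions; the `Embedding` shape, the chunk size, and the progress callback are made up here, not the plug's actual code):

```typescript
// Score all indexed embeddings against the query vector in fixed-size
// chunks, yielding to the event loop between chunks so the editor can
// update a progress/"loading" message instead of blocking for seconds.
type Embedding = { ref: string; vector: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

async function searchChunked(
  query: number[],
  embeddings: Embedding[],
  chunkSize = 100,
  onProgress?: (done: number, total: number) => void,
): Promise<{ ref: string; score: number }[]> {
  const scored: { ref: string; score: number }[] = [];
  for (let i = 0; i < embeddings.length; i += chunkSize) {
    for (const e of embeddings.slice(i, i + chunkSize)) {
      scored.push({ ref: e.ref, score: cosineSimilarity(query, e.vector) });
    }
    onProgress?.(Math.min(i + chunkSize, embeddings.length), embeddings.length);
    // Yield so the UI can repaint between chunks.
    await new Promise((resolve) => setTimeout(resolve, 0));
  }
  // Highest similarity first, so the raw score doesn't need displaying.
  return scored.sort((a, b) => b.score - a.score);
}
```

The same loop could drive a per-chunk "processed N of M" message on the results page.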
> Just to make sure, you are hitting that limit on the search page?
Yes, on the search page
@zefhemel can you give it another try when you get a chance? I had to temporarily add a sleep in the loop to test with, but I think the experience should be better now.
Now I just get nothing (this is online mode):
```markdown
# Search results for "Zef"
Searching for "Zef"...
Generating query vector embeddings...
```
And then nothing. No error on the server either.
Just to be sure I didn't have something cached, I added a new quick note with my name in it; the server suggests that it generated embeddings for it, but this search page doesn't find it.
Hmm, immediately after `Generating query vector embeddings`, it calls the embeddings API, and then you should see `Searching for similar embeddings`. If there was an error calling the embeddings API, there should be an error in the console though.
Assuming you're using Ollama, is it accessible from both your server and client? I think the embeddings API call is going to come from the client.
To test and make sure I didn't have something cached, I just tried these steps:
```
Error: Invalid change range 1589 to 1686 (in doc of length 1680)
```
My `SETTINGS` currently looks like this in my test space:
```yaml
indexPage: index

libraries:
# The "Core" library is recommended for all users
- import: "[[!silverbullet.md/Library/Core/*]]"
- import: "[[!ai.silverbullet.md/Library/AICore/*]]"

ai:
  indexEmbeddings: true
  indexEmbeddingsExcludePages:
  - passwords
  indexEmbeddingsExcludeStrings:
  - "**user**:"
  - BLOOP BLARP
  indexSummaryModelName: ollama-gemma2
  indexSummary: false
  chat:
    # bakeMessages: false
    # customEnrichFunctions:
    # - enrichWithURL
  textModels:
  - name: gpt-4o
    provider: openai
    modelName: gpt-4o
  embeddingModels:
  - name: ollama-all-minilm
    modelName: all-minilm
    provider: ollama
    baseUrl: https://ollama.lan.mydomain
    requireAuth: false
```
It's not accessible from the client. So let me look into enabling that first in my setup.
I just pushed an update to display an error on the results page if it can't generate the embeddings, so it should at least give feedback now.
Okay I changed it so embeddings are always generated on the server.
So right now I have my main SB instance on a public VPS, of course with authentication enabled. I also run Ollama as a separate docker container, not exposed to the internet, only to other containers, because it doesn't offer authentication out of the box. This means I won't be able to expose it to the internet unless I deploy some auth layer on top of Ollama that is compatible with your AI plug in some way.
I'm wondering if this is something to support. It could indeed be done by proxying LLM calls through the server. That is probably doable, except for all the streaming stuff in chat, I suppose 🤔
I didn't run into this before, because so far I had only used OpenAI, which is of course perfectly reachable from the client.
> I'm wondering if this is something to support. It could be done by indeed proxying LLM calls through the server. That is probably doable, except for all the streaming stuff in chat I suppose 🤔
I think it's worth supporting if possible. Ollama's a pretty simple solution to stand up, and I bet more people will run SB on the same server as Ollama.
You can put ollama behind nginx or caddy and require an api key that way, but it's definitely more setup involved.
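For illustration, a hypothetical nginx sketch of that setup (the hostname, token, and upstream address are placeholders, not taken from this thread):

```nginx
server {
    listen 443 ssl;
    server_name ollama.example.lan;  # placeholder

    location / {
        # Reject requests that don't carry the expected static API key.
        if ($http_authorization != "Bearer CHANGE_ME") {
            return 401;
        }
        proxy_pass http://ollama:11434;  # Ollama's default port
        # Don't buffer, so streamed completions pass through chunk by chunk.
        proxy_buffering off;
    }
}
```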
I haven't actually verified, but I'm pretty sure the current way of running calls on the server waits for the whole response (so no streaming). Maybe the AI plug could have a new HTTP endpoint that the client always calls, which would just stream the response back to the client without buffering it. Do you think it'd be hard to add support for streaming responses from the new HTTP event listener stuff?
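The pass-through idea can be sketched with standard web streams (generic code, not SilverBullet's actual endpoint API; `fakeUpstream` stands in for a fetch to the model API):

```typescript
// Proxy an upstream streaming response to the client without buffering:
// hand the upstream body straight to the Response, so chunks flow through
// as they arrive instead of waiting for the full completion.
function proxyStream(upstream: ReadableStream<Uint8Array>): Response {
  return new Response(upstream, {
    headers: { "Content-Type": "text/event-stream" },
  });
}

// Stand-in for an upstream LLM call that emits two chunks.
function fakeUpstream(): ReadableStream<Uint8Array> {
  const enc = new TextEncoder();
  return new ReadableStream({
    start(controller) {
      controller.enqueue(enc.encode("chunk1 "));
      controller.enqueue(enc.encode("chunk2"));
      controller.close();
    },
  });
}
```

Because the body is never collected into a string on the server side, the client would see tokens as they arrive.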
This is a bit weird... When I run the "AI: Search" command in Online mode I get:
`editor_ui.tsx:146 Error running command Function searchCommand is not available in client`
Then, when I switch to Sync mode, it runs but I get no result. But when, while on the result page, I switch back to Online mode, I do get the results...
Although they look a bit weirdly indented: