justyns / silverbullet-ai

Plug for SilverBullet to integrate LLM functionality
https://ai.silverbullet.md/
GNU Affero General Public License v3.0

How to add an Embedding Model? #56

Open smileBeda opened 1 month ago

smileBeda commented 1 month ago

I can see the command "choose embedding model from list", however it is unclear to me how to add an embedding model.

I tried:

embedModels:
  - name: nomic-embed-text
    modelName: nomic-embed-text
    provider: ollama

Which did not work. It appears to be implemented, but in the readme I can see the specific commands are still undefined, so perhaps this is still WIP?

Thanks!

justyns commented 1 month ago

I need to update the readme, but this page has the information on how to enable and use embeddings: https://ai.silverbullet.md/Configuration/Embedding%20Models/

The setting is embeddingModels and would look like this, including enabling it:

  indexEmbeddings: true
  indexSummary: false
  chat:
    searchEmbeddings: false
    bakeMessages: true
  embeddingModels:
  # Only the first model is currently used
  - name: ollama-all-minilm
    modelName: all-minilm
    provider: ollama
    baseUrl: https://ollama.lan.my.domain
    requireAuth: false

The embeddings stuff is usable, but parts are still WIP. Normal embeddings generation and indexing is fine, but generating summaries of each note and then indexing those summaries is pretty flaky right now. Enabling searchEmbeddings for chat also works, but needs some help around prompting to make it better.

I may remove the 'select embedding model' command soon - it only affects the client, and all of the indexing happens on the server. Basically just the first model you define is the one used to generate embeddings.

justyns commented 1 month ago

Also, please feel free to leave any feedback you come up with if you try this out!

smileBeda commented 1 month ago

Note that I use a local nomic-embed-text model. I have ollama running, and it works otherwise (including with a text model).

Settings:

ai:
  indexEmbeddings: true
  indexSummary: false
  imageModels:
  - name: dall-e-3
    modelName: dall-e-3
    provider: dalle
  textModels:
  - name: gpt-4o
    provider: openai
    modelName: gpt-4o
  - name: mistral-nemo
    modelName: mistral-nemo
    provider: ollama
    baseUrl: http://localhost:11434/v1
    requireAuth: false
  embeddingModels:
    - name: nomic-embed-text
      modelName: nomic-embed-text
      provider: ollama
      baseUrl: http://localhost:11434/v1
      requireAuth: false
  chat:
    searchEmbeddings: false
    bakeMessages: true
    userInformation: >
      I'm a software developer who likes taking notes.
    userInstructions: >
      Please give short and concise responses.  When providing code, do so in PHP unless requested otherwise.

Steps

Searching for "test string"... Generating query vector embeddings..

error ⚠️ Failed to generate query vector embeddings. Error: Unexpected non-whitespace character after JSON at position 4 (line 1 column 5)

And after that my space kind of breaks when I re-open the app (it doesn't want to pull in data anymore, just shows the header and blank content).

Weirdly, when I remove all settings related to indexing/embedding, the /Space: Reindex command returns "Done with page index!" So perhaps it is just not possible yet to use a local ollama embedding model?

smileBeda commented 1 month ago

Oh wait. Issue two goes away if I use the proper URL: http://localhost:11434 instead of http://localhost:11434/v1. However, the Reindex command still wouldn't give success feedback. Only if I remove the whole embeddingModels: section does it give success feedback.
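
A possible explanation for the earlier JSON error, sketched under assumptions rather than confirmed in this thread: if the plug appends Ollama's native `/api/embeddings` route to `baseUrl`, then a `baseUrl` ending in `/v1` points at a route that does not exist, and a plain-text "not found" body would fail to parse as JSON. The helpers `embeddingsUrl` and `tryParseJson`, and the `404 page not found` body, are hypothetical and only illustrate the idea.

```typescript
// Hypothetical helper: joins a base URL with Ollama's native embeddings route.
// Whether the plug builds its URL this way is an assumption.
function embeddingsUrl(baseUrl: string): string {
  // Drop trailing slashes so the route is not joined with "//".
  return baseUrl.replace(/\/+$/, "") + "/api/embeddings";
}

// Hypothetical helper: parses a response body the way a JSON client would,
// returning the parser's error message instead of throwing.
function tryParseJson(body: string): { ok: boolean; error?: string } {
  try {
    JSON.parse(body);
    return { ok: true };
  } catch (e) {
    return { ok: false, error: (e as Error).message };
  }
}

// With "/v1" in the base URL, the native route ends up under "/v1".
console.log(embeddingsUrl("http://localhost:11434/v1"));
console.log(embeddingsUrl("http://localhost:11434"));

// Assumed plain-text body for an unknown route: "404" parses as a number,
// and the text that follows it is what the JSON parser rejects.
console.log(tryParseJson("404 page not found"));

// A JSON body of the kind an embeddings endpoint might return parses fine.
console.log(tryParseJson('{"embedding":[0.1,0.2]}'));
```

The exact wording of the parse error depends on the JavaScript engine, so it may not match the message reported above character for character.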

smileBeda commented 1 month ago

Meh. And now it works even with embeddingModels:. I guess that was some cache.

Case solved I think! Thanks!

smileBeda commented 1 month ago

Uhm or not 🤣 It always returns the same 2 results no matter the query (even if unique enough to ensure a very specific result). Weird.

I will play more with it before giving more feedback ;)

justyns commented 1 month ago

Can you check your server's log and see if you see messages like this?

AI: Indexed 18 embedding objects for page Inbox/2024-08-03 04:55 in 3.299 seconds

To test, I'd recommend configuring embeddingModels and setting indexEmbeddings to true, then go and change a random page and check the log to see if it indexed it properly or had some sort of error. If it was successful, then it should be safe to do a full re-index. Depending on how big your space is, this could take a long time :D
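
For a larger space, scanning the server log by eye gets tedious. Below is a minimal sketch for pulling the indexed pages out of a log dump. It assumes every success message follows the single sample line quoted above; the names `IndexedEntry`, `INDEXED_LINE`, and `parseIndexedLines` are hypothetical and not part of the plug.

```typescript
// One successfully indexed page, as reported by the server log.
interface IndexedEntry {
  page: string;
  objects: number;
  seconds: number;
}

// Pattern derived from the one sample log line in this thread (an assumption
// about the general format, not a documented contract).
const INDEXED_LINE =
  /AI: Indexed (\d+) embedding objects for page (.+) in ([\d.]+) seconds/;

// Returns one entry per matching log line; other lines are ignored.
function parseIndexedLines(log: string): IndexedEntry[] {
  const entries: IndexedEntry[] = [];
  for (const line of log.split("\n")) {
    const m = INDEXED_LINE.exec(line);
    if (m) {
      entries.push({
        page: m[2],
        objects: Number(m[1]),
        seconds: Number(m[3]),
      });
    }
  }
  return entries;
}

const sampleLog = [
  "some unrelated server output",
  "AI: Indexed 18 embedding objects for page Inbox/2024-08-03 04:55 in 3.299 seconds",
].join("\n");

console.log(parseIndexedLines(sampleLog));
```

If only a couple of pages show up after editing several, that would point at indexing errors rather than at the search itself.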

> It always returns the same 2 results no matter the query (even if unique enough to ensure a very specific result).

This might mean you only have 2 pages indexed so far with embeddings. You can confirm this using a query on any page like this:

```query
embedding select ref limit 20
```