Textualization / the-ragged-edge-box

RAGGED EDGE BOX: Your Personal AI-Powered Document Search System

The window size for the local LLM should be in sync with the indexes #15

Closed by DrDub 5 days ago

DrDub commented 2 months ago

The current defaults were chosen for a different LLM, so the chunk size might be incorrect.

DrDub commented 5 days ago

It currently ships with bling-stable-lm-3b-4e1t-v0, which has a maximum of 4,096 positional embeddings.

DrDub commented 5 days ago

The chunk size is set by the embedder used in the indexer; for the current embedder it is 512.

As such, this is currently not a problem: at 512, the chunk size is smaller than the context size of any local LLM.
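
A minimal sketch of the check implied above, assuming both numbers are measured in tokens. The function names and the `num_chunks` / `reserved_tokens` parameters are hypothetical and not part of the project; only the values 512 and 4,096 come from this thread.

```python
# Values quoted in this issue; both are assumed to be token counts.
EMBEDDER_CHUNK_SIZE = 512    # chunk size set by the indexer's embedder
LLM_CONTEXT_WINDOW = 4096    # max positional embeddings of the shipped LLM


def chunks_fit_context(chunk_size, context_window, num_chunks=1, reserved_tokens=0):
    """Return True if num_chunks chunks plus reserved tokens fit in the window.

    reserved_tokens is a hypothetical budget for the prompt template,
    the user question, and the generated answer.
    """
    if chunk_size <= 0 or context_window <= 0:
        raise ValueError("chunk_size and context_window must be positive")
    if num_chunks < 0 or reserved_tokens < 0:
        raise ValueError("num_chunks and reserved_tokens must be non-negative")
    return num_chunks * chunk_size + reserved_tokens <= context_window


def max_chunks(chunk_size, context_window, reserved_tokens=0):
    """Largest number of whole chunks that fit after reserving tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    available = context_window - reserved_tokens
    return max(available, 0) // chunk_size


if __name__ == "__main__":
    print(chunks_fit_context(EMBEDDER_CHUNK_SIZE, LLM_CONTEXT_WINDOW))
    print(max_chunks(EMBEDDER_CHUNK_SIZE, LLM_CONTEXT_WINDOW))
```

A single 512 chunk fits comfortably in a 4,096 window. The margin shrinks if several retrieved chunks are placed in the same prompt, which is the case the `num_chunks` and `reserved_tokens` parameters are meant to cover.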