dleemiller / WordLlama

Things you can do with the token embeddings of an LLM
MIT License
1.39k stars, 47 forks

Feature / Add Semantic Splitting #19

Closed: dleemiller closed this issue 1 month ago

dleemiller commented 2 months ago

Add semantic splitting / chunking:

Example

dleemiller commented 2 months ago

https://github.com/dleemiller/WordLlama/blob/feature/semantic-splitter/tutorials/semantic_split/semantic_split.md

I think I've worked out a decent process to accomplish this (see the tutorial linked above). I'm going to start moving a few of its steps into algorithm code and iterating on the process.
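
For readers new to the idea, here is a rough sketch of what semantic splitting means in general. It is not the process from the linked tutorial and not WordLlama's code: `embed`, `cosine`, `semantic_split`, and the `threshold` parameter are all hypothetical names made up for illustration, and the bag-of-words "embedding" is a toy stand-in for real token embeddings. The sketch splits text into sentences and starts a new chunk wherever two adjacent sentences look dissimilar.

```python
import math
import re
from collections import Counter


def embed(sentence: str) -> Counter:
    """Toy bag-of-words vector (assumption: stands in for real embeddings)."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_split(text: str, threshold: float = 0.1) -> list[str]:
    """Group consecutive sentences; break where adjacent similarity drops.

    `threshold` is an illustrative value, not a tuned or recommended one.
    """
    # Naive sentence split on whitespace that follows ., ! or ?
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    for prev, cur in zip(sentences, sentences[1:]):
        if cosine(embed(prev), embed(cur)) < threshold:
            chunks.append([cur])  # low similarity: start a new chunk
        else:
            chunks[-1].append(cur)  # similar enough: extend current chunk
    return [" ".join(chunk) for chunk in chunks]
```

Calling `semantic_split` on a short text that changes topic halfway through should return one chunk per topic. A real splitter would likely also enforce a target chunk size and smooth the similarity signal over a window rather than comparing only neighboring sentences.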

dleemiller commented 1 month ago

https://github.com/dleemiller/WordLlama/pull/26

dleemiller commented 1 month ago

Feature available in v0.2.9!