bhavnicksm / chonkie

🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library
https://pypi.org/project/chonkie/
MIT License

Refactor BaseChunker, SemanticChunker and SDPMChunker to support BaseEmbeddings #45

Closed bhavnicksm closed 4 days ago

bhavnicksm commented 4 days ago

This pull request makes the chunking and embedding models in the chonkie package more flexible. The main changes are: the BaseChunker class now supports token counters, the SemanticChunker uses the new embedding model interface, and the tests are updated to reflect these changes.
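
To make the shape of this refactor concrete, here is a minimal, self-contained sketch of a chunker that takes a pluggable token counter and a pluggable embedding model. It does not reproduce chonkie's actual code. The class names `BaseEmbeddings`, `BaseChunker` and `SemanticChunker` come from the PR title, but every method, parameter and the toy `KeywordEmbeddings` class below are assumptions made for illustration only.

```python
import math
from abc import ABC, abstractmethod
from typing import Callable, List


class BaseEmbeddings(ABC):
    """Hypothetical embedding interface (assumed, not chonkie's real API)."""

    @abstractmethod
    def embed(self, text: str) -> List[float]:
        """Return a vector for the given text."""

    def similarity(self, u: List[float], v: List[float]) -> float:
        """Cosine similarity; returns 0.0 if either vector is all zeros."""
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm if norm else 0.0


class KeywordEmbeddings(BaseEmbeddings):
    """Toy model for illustration: counts occurrences of fixed keywords."""

    def __init__(self, vocabulary: List[str]):
        self.vocabulary = vocabulary

    def embed(self, text: str) -> List[float]:
        tokens = text.lower().split()
        return [float(tokens.count(word)) for word in self.vocabulary]


class BaseChunker(ABC):
    """Hypothetical base class that accepts any callable as a token counter."""

    def __init__(self, token_counter: Callable[[str], int]):
        self.token_counter = token_counter

    def count_tokens(self, text: str) -> int:
        return self.token_counter(text)

    @abstractmethod
    def chunk(self, text: str) -> List[str]:
        """Split text into chunks."""


class SemanticChunker(BaseChunker):
    """Groups consecutive sentences whose embeddings are similar enough."""

    def __init__(
        self,
        embeddings: BaseEmbeddings,
        token_counter: Callable[[str], int],
        threshold: float = 0.5,
    ):
        super().__init__(token_counter)
        self.embeddings = embeddings
        self.threshold = threshold

    def chunk(self, text: str) -> List[str]:
        # Naive sentence split on periods; real splitters are more careful.
        sentences = [s.strip() for s in text.split(".") if s.strip()]
        if not sentences:
            return []
        groups = [[sentences[0]]]
        previous = self.embeddings.embed(sentences[0])
        for sentence in sentences[1:]:
            current = self.embeddings.embed(sentence)
            if self.embeddings.similarity(previous, current) >= self.threshold:
                groups[-1].append(sentence)  # same topic: extend current chunk
            else:
                groups.append([sentence])  # topic shift: start a new chunk
            previous = current
        return [". ".join(group) + "." for group in groups]
```

With this layout, swapping in a different embedding backend only requires another `BaseEmbeddings` subclass, and the chunker itself stays unchanged. For example, under the same assumptions:

```python
chunker = SemanticChunker(
    KeywordEmbeddings(["cat", "dog"]),
    token_counter=lambda text: len(text.split()),
)
chunks = chunker.chunk("The cat sat. A cat purred. The dog barked.")
```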

Enhancements to BaseChunker:

Improvements to SemanticChunker:

Updates to embedding models:

Test updates: