bhavnicksm / chonkie

🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library
https://pypi.org/project/chonkie/
MIT License
1.69k stars 61 forks

[Fix] Refactor WordChunker, SentenceChunker pre-chunk splitting for reconstruction tests + minor changes #53

Closed: bhavnicksm closed this 5 days ago

bhavnicksm commented 5 days ago

This pull request renames the max_chunk_size parameter to chunk_size across the affected files and methods, and optimizes the sentence splitting method for better performance and accuracy. The most important changes are:

Parameter Renaming:

Sentence Splitting Optimization:

These changes aim to improve the code's readability and performance, particularly when handling large text chunks and splitting sentences.
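For readers unfamiliar with the "reconstruction" property mentioned in the title, here is a minimal sketch of what such a test checks: joining the split pieces must give back the original text exactly. This is not the code from this PR. The function name `split_sentences`, the `delimiters` parameter, and the regex approach are all assumptions made for illustration.

```python
import re
from typing import List


def split_sentences(text: str, delimiters: str = ".!?") -> List[str]:
    """Hypothetical splitter: keeps delimiters and trailing whitespace
    attached to each sentence so no characters are dropped."""
    if not text:
        return []
    # Lazily consume characters up to a run of delimiters plus any
    # trailing whitespace, or up to the very end of the text.
    pattern = re.compile(
        rf".+?(?:[{re.escape(delimiters)}]+\s*|\Z)", re.DOTALL
    )
    return pattern.findall(text)


text = "Hello world. How are you?  Fine!\nNo delimiter at end"
parts = split_sentences(text)

# Reconstruction check: concatenating the pieces restores the input.
assert "".join(parts) == text
```

Because whitespace stays attached to the preceding sentence rather than being stripped, a chunker built on top of a splitter like this can report character offsets that line up with the source text.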