This pull request updates the TokenChunker class and its documentation to improve usability and performance. Key changes are multiprocessing for batch chunking, expanded documentation, and a __call__ method that accepts either a single input or a batch.
Documentation Improvements:
DOCS.md: Added detailed descriptions for key parameters and methods of the TokenChunker class, along with example usage.
Code Enhancements:
src/chonkie/chunker/base.py: Added multiprocessing support for the chunk_batch method to improve performance by parallelizing the chunking process. [1][2]
src/chonkie/chunker/base.py: Updated the __call__ method to handle both single strings and lists of strings, raising an error for invalid input types.
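For reviewers who want a concrete picture of the two code changes, here is a minimal sketch of the pattern, not the code in this PR. The class name SketchChunker, the workers parameter, the whitespace "tokenizer", and the choice of TypeError for invalid input are all assumptions made for illustration; the actual implementation in src/chonkie/chunker/base.py may differ on each point.

```python
from multiprocessing import Pool
from typing import List, Union


class SketchChunker:
    """Hypothetical stand-in for a token chunker; not chonkie's actual class."""

    def __init__(self, chunk_size: int = 4, workers: int = 1) -> None:
        if chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        self.chunk_size = chunk_size
        self.workers = workers  # assumed knob; 1 means no multiprocessing

    def chunk(self, text: str) -> List[str]:
        # Whitespace splitting stands in for a real tokenizer here.
        tokens = text.split()
        return [
            " ".join(tokens[i : i + self.chunk_size])
            for i in range(0, len(tokens), self.chunk_size)
        ]

    def chunk_batch(self, texts: List[str]) -> List[List[str]]:
        # With one worker or a tiny batch, avoid process start-up overhead.
        if self.workers <= 1 or len(texts) < 2:
            return [self.chunk(t) for t in texts]
        # Pool.map preserves input order, so results line up with texts.
        with Pool(processes=self.workers) as pool:
            return pool.map(self.chunk, texts)

    def __call__(
        self, text: Union[str, List[str]]
    ) -> Union[List[str], List[List[str]]]:
        # Dispatch on input type: one string, or a list of strings.
        if isinstance(text, str):
            return self.chunk(text)
        if isinstance(text, list) and all(isinstance(t, str) for t in text):
            return self.chunk_batch(text)
        # Assumed error type; the PR only says an error is raised.
        raise TypeError("input must be a str or a list of str")
```

When workers is greater than 1, the call should sit under an `if __name__ == "__main__":` guard on platforms that start worker processes by spawning, since the main module is re-imported in each worker.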