This pull request replaces TokenTextSplitter with RecursiveCharacterTextSplitter, both from the langchain.text_splitter module. RecursiveCharacterTextSplitter works better on text-heavy documents such as PDFs because it tries to keep sentences and paragraphs together rather than cutting at arbitrary token boundaries. This pull request also adds the CHUNK_OVERLAP environment variable, which lets users set how much adjacent chunks overlap when text is split.
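
As a rough illustration of how CHUNK_OVERLAP could feed into the new splitter, here is a minimal sketch. It is not the code in this pull request: the helper names `read_chunk_overlap` and `build_splitter`, the fallback overlap of 50, and the chunk size of 500 are all assumptions made for the example.

```python
import os


def read_chunk_overlap(default: int = 50) -> int:
    """Read CHUNK_OVERLAP from the environment, falling back to `default`.

    The default of 50 is an illustrative assumption, not a value from this PR.
    """
    raw = os.environ.get("CHUNK_OVERLAP")
    if raw is None or raw.strip() == "":
        return default
    overlap = int(raw)  # raises ValueError if the value is not an integer
    if overlap < 0:
        raise ValueError("CHUNK_OVERLAP must be non-negative")
    return overlap


def build_splitter(chunk_size: int = 500):
    """Build the splitter; chunk_size=500 is an assumed placeholder."""
    # Imported lazily so read_chunk_overlap() works without langchain installed.
    from langchain.text_splitter import RecursiveCharacterTextSplitter

    return RecursiveCharacterTextSplitter(
        chunk_size=chunk_size,
        chunk_overlap=read_chunk_overlap(),
    )
```

With something like this in place, running with `CHUNK_OVERLAP=100` set in the environment would produce chunks that share up to 100 characters with their neighbours, and `build_splitter().split_text(text)` would return the resulting list of strings. The overlap must stay smaller than the chunk size, otherwise the splitter rejects the configuration.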