Open SuryaThiru opened 1 month ago
@SuryaThiru Thank you for highlighting this intriguing issue. We are students from the University of Toronto and would be delighted to look into it further.
@SuryaThiru We’d like to propose modifying the split_text method to reset relevant attributes at the start of each invocation. This change will ensure that each call processes input independently without carrying over any previous state.
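A minimal sketch of the idea, using a toy class rather than the actual LangChain code. `ToyMarkdownSplitter`, its attributes, and `_reset` are hypothetical names chosen for illustration; the real class's internal state may look different.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """A piece of text plus the header metadata active when it was collected."""
    content: str
    metadata: dict = field(default_factory=dict)


class ToyMarkdownSplitter:
    """Hypothetical stand-in for a stateful splitter; not LangChain's implementation."""

    def __init__(self):
        self._reset()

    def _reset(self):
        # Rebind (rather than clear) the containers so results returned by
        # an earlier call are not mutated by a later one.
        self.chunks = []          # completed chunks
        self.current_header = {}  # header metadata for the chunk being built
        self.current_lines = []   # body lines for the chunk being built

    def _flush(self):
        # Close the chunk in progress, if it has any content.
        if self.current_lines:
            self.chunks.append(
                Chunk("\n".join(self.current_lines), dict(self.current_header))
            )
            self.current_lines = []

    def split_text(self, text):
        # Proposed fix: start every call from a clean state.
        self._reset()
        for line in text.splitlines():
            if line.startswith("# "):
                self._flush()
                self.current_header = {"Header 1": line[2:].strip()}
            elif line.strip():
                self.current_lines.append(line)
        self._flush()
        return self.chunks
```

Without the `self._reset()` call at the top of `split_text`, a second call on the same instance would return the first call's chunks as well and could inherit its last header, which matches the mixing described in the report.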
We would appreciate any feedback from the community on this approach. We are looking forward to your thoughts!
Checked other resources
Example Code
Files
Files.zip
Error Message and Stack Trace (if applicable)
Output
Description
I was testing out the ExperimentalMarkdownSyntaxTextSplitter class because of whitespace-handling issues in the MarkdownHeaderTextSplitter. I noticed that the class mixes up text between subsequent split_text calls. I do not believe this is intended. Please find the attached zip to reproduce the issue. Happy to help fix the issue.
Let me know if there are stable alternatives for splitting by markdown headers in the meantime.
System Info
python -m langchain_core.sys_info
System Information
Package Information
Optional packages not installed
Other Dependencies