andyflury opened this issue 9 months ago (status: Open)
@andyflury thank you!
Hey @langchain4j, can I pick this up?
@mike-adonis sure, go ahead! Thank you!
@mike-adonis did you manage to start/implement it?
Yes, I did start. I've just been busy with a few things; I can try to complete it by the end of the weekend.
@langchain4j please find the PR for this: #690
langchain4j already offers several DocumentSplitters. One that is currently missing is MarkdownHeaderTextSplitter.
The original (Python-based) langchain project has such a MarkdownHeaderTextSplitter.
It would be nice to have this in langchain4j as well.
I have attached an implementation for this. I used ChatGPT to translate the Python code to Java (including the unit test). It might not conform to all the coding standards of the project, but it does the job and the tests pass.
MarkdownHeaderTextSplitter.zip
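
For readers who have not used the Python splitter, here is a minimal sketch of the general idea of splitting Markdown by headers: each chunk of text is kept together with the headers it sits under. This is not the attached implementation and not a langchain4j API; the class name `MarkdownHeaderSplitSketch`, the `Section` type, and the `split` method are hypothetical names made up for illustration, and the sketch uses only the JDK. It assumes ATX-style headers (`#` to `######`) and ignores `#` lines inside fenced code blocks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of langchain4j.
public class MarkdownHeaderSplitSketch {

    /** One chunk of text plus the headers that were active where it appeared. */
    public static final class Section {
        public final Map<Integer, String> headers; // header level -> header text
        public final String text;

        Section(Map<Integer, String> headers, String text) {
            this.headers = headers;
            this.text = text;
        }
    }

    // ATX header: 1 to 6 '#' characters, whitespace, then the title.
    private static final Pattern HEADER = Pattern.compile("^(#{1,6})\\s+(.*)$");

    public static List<Section> split(String markdown) {
        List<Section> sections = new ArrayList<>();
        Map<Integer, String> active = new TreeMap<>();
        StringBuilder body = new StringBuilder();
        boolean inFence = false;

        for (String line : markdown.split("\\R")) {
            // Toggle on fence lines so '#' comments inside code are not treated as headers.
            if (line.trim().startsWith("```")) {
                inFence = !inFence;
            }
            Matcher m = HEADER.matcher(line);
            if (!inFence && m.matches()) {
                // A header closes the chunk collected so far.
                flush(sections, active, body);
                int level = m.group(1).length();
                // A new header replaces any header at the same or a deeper level.
                active.keySet().removeIf(l -> l >= level);
                active.put(level, m.group(2).trim());
            } else {
                body.append(line).append('\n');
            }
        }
        flush(sections, active, body);
        return sections;
    }

    // Emits the collected text (if any) with a copy of the active headers.
    private static void flush(List<Section> out, Map<Integer, String> active, StringBuilder body) {
        String text = body.toString().trim();
        if (!text.isEmpty()) {
            out.add(new Section(new TreeMap<>(active), text));
        }
        body.setLength(0);
    }

    public static void main(String[] args) {
        String md = "# Guide\nIntro text\n## Setup\nInstall steps\n# FAQ\nAnswers";
        for (Section s : split(md)) {
            System.out.println(s.headers + " -> " + s.text);
        }
    }
}
```

Keeping the header path as metadata on each chunk means a retrieved chunk still carries the context of where it came from in the document. A real splitter would likely also make the set of header levels configurable, which this sketch does not attempt.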