DS4SD / docling

Get your documents ready for gen AI
https://ds4sd.github.io/docling
MIT License
10.48k stars, 507 forks

Add Markdown-based table serialization in chunking #342

Open vagenas opened 1 week ago

vagenas commented 1 week ago

Requested feature

Currently, the HierarchicalChunker serializes tables only as triplets.

Enable Markdown-based table serialization too.
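
To make the difference between the two forms concrete, here is a minimal sketch in plain Python. It does not use docling's API: `table_to_triplets` and `table_to_markdown` are hypothetical helpers, the sample grid is made up, and the exact triplet wording ("row header, column header = value") is an assumption about what the chunker emits.

```python
# Illustrative only: these helpers are hypothetical and not part of docling.

def table_to_triplets(grid):
    """Serialize a table as 'row header, column header = value' triplets.

    Assumes the first row holds column headers and the first column
    holds row headers.
    """
    header, *body = grid
    return ". ".join(
        f"{row[0]}, {header[j]} = {row[j]}"
        for row in body
        for j in range(1, len(header))
    )


def table_to_markdown(grid):
    """Serialize the same table as a Markdown pipe table."""
    header, *body = grid
    lines = [
        "| " + " | ".join(str(cell) for cell in header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(str(cell) for cell in row) + " |" for row in body]
    return "\n".join(lines)


# Made-up sample data: first row = column headers, first column = row headers.
grid = [
    ["Model", "Params", "License"],
    ["A", "7B", "MIT"],
    ["B", "13B", "Apache-2.0"],
]

print(table_to_triplets(grid))
print(table_to_markdown(grid))
```

The triplet form repeats both headers for every cell, which keeps each fact self-contained, while the Markdown form preserves the table layout in a single block.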

Ideally, also cover the case where the user wants a different table representation for different purposes, e.g.: