DS4SD / docling

Get your documents ready for gen AI
https://ds4sd.github.io/docling
MIT License
10.48k stars, 507 forks

Docling <page_assemble_model> reading order algorithm #358

Closed by mllife 4 days ago

mllife commented 4 days ago

Question

...

As I understand it, the flow detects the page layout with the RT-DETR model, which uses the label classes from DocLayNet plus some additional labels, and then uses TableFormer for table structure. But what exactly do you do for reading order? I see some clustering mentioned in the page assemble code. Can you explain that code and add an example that illustrates it? Is there a test or some sample code I can follow to understand this better?
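As a point of reference for the question above, here is a minimal sketch of a naive geometric reading-order heuristic. It is not Docling's page assemble code and makes no claim about how Docling orders elements. The `Box` class, the `reading_order` function, the 50% overlap threshold, and the top-left coordinate origin are all assumptions made for illustration. The idea is to group layout boxes into columns by horizontal overlap, then read columns left to right and boxes top to bottom.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical layout box; origin assumed at the top-left of the page."""
    label: str
    left: float
    top: float
    right: float
    bottom: float


def reading_order(boxes):
    """Order boxes column by column (illustrative heuristic only)."""
    columns = []  # each column is a list of horizontally overlapping boxes
    for box in sorted(boxes, key=lambda b: b.left):
        for col in columns:
            col_left = min(b.left for b in col)
            col_right = max(b.right for b in col)
            overlap = min(box.right, col_right) - max(box.left, col_left)
            # Join a column when more than half of the box width overlaps it.
            if overlap > 0.5 * (box.right - box.left):
                col.append(box)
                break
        else:
            # No existing column matched, so this box starts a new one.
            columns.append([box])
    # Columns left to right, then boxes top to bottom inside each column.
    columns.sort(key=lambda col: min(b.left for b in col))
    return [b for col in columns for b in sorted(col, key=lambda b: b.top)]


# A made-up two-column page, given in arbitrary detection order.
page = [
    Box("table", 310, 100, 550, 250),
    Box("text-b", 50, 320, 290, 500),
    Box("text-c", 310, 270, 550, 500),
    Box("text-a", 50, 100, 290, 300),
]
ordered_labels = [b.label for b in reading_order(page)]
print(ordered_labels)
```

A heuristic like this breaks down on full-width headings, footnotes, and figures that span columns, which is why the question asks what the clustering step in the actual code does.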

mllife commented 4 days ago

@PeterStaar-IBM, can you help with this?