marieai / marie-ai

Integrate AI-powered Document Analysis Pipelines
MIT License
60 stars 5 forks

Implement better line detection and refinement algorithm #68

Open gregbugaj opened 1 year ago

gregbugaj commented 1 year ago

To improve PDF generation, we need a better line detection and refinement strategy. The current method works well; however, by utilizing deep learning we can improve both the aggregation and the detection.

At the same time, we should move away from using an INDEX as the line index and use an ID instead, as it is possible to have two lines that are spaced apart.
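
A rough sketch of what ID-based line references could look like, for discussion only. This is not the current marie-ai implementation: `Word`, `assign_line_ids`, and `y_tolerance` are hypothetical names, and the simple vertical-center grouping stands in for whatever detection method ends up being used. The point is only that each word carries a line ID rather than a position in a list, so dropping or merging a line during refinement does not renumber the others.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Word:
    """Hypothetical word record; y is the vertical center of its box."""
    text: str
    y: float
    line_id: int = -1  # ID of the owning line, not a list position


def assign_line_ids(words, y_tolerance=5.0):
    """Tag each word with a line ID and return {line_id: line center y}.

    Assumption: words whose centers lie within y_tolerance of a line's
    first word belong to that line. A learned detector could replace this.
    """
    next_id = count(1)
    line_centers = {}
    for word in sorted(words, key=lambda w: w.y):
        for line_id, center in line_centers.items():
            if abs(center - word.y) <= y_tolerance:
                word.line_id = line_id
                break
        else:
            # No existing line is close enough, so start a new one.
            word.line_id = next(next_id)
            line_centers[word.line_id] = word.y
    return line_centers


words = [Word("Invoice", 10), Word("No.", 12), Word("Total", 300), Word("Stamp", 150)]
lines = assign_line_ids(words)

# Refinement drops one line; the remaining IDs are untouched, whereas
# list indices would have shifted for every line after the removed one.
del lines[words[3].line_id]
```

With a plain list index, removing the "Stamp" line would shift "Total" to a different index; with IDs, `words[2].line_id` still resolves to the same entry in `lines`.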