Open karndeepsingh opened 1 year ago
You are asking for a complete document layout task! This is not an issue, it's a task. Combine the object detection output (the bigger region bboxes) with the pdf_parser output (a bbox for every word or line). Filter the line/word boxes by the bigger boxes predicted by the vision model. You can then leverage spatial correlation (sort by width, then height) to identify words on the same line, or a heading above a paragraph (a heading will usually be a one-liner, identified by a bbox with a bigger area than the others plus height of heading < height of paragraph). Hope that helps 👯
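A minimal sketch of that idea, in case it helps. It assumes boxes are `(x0, y0, x1, y1)` tuples with the origin at the top-left, that the detector and the PDF parser share one coordinate system, and that words arrive as `(text, box)` pairs. The helper names, the `y_tol` tolerance and the sample coordinates are all made up for illustration and are not part of Newspaper Navigator or any PDF parser. Here the words are ordered by vertical position first and then left to right, which is one possible reading of the sorting step.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1), origin at top-left (assumed)
Word = Tuple[str, Box]                   # (text, bbox) pair (assumed parser output shape)


def center_inside(inner: Box, outer: Box) -> bool:
    """True if the center point of `inner` lies within `outer`."""
    cx = (inner[0] + inner[2]) / 2
    cy = (inner[1] + inner[3]) / 2
    return outer[0] <= cx <= outer[2] and outer[1] <= cy <= outer[3]


def words_in_region(words: List[Word], region: Box) -> List[Word]:
    """Keep only the words whose center falls inside a detected region."""
    return [w for w in words if center_inside(w[1], region)]


def group_into_lines(words: List[Word], y_tol: float = 5.0) -> List[str]:
    """Group words into text lines by vertical center, then order left to right.

    `y_tol` is a hypothetical tolerance in the same units as the boxes.
    """
    def center_y(word: Word) -> float:
        return (word[1][1] + word[1][3]) / 2

    lines: List[List[Word]] = []
    for word in sorted(words, key=center_y):
        # Same line if this word's center is close to the previous word's center.
        if lines and abs(center_y(word) - center_y(lines[-1][-1])) <= y_tol:
            lines[-1].append(word)
        else:
            lines.append([word])
    return [" ".join(t for t, _ in sorted(line, key=lambda w: w[1][0]))
            for line in lines]


def nearest_region_below(title: Box, regions: List[Box]) -> Optional[Box]:
    """Pick the closest region that starts below the title and overlaps it horizontally."""
    candidates = [r for r in regions
                  if r[1] >= title[3] and r[0] < title[2] and r[2] > title[0]]
    return min(candidates, key=lambda r: r[1] - title[3], default=None)


# Made-up sample data standing in for parser words and detector regions.
words = [
    ("Storm", (10, 10, 60, 30)), ("Hits", (65, 10, 100, 30)),
    ("Coast", (105, 10, 160, 30)),
    ("Heavy", (10, 50, 40, 60)), ("rain", (45, 50, 65, 60)),
    ("fell", (10, 64, 30, 74)), ("overnight.", (35, 64, 90, 74)),
    ("Unrelated", (300, 50, 360, 60)),
]
title_region = (5, 5, 200, 35)
article_regions = [(5, 45, 200, 80), (300, 45, 400, 80)]

body_region = nearest_region_below(title_region, article_regions)
title_lines = group_into_lines(words_in_region(words, title_region))
body_lines = group_into_lines(words_in_region(words, body_region))
print(title_lines, body_lines)
```

Testing the word center rather than requiring full containment tolerates detector boxes that clip a word slightly. On real scans the tolerance and the "region below" rule would likely need tuning, for example for multi-column layouts.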
Hi, I have been trying to use the Newspaper Navigator model in my application. It is able to detect regions such as a title or a whole article, but for my use case I want to extract each title together with the paragraphs below it. How can I do that? Please help me resolve this issue. Is there any tutorial available for this?
Thanks