Layout-Parser / layout-parser

A Unified Toolkit for Deep Learning Based Document Image Analysis
https://layout-parser.github.io/
Apache License 2.0

[feat] Add shape operation tools #72

Closed lolipopshock closed 2 years ago

lolipopshock commented 2 years ago

Layout Parser now comes with better support for various shape operations. Built on the generalized_connected_component_analysis_1d function, simple_line_detection detects text lines from token-level text blocks:

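import layoutparser as lp

# Load the token-level layout of the first page of the example PDF
page_layout = lp.load_pdf("tests/fixtures/io/example.pdf")[0]
# Group the token text blocks into text lines
pdf_lines = lp.simple_line_detection(page_layout)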
[Figures: Token Visualization (left), Text Line Visualization (right)]
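
To illustrate the idea behind grouping tokens into lines with a 1D connected component analysis, here is a minimal standalone sketch. It is not the Layout Parser implementation: the helper names (connected_components_1d, group_tokens_into_lines), the plain (x1, y1, x2, y2) tuple boxes, and the y_tolerance parameter are all assumptions made for this example. Tokens whose vertical centers lie within the tolerance of a neighbor are treated as one component, and each component is merged into one line box.

```python
from typing import List, Tuple

# Hypothetical box format for this sketch: (x1, y1, x2, y2)
Box = Tuple[float, float, float, float]


def connected_components_1d(values: List[float], threshold: float) -> List[List[int]]:
    """Group indices whose sorted values lie within `threshold` of the previous value."""
    if not values:
        return []
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [[order[0]]]
    for prev, cur in zip(order, order[1:]):
        if values[cur] - values[prev] <= threshold:
            groups[-1].append(cur)  # close enough: same component
        else:
            groups.append([cur])  # gap too large: start a new component
    return groups


def group_tokens_into_lines(boxes: List[Box], y_tolerance: float = 5.0) -> List[Box]:
    """Merge token boxes into line boxes using their vertical centers (assumed heuristic)."""
    y_centers = [(y1 + y2) / 2 for _, y1, _, y2 in boxes]
    lines = []
    for group in connected_components_1d(y_centers, y_tolerance):
        members = [boxes[i] for i in group]
        # The line box is the union of all member token boxes
        lines.append((
            min(b[0] for b in members),
            min(b[1] for b in members),
            max(b[2] for b in members),
            max(b[3] for b in members),
        ))
    return lines


tokens = [(10, 10, 40, 20), (45, 11, 80, 21), (10, 40, 60, 50)]
lines = group_tokens_into_lines(tokens)
```

In this toy input the first two tokens share nearly the same vertical center and end up in one line, while the third token forms its own line. A real detector would likely also need to handle skewed text and multi-column pages, which this sketch ignores.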