galkahana / pdf-text-extraction

CLI for extracting text from PDF files (and possibly tables)
Apache License 2.0

fix: missing intersection check #15

Closed galkahana closed 1 year ago

galkahana commented 1 year ago

While there was a merging algorithm for vertical and horizontal lines, nothing connected them in the earlier BFS, which could incorrectly identify one table as multiple tables. So I added intersection edges for vertical and horizontal lines that continue each other.
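To illustrate the idea, here is a minimal sketch of grouping table lines with a BFS over a graph whose edges include both crossings and end-to-end continuations. It is not the code from this PR: `Segment`, `touches`, `labelTables`, and the `tol` tolerance are all hypothetical names, and the bounding-box test is an assumed stand-in for whatever intersection check the repository actually uses.

```cpp
#include <algorithm>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical axis-aligned line segment (not the repository's actual type).
struct Segment {
    double x1, y1, x2, y2;
};

// Assumed linking rule: two segments are connected when their bounding boxes,
// grown by tol, overlap. For axis-aligned segments this covers a horizontal
// crossing a vertical as well as collinear segments that continue each other.
bool touches(const Segment& a, const Segment& b, double tol) {
    const double aMinX = std::min(a.x1, a.x2), aMaxX = std::max(a.x1, a.x2);
    const double aMinY = std::min(a.y1, a.y2), aMaxY = std::max(a.y1, a.y2);
    const double bMinX = std::min(b.x1, b.x2), bMaxX = std::max(b.x1, b.x2);
    const double bMinY = std::min(b.y1, b.y2), bMaxY = std::max(b.y1, b.y2);
    return aMinX <= bMaxX + tol && bMinX <= aMaxX + tol &&
           aMinY <= bMaxY + tol && bMinY <= aMaxY + tol;
}

// Returns one label per segment; segments with the same label are treated as
// belonging to the same table candidate.
std::vector<int> labelTables(const std::vector<Segment>& lines, double tol) {
    const std::size_t n = lines.size();

    // Build the adjacency list, including edges between continuing lines.
    std::vector<std::vector<std::size_t>> adjacency(n);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            if (touches(lines[i], lines[j], tol)) {
                adjacency[i].push_back(j);
                adjacency[j].push_back(i);
            }
        }
    }

    // Plain BFS: every unvisited segment starts a new component.
    std::vector<int> label(n, -1);
    int nextLabel = 0;
    for (std::size_t start = 0; start < n; ++start) {
        if (label[start] != -1) continue;
        std::queue<std::size_t> pending;
        label[start] = nextLabel;
        pending.push(start);
        while (!pending.empty()) {
            const std::size_t current = pending.front();
            pending.pop();
            for (std::size_t neighbor : adjacency[current]) {
                if (label[neighbor] == -1) {
                    label[neighbor] = nextLabel;
                    pending.push(neighbor);
                }
            }
        }
        ++nextLabel;
    }
    return label;
}
```

In this sketch, a border drawn as two horizontal pieces that meet end to end stays in one component with the vertical line attached to only one of the pieces. Without the continuation edge, the other piece would start its own component, which is the "one table identified as multiple" symptom described above.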

Also added a debug config for MSVC.