Two tables on the same page are extracted as one

Describe the bug

When a PDF page contains more than one table, all of them are extracted as a single table. This produces misaligned and empty columns.

Code to reproduce the problem

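import pdfplumber
import pandas as pd

# Open the attached document and take the first page
file = pdfplumber.open("example document.pdf")
page = file.pages[0]

# Both tables come back merged into tables[0]
tables = page.extract_tables()

# First row as header, remaining rows as data
df = pd.DataFrame(tables[0][1:], columns=tables[0][0])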

PDF file

example document.pdf

This is an example document with redacted information. Real documents have a similar structure, but with more pages and sometimes with several different tables.

Expected behavior

There should be two tables:

One with columns: ODSOTNI UČITELJ/ICA, URA, RAZRED, UČILNICA, NADOMEŠČA, PREDMET, OPOMBA
Another with columns: RAZRED, URA, UČITELJ/ICA, PREDMETA, UČILNICA, OPOMBA

The text between those two tables (MENJAVA UR) should not be part of either table.

Actual behavior

Both tables are extracted as one, which produces many extra, misaligned columns filled with None:

actual table

Screenshots

table finder debug

Environment

pdfplumber version: 0.9.0
Python version: 3.10
OS: Windows

Additional context

The PDFs I need to parse all share a similar layout with around 5 different table formats. However, not every file uses all of those formats, and the tables vary in height, so I can't simply crop the page to a fixed region before extracting tables.
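
As a possible stopgap that does not depend on cropping, the merged result could be split after extraction. The sketch below is not part of pdfplumber: `is_separator`, `drop_empty_columns` and `split_merged_table` are hypothetical helpers, and it assumes the text between the tables (such as MENJAVA UR) ends up as a row with at most one non-empty cell. A genuine data row with only one filled cell would be misread as a separator, so this may not fit every document.

```python
from typing import List, Optional

Row = List[Optional[str]]


def is_separator(row: Row) -> bool:
    """Assumption: a row with at most one non-empty cell separates two tables."""
    filled = [cell for cell in row if cell not in (None, "")]
    return len(filled) <= 1


def drop_empty_columns(rows: List[Row]) -> List[Row]:
    """Remove columns that are empty in every row of this block."""
    if not rows:
        return rows
    width = max(len(row) for row in rows)
    keep = [
        i for i in range(width)
        if any(i < len(row) and row[i] not in (None, "") for row in rows)
    ]
    return [[row[i] if i < len(row) else None for i in keep] for row in rows]


def split_merged_table(rows: List[Row]) -> List[List[Row]]:
    """Split one merged table into blocks at separator rows."""
    blocks: List[List[Row]] = []
    current: List[Row] = []
    for row in rows:
        if is_separator(row):
            # Close the block collected so far and discard the separator row
            if current:
                blocks.append(drop_empty_columns(current))
                current = []
        else:
            current.append(row)
    if current:
        blocks.append(drop_empty_columns(current))
    return blocks
```

Each returned block could then be passed to pandas on its own, for example `pd.DataFrame(block[1:], columns=block[0])`, with `split_merged_table(tables[0])` supplying the blocks.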

Maybe this could be solved by adding an option to extract tables only where visible lines surround the cells?
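
For reference, the existing `table_settings` argument already allows restricting detection to drawn lines. The sketch below uses the `lines_strict` strategy, which considers the page's graphical lines but not the sides of rectangle objects. Whether it separates the two tables in this document is untested: the default strategy is already `lines`, and if the cell borders here are drawn as rectangles, `lines_strict` may find nothing at all. The tolerance values shown are the library defaults, and `STRICT_LINE_SETTINGS` and `extract_ruled_tables` are hypothetical names.

```python
# Assumed settings: the keys are pdfplumber table_settings options,
# but these values have not been verified against the example document.
STRICT_LINE_SETTINGS = {
    "vertical_strategy": "lines_strict",
    "horizontal_strategy": "lines_strict",
    "snap_tolerance": 3,
    "join_tolerance": 3,
    "intersection_tolerance": 3,
}


def extract_ruled_tables(page, settings=None):
    """Extract tables from a pdfplumber page using only drawn lines.

    `page` is expected to be a pdfplumber Page, e.g. file.pages[0].
    """
    return page.extract_tables(table_settings=settings or STRICT_LINE_SETTINGS)
```

With the reproduction code above, this would be called as `extract_ruled_tables(page)`; comparing its output with `page.debug_tablefinder(STRICT_LINE_SETTINGS)` should show which lines were actually picked up.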
