Table extraction bug when lines are just barely end-to-end

Describe the bug

Via https://github.com/jsvine/pdfplumber/discussions/1087#discussioncomment-8564694, it seems that there's a bug in how pdfplumber joins line segments that lie end-to-end along the same visual line.

Have you tried repairing the PDF?

Yes.

Code to reproduce the problem

Download the PDF in the linked comment. Then:

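import pdfplumber

pdf = pdfplumber.open("2022.Sustainability.Report_NYSE_WM_2022.pdf")
page = pdf.pages[41]  # pages are zero-indexed, so this is the 42nd page
im = page.to_image()
# Overlay the table finder's output, with the horizontal join tolerance set to 0
im.reset().debug_tablefinder({
    "join_x_tolerance": 0
})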

And compare to:

# Merge the page's horizontal edges directly and draw the result
(
    im.reset()
    .draw_lines(
        pdfplumber.table.merge_edges(
            pdfplumber.utils.filter_edges(page.edges, "h"),
            snap_x_tolerance=0,
            snap_y_tolerance=0,
            join_x_tolerance=-1,
            join_y_tolerance=0,
        )
    )
)

PDF file

See the linked discussion comment.

Expected behavior

pdfplumber's table-finding approach should merge all the sub-lines in each visual line into a single line.
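
To make the expectation concrete, here is a minimal sketch of the joining I'd expect, reduced to 1-D `(x0, x1)` segments on a single horizontal line. The function `join_segments` and the coordinates are hypothetical and for illustration only; this is not pdfplumber's implementation, just the behavior I'd assume for segments that touch exactly end-to-end.

```python
def join_segments(segments, tolerance=0.0):
    """Join 1-D (x0, x1) segments whose gap is at most `tolerance`."""
    joined = []
    for x0, x1 in sorted(segments):
        if joined and x0 <= joined[-1][1] + tolerance:
            # Touching or overlapping: extend the previous segment.
            joined[-1] = (joined[-1][0], max(joined[-1][1], x1))
        else:
            # Gap is larger than the tolerance: start a new segment.
            joined.append((x0, x1))
    return joined


# Hypothetical sub-lines of one visual line, meeting exactly end-to-end.
sub_lines = [(0.0, 10.0), (10.0, 25.0), (25.0, 40.0)]

print(join_segments(sub_lines, tolerance=0))   # [(0.0, 40.0)]
print(join_segments(sub_lines, tolerance=-1))  # the three segments stay separate
```

With a tolerance of 0, the three touching sub-lines collapse into one line spanning the full width; with a negative tolerance, nothing is joined and every sub-line is preserved.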

Actual behavior

Instead of merging them, the table finder appears to "find" only certain portions of each visual line.

Screenshots

See above

Environment

pdfplumber version: 0.11.0
Python version: 3.10.4
OS: Mac

Table extraction bug when lines are just barely end-to-end #1110