fix input to text_in_bbox in stream.py

According to its definition, the text_in_bbox function expects the bounding box coordinates in (x1, y1, x2, y2) order:

def text_in_bbox(bbox, text):
    """Returns all text objects present inside a bounding box.

    Parameters
    ----------
    bbox : tuple
        Tuple (x1, y1, x2, y2) representing a bounding box where
        (x1, y1) -> lb and (x2, y2) -> rt in the PDF coordinate
        space.
    text : List of PDFMiner text objects.

    Returns
    -------
    t_bbox : list
        List of PDFMiner text objects that lie inside table.

    """
    lb = (bbox[0], bbox[1])
    rt = (bbox[2], bbox[3])
    t_bbox = [
        t
        for t in text
        if lb[0] - 2 <= (t.x0 + t.x1) / 2.0 <= rt[0] + 2
        and lb[1] - 2 <= (t.y0 + t.y1) / 2.0 <= rt[1] + 2
    ]
    return t_bbox

However, the call to this function on line 305 of stream.py mixes up this order:

region_text = text_in_bbox((x1, y2, x2, y1), self.horizontal_text)

This commit reorders them. The same mix-up may also be a problem on line 317.
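
To illustrate why the argument order matters, here is a minimal sketch that reuses the filtering logic of text_in_bbox quoted above. The Text namedtuple is a hypothetical stand-in for a PDFMiner text object, and the coordinates are made up. The sketch assumes y1 < y2, i.e. that y1 is the bottom edge, as the docstring describes; if the caller instead treats y1 as the top edge, the existing order would behave differently.

```python
from collections import namedtuple

# Hypothetical stand-in for a PDFMiner text object; only the four
# coordinates read by text_in_bbox are modelled.
Text = namedtuple("Text", ["x0", "y0", "x1", "y1"])


def text_in_bbox(bbox, text):
    # Same filtering logic as the function quoted above.
    lb = (bbox[0], bbox[1])
    rt = (bbox[2], bbox[3])
    return [
        t
        for t in text
        if lb[0] - 2 <= (t.x0 + t.x1) / 2.0 <= rt[0] + 2
        and lb[1] - 2 <= (t.y0 + t.y1) / 2.0 <= rt[1] + 2
    ]


# Made-up coordinates, assuming y1 < y2 (y1 is the bottom edge).
x1, y1, x2, y2 = 0, 0, 100, 100
words = [Text(10, 40, 30, 60)]  # centre at (20, 50), inside the box

in_order = text_in_bbox((x1, y1, x2, y2), words)
swapped = text_in_bbox((x1, y2, x2, y1), words)

print(len(in_order))  # 1
print(len(swapped))   # 0
```

With the y values swapped, the lower bound on the vertical centre (y2 - 2) exceeds the upper bound (y1 + 2), so no text object can satisfy the condition under this assumption.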

Apologies for any faux pas; this is my first-ever contribution to a project!

atlanhq / camelot

fix input to text_in_bbox in stream.py #399