Layout-Parser / layout-parser

A Unified Toolkit for Deep Learning Based Document Image Analysis
https://layout-parser.github.io/
Apache License 2.0

group_blocks_by_distance example addition #48

Open zthab opened 3 years ago

zthab commented 3 years ago

**Describe the bug**
In the example given here, the function `group_blocks_by_distance` doesn't sort blocks within a row along the x direction. I came up with a simple fix, implemented below, and thought I would flag it for anyone else who runs into this issue.

**To Reproduce**
Steps to reproduce the behavior:

  1. What command or script did you run?

    
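    import numpy as np
    import layoutparser as lp

    def group_blocks_by_distance(blocks, distance_th):
        # Sort blocks top to bottom by their upper y coordinate.
        blocks = sorted(blocks, key=lambda x: x.coordinates[1])
        # Vertical gap between each block and the one before it.
        distances = np.array([b2.coordinates[1] - b1.coordinates[3] for (b1, b2) in zip(blocks, blocks[1:])])

        # The first block has no predecessor, so give it a gap of 0.
        distances = np.append([0], distances)
        # A gap above the threshold starts a new group.
        block_group = (distances > distance_th).cumsum()

        grouped_blocks = [lp.Layout([]) for i in range(max(block_group) + 1)]
        for i, block in zip(block_group, blocks):
            grouped_blocks[i].append(block)

        return grouped_blocks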

My change sorts the blocks within each row along the x axis, either left to right or right to left, depending on the `x_direction` parameter passed.

    # left to right if x_direction = 0, right to left if x_direction = 1
    def group_blocks_by_distance(blocks, distance_th, x_direction):
        blocks = sorted(blocks, key=lambda x: x.coordinates[1])
        distances = np.array([b2.coordinates[1] - b1.coordinates[3] for (b1, b2) in zip(blocks, blocks[1:])])

        distances = np.append([0], distances)
        block_group = (distances > distance_th).cumsum()
        grouped_blocks = [[] for i in range(max(block_group) + 1)]
        for i, block in zip(block_group, blocks):
            grouped_blocks[i].append(block)
        # Sort the blocks of each row by their left x coordinate.
        for i in range(len(grouped_blocks)):
            grouped_blocks[i] = sorted(grouped_blocks[i],
                                       key=lambda x: x.coordinates[0],
                                       reverse=bool(x_direction))

        grouped_sorted_blocks = [lp.Layout(grouped_blocks[i]) for i in range(max(block_group) + 1)]

        return grouped_sorted_blocks
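
For anyone who wants to check the within-row ordering without layoutparser installed, here is a minimal sketch of the same logic in plain Python. `Block`, `group_rows`, and `right_to_left` are hypothetical names used only for this illustration; the sketch assumes `coordinates` is ordered as `(x_1, y_1, x_2, y_2)` and returns plain lists rather than `lp.Layout` objects.

```python
from itertools import accumulate
from typing import List, NamedTuple, Tuple


class Block(NamedTuple):
    """Hypothetical stand-in for a layout block; only `coordinates` is used."""
    coordinates: Tuple[float, float, float, float]  # assumed (x_1, y_1, x_2, y_2)
    text: str


def group_rows(blocks: List[Block], distance_th: float,
               right_to_left: bool = False) -> List[List[Block]]:
    """Group blocks into rows by vertical gap, then sort each row by x_1."""
    if not blocks:
        return []
    blocks = sorted(blocks, key=lambda b: b.coordinates[1])
    # Gap between the top of each block and the bottom of the previous one;
    # the first block has no predecessor, so its gap is 0.
    gaps = [0.0] + [b2.coordinates[1] - b1.coordinates[3]
                    for b1, b2 in zip(blocks, blocks[1:])]
    # A gap above the threshold starts a new row (running count of row breaks).
    row_ids = list(accumulate(int(g > distance_th) for g in gaps))
    rows: List[List[Block]] = [[] for _ in range(row_ids[-1] + 1)]
    for row_id, block in zip(row_ids, blocks):
        rows[row_id].append(block)
    # Sorting by y alone leaves each row in y order, so sort by x here.
    return [sorted(row, key=lambda b: b.coordinates[0], reverse=right_to_left)
            for row in rows]


blocks = [
    Block((200, 10, 260, 30), "c"),
    Block((10, 12, 60, 32), "a"),
    Block((100, 11, 160, 31), "b"),
    Block((10, 60, 60, 80), "d"),
]
rows = group_rows(blocks, distance_th=5)
print([[b.text for b in row] for row in rows])  # [['a', 'b', 'c'], ['d']]
```

Sorting by `y_1` alone would leave the first row as `c, b, a`, because those blocks differ slightly in height on the page; the extra per-row sort is what restores reading order.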

**Environment**
I'm on Mac, using layoutparser version 0.2.0, working in a conda virtual environment. 

**Screenshots**
Let's say I have the following row.
![image](https://user-images.githubusercontent.com/22862614/122645552-efda8e00-d0e8-11eb-836c-d0ffb6a68440.png)
The original function reads it as 
![image](https://user-images.githubusercontent.com/22862614/122645571-fcf77d00-d0e8-11eb-8a71-7fe9a81c8c6d.png)
My modification reads it as 
![image](https://user-images.githubusercontent.com/22862614/122645635-45af3600-d0e9-11eb-9bcd-0197ae11344d.png)

SAIVENKATARAJU commented 2 years ago

Hi @zthab, thanks for your fix. However, in the example given by the creators, they specifically extract residency and lotno manually. If we want to extract all the columns in the table, do we still need to find the coordinates from the start column to the end column manually using other tools? I am having real trouble converting this table, GXMH31H.pdf, to a dataframe.