xavctn / img2table

img2table is a table identification and extraction Python library for PDFs and images, based on OpenCV image processing
MIT License
571 stars, 76 forks

Error during extract tables #204

Open alexbevz opened 4 months ago

alexbevz commented 4 months ago

Error:

  File "C:\Users\admin\Desktop\tdm-catalog-collector.venv\Lib\site-packages\img2table\document\image.py", line 46, in extract_tables
    extracted_tables = super(Image, self).extract_tables(ocr=ocr,
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\admin\Desktop\tdm-catalog-collector.venv\Lib\site-packages\img2table\document\base\__init__.py", line 126, in extract_tables
    tables = {idx: TableImage(img=img,
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\admin\Desktop\tdm-catalog-collector.venv\Lib\site-packages\img2table\document\base\__init__.py", line 127, in <dictcomp>
    min_confidence=min_confidence).extract_tables(implicit_rows=implicit_rows,
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\admin\Desktop\tdm-catalog-collector.venv\Lib\site-packages\img2table\tables\image.py", line 118, in extract_tables
    self.extract_bordered_tables(implicit_rows=implicit_rows)
  File "C:\Users\admin\Desktop\tdm-catalog-collector.venv\Lib\site-packages\img2table\tables\image.py", line 76, in extract_bordered_tables
    self.tables = get_tables(cells=cells,
                  ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\admin\Desktop\tdm-catalog-collector.venv\Lib\site-packages\img2table\tables\processing\bordered_tables\tables\__init__.py", line 29, in get_tables
    complete_clusters = [add_semi_bordered_cells(cluster=cluster, lines=lines, char_length=char_length)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\admin\Desktop\tdm-catalog-collector.venv\Lib\site-packages\img2table\tables\processing\bordered_tables\tables\__init__.py", line 29, in <listcomp>
    complete_clusters = [add_semi_bordered_cells(cluster=cluster, lines=lines, char_length=char_length)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\admin\Desktop\tdm-catalog-collector.venv\Lib\site-packages\img2table\tables\processing\bordered_tables\tables\semi_bordered.py", line 19, in add_semi_bordered_cells
    x_min, x_max = min([c.x1 for c in cluster]), max([c.x2 for c in cluster])
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: min() arg is an empty sequence

The 'cluster' variable passed to add_semi_bordered_cells can be an empty list, in which case min() raises the ValueError above. I could fix it myself, but I don't have time to make the changes right now.
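For anyone picking this up, here is a minimal sketch of the failure and one possible guard. It is not img2table's actual code: the `Cell` dataclass and the `cluster_x_bounds` and `reproduces_error` helpers are hypothetical stand-ins, and only the `min([c.x1 for c in cluster])` expression is taken from the traceback. I am assuming that skipping or short-circuiting an empty cluster is acceptable to the maintainer.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Cell:
    # Hypothetical stand-in for a table cell; not img2table's own class.
    x1: int
    y1: int
    x2: int
    y2: int


def cluster_x_bounds(cluster: List[Cell]) -> Optional[Tuple[int, int]]:
    """Return (x_min, x_max) for a cluster, or None if the cluster is empty."""
    if not cluster:
        # min()/max() raise ValueError on an empty sequence, so bail out early.
        return None
    return min(c.x1 for c in cluster), max(c.x2 for c in cluster)


def reproduces_error() -> bool:
    """Show that the unguarded expression from the traceback fails on []."""
    cluster: List[Cell] = []
    try:
        min([c.x1 for c in cluster])
    except ValueError:
        return True
    return False


# Another option: drop empty clusters before they reach the processing step.
clusters = [[Cell(0, 0, 10, 5), Cell(10, 0, 25, 5)], []]
bounds = [cluster_x_bounds(cl) for cl in clusters if cl]
```

Either guard avoids the exception. Which one fits depends on whether an empty cluster should be silently dropped or is itself a symptom of an earlier problem in cell detection.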