Open saidakyuz opened 1 year ago
From what I have observed, this happens only when a cell spans both vertically and horizontally and the same column also contains other cells that do not span horizontally. In that case, cells in the same row can end up with opposite values of vspan (True or False). The issue appears to be caused by this attribute, but I still have no solution for it.
The following tables also have the same issue:
Describe the bug
I am extracting data from PDFs using camelot and have run into the following issue on page 3 of this datasheet. The problematic table is shown below:
The issue is an inconsistency in how the content of spanning cells is copied. As the following picture shows, the spanning cells themselves are detected correctly.
Although the cells are detected correctly, in column 3 the content is copied to only one of the two spanned cells, and in column 4 it is copied to only two of the three spanned cells. The data I extracted is shown below. In both columns, exactly one cell is always missing.
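To make the symptom concrete, here is a minimal sketch of how one inconsistent span flag could leave exactly one cell empty. It is a simplified model, not camelot's actual code: the `Cell` class, its `top` field, and `copy_vertical` are hypothetical names made up for illustration, and only the `vspan` flag name is taken from the discussion in this thread.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """Hypothetical stand-in for a detected table cell (not camelot's class)."""
    text: str = ""
    vspan: bool = False  # assumed flag: cell belongs to a vertical span
    top: bool = True     # assumed flag: a ruling line closes the top edge


def copy_vertical(column):
    """Fill empty, vertically spanned cells with the text of the cell above."""
    for i in range(1, len(column)):
        cell = column[i]
        if cell.text.strip() == "" and cell.vspan and not cell.top:
            cell.text = column[i - 1].text
    return [c.text for c in column]


# Three merged cells, all flagged consistently: the text reaches every cell.
consistent = [
    Cell("A", vspan=True),
    Cell(vspan=True, top=False),
    Cell(vspan=True, top=False),
]

# Same span, but the last cell carries vspan=False: it is skipped.
inconsistent = [
    Cell("A", vspan=True),
    Cell(vspan=True, top=False),
    Cell(vspan=False, top=False),
]

print(copy_vertical(consistent))
print(copy_vertical(inconsistent))
```

In this model the second column ends up with the text in two of its three spanned cells, which resembles what I see in column 4. Whether camelot's flags are actually set this way for my table is an assumption I have not verified.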
Steps to reproduce the bug
Expected behavior
Code
to visualize the tables:
PDF HERE
Screenshots
Environment
Additional context