Open snehashimpi opened 4 years ago
Please post an example...
For security reasons, I cannot post an example. The problem is that bold text in a PDF table gets repeated. For example, if the text is 'Project test', after parsing it becomes 'Project test test' or 'Project test\rtest', etc.
Same problem
Came across the same issue.
PDF:
Extracted DF:
Code:
import camelot
tables = camelot.read_pdf('foo.pdf', pages='5', flavor='stream')
Environment: Linux-5.4.0-47-generic-x86_64-with-glibc2.29, Python 3.8.2 (default, Jul 16 2020, 14:00:26) [GCC 9.3.0], NumPy 1.18.3, OpenCV 4.4.0, Camelot 0.8.2
Please update if there is a workaround. Thanks.
Hi. When extracting PDF tables with Camelot in Python, any bold text in a table appears multiple times in the resulting JSON object. I cannot figure out why this happens. Is there a parameter we can set that extracts the table without any text formatting?
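I am not aware of a Camelot parameter that addresses this, so here is a possible post-processing workaround rather than a fix. It is only a sketch: `collapse_repeats` is a hypothetical helper (not part of Camelot) that drops a word when it immediately repeats the previous one, which covers the 'Project test test' and 'Project test\rtest' cases reported above. It assumes the duplicates are always adjacent, and it will also collapse legitimate repeats (e.g. "had had") and replace line breaks inside a cell with single spaces.

```python
import re


def collapse_repeats(text):
    """Drop whitespace-separated tokens that immediately repeat the previous token."""
    # Split on any whitespace run, so '\r' and '\n' separators are handled too.
    tokens = re.split(r"\s+", text.strip())
    kept = []
    for tok in tokens:
        # Keep the token only if it differs from the one just before it.
        if not kept or tok != kept[-1]:
            kept.append(tok)
    return " ".join(kept)


print(collapse_repeats("Project test test"))
print(collapse_repeats("Project test\rtest"))
```

With the snippet from earlier in the thread, this could be applied to every cell with something like `tables[0].df.applymap(collapse_repeats)` (or `DataFrame.map` on recent pandas versions), since Camelot returns all cells as strings.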