Open apache135 opened 4 years ago
Without the file, it is difficult to help you. A one-page PDF that shows the issue would help.
Sorry about the encrypted file. I will find a similar one and upload it. Thanks.
I found that the colored text is a text annotation, which seems to have been added on top of the original PDF file with a PDF editor. Does Camelot support extracting text annotations? I can't find any relevant information in the docs.
Hello, and thanks for this great library, which has been very convenient for me.
I want to report two problems I ran into with it.
When a table has a cell that contains blue text with no background, Camelot can't extract the content of that cell.
I have a table with 3 rows and 3 columns. The last row is ['is it a word?', 'yes', '']. After extraction, the last row comes back as ['is it a word?is it a word?', 'yes yes', ''], so the text of each cell is returned twice. The parameters I pass to read_pdf are line_scale=30, split_text=True, and table_regions.
Sorry that I can't upload the PDF file. If possible, could you offer some tips for troubleshooting?
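
Until the cause of the duplication is found, one possible stopgap is to clean the extracted cells afterwards. The sketch below is only a post-processing workaround in plain Python, not part of Camelot: collapse_repeated and collapse_rows are hypothetical helper names, and the code assumes each affected cell is exactly the same text twice, with or without whitespace in between. It will also shorten cells that legitimately repeat themselves (for example "no no"), so it should only be applied to tables known to have this problem.

```python
import re


def collapse_repeated(text):
    """Collapse a cell that is the same text twice in a row into one copy."""
    stripped = text.strip()
    # Whole cell must be: <chunk> + optional whitespace + the identical <chunk>
    match = re.fullmatch(r"(.+?)\s*\1", stripped, flags=re.DOTALL)
    return match.group(1).strip() if match else stripped


def collapse_rows(rows):
    """Apply collapse_repeated to every cell of a table given as a list of rows."""
    return [[collapse_repeated(cell) for cell in row] for row in rows]


# The last row as reported above
rows = [["is it a word?is it a word?", "yes yes", ""]]
print(collapse_rows(rows))  # [['is it a word?', 'yes', '']]
```

With Camelot, the rows could be taken from the extracted table's DataFrame, for example tables[0].df.values.tolist(). Comparing the raw and cleaned rows may also help narrow down whether the duplication only appears with split_text=True.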