Open SWHL opened 2 years ago
Curious to know how you got this exact value of ratio = 300 / 72, and does it work for other PDFs?
Camelot gets the box coordinates from the pdfminer package, whose default resolution is 72 (I forget where I saw this). The image, however, is produced by the read_pdf function, whose default resolution is 300. That is where the ratio 300 / 72 comes from. https://github.com/atlanhq/camelot/blob/cd8ac7979fe3631866fe439f07e9d6aaa5b1e5c6/camelot/io.py#L93
You can try it on other PDFs.
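The ratio described above can be sketched in a few lines. This is only an illustration of the arithmetic: the helper name `to_image_units` is hypothetical and not part of camelot or pdfminer, and the two resolutions are simply the defaults named in the comment above.

```python
# Minimal sketch of the 300 / 72 ratio discussed above.
# Assumption: PDF coordinates are expressed at 72 units per inch and the
# page image is rendered at 300 pixels per inch (the defaults named above).
PDF_RESOLUTION = 72
IMAGE_RESOLUTION = 300


def to_image_units(value, image_resolution=IMAGE_RESOLUTION,
                   pdf_resolution=PDF_RESOLUTION):
    """Scale a length from PDF units to image pixels (hypothetical helper)."""
    return value * image_resolution / pdf_resolution


# A US Letter page is 612 x 792 in PDF units.
page_width_px = to_image_units(612)
page_height_px = to_image_units(792)
print(page_width_px, page_height_px)  # 2550.0 3300.0
```

If the image was rendered at a different resolution, pass that value as `image_resolution` instead of relying on the default.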
@SWHL This really helped me understand the conversion. However, I have a similar problem: I have the coordinates of an object taken from a page image (the PDF page was converted into an image), and I want to convert them into camelot's PDF-level coordinates. I tried applying the logic above in reverse, without success. I am new to this, so any hints on converting page-image coordinates to PDF-level coordinates would be appreciated. I have the object coordinates x0, y0, x1, y1 (from the page image), the page image width and height, and the target PDF width and height. Example: (x0, y0, x1, y1) = (188, 393, 1576, 1498); page image (height, width) = (3300, 2550); PDF (height, width) = (792, 612).
@baleris You can try solving these proportions for x and y:
\frac{2550}{612} = \frac{188}{x} \rightarrow x?
\frac{3300}{792} = \frac{393}{y} \rightarrow y?
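A rough sketch of these proportions in code, using the example numbers from the question. It additionally assumes that the image origin is at the top-left corner while PDF coordinates are measured from the bottom-left corner, so the y values are flipped after scaling. The thread does not confirm that this flip is what camelot expects; the function name `image_box_to_pdf` and the returned ordering (left, bottom, right, top) are hypothetical choices for illustration.

```python
def image_box_to_pdf(box, image_size, pdf_size):
    """Convert an (x0, y0, x1, y1) box from image pixels to PDF units.

    Assumptions (not confirmed in the thread):
      * image origin is top-left, y grows downward
      * PDF origin is bottom-left, y grows upward
    image_size and pdf_size are (width, height) tuples.
    """
    x0, y0, x1, y1 = box
    img_w, img_h = image_size
    pdf_w, pdf_h = pdf_size

    # Proportional scaling, as in the fractions above.
    left = x0 * pdf_w / img_w
    right = x1 * pdf_w / img_w

    # Scale, then flip the vertical axis.
    top = pdf_h - y0 * pdf_h / img_h
    bottom = pdf_h - y1 * pdf_h / img_h

    return (left, bottom, right, top)


pdf_box = image_box_to_pdf(
    box=(188, 393, 1576, 1498),
    image_size=(2550, 3300),  # width, height in pixels
    pdf_size=(612, 792),      # width, height in PDF units
)
print([round(v, 2) for v in pdf_box])  # [45.12, 432.48, 378.24, 697.68]
```

Without the flip, the scaled y values would be 94.32 and 359.52, which is what the two proportions give on their own.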
@SWHL, this did not work. When I checked the table coordinates detected by camelot, they were completely different. For example, for the coordinates mentioned above, camelot's corresponding coordinates are (72.0, 295.2, 563.04, 648.72).
@SWHL I see that in your solution above you get the page image from img = table._image[0].
If I have a borderless table and want to pass flavor='stream', i.e. camelot.read_pdf(src, flavor='stream'), how can I get the image in this case? If I try table._image[0] as before, I get an error message.
Any suggestions for getting the image with the "stream" flavor / borderless tables?
You can refer to this: https://github.com/atlanhq/camelot/blob/cd8ac7979fe3631866fe439f07e9d6aaa5b1e5c6/tests/test_common.py#L35-L40
This question is beyond the scope of this issue. I suggest opening a new issue to discuss it.
@SWHL As suggested, I have raised a new issue: #497
Checklist
Describe the bug
Environment
OS: CentOS 7
Python: 3.7.11
camelot-py: 0.10.1
Reproduction
Run the following code: (foo.pdf)
Bug fix