lebedov / python-pdfbox

Python interface to Apache PDFBox command-line tools.

Can we get the font name and size of the text from a PDF? #30

Status: Closed (jigsawcoder closed this 2 years ago)

jigsawcoder commented 2 years ago

Is it possible to get the font name and size of the text extracted from a PDF using python-pdfbox? If so, how?

Thanks

lebedov commented 2 years ago

From what I can tell, this information is not exposed by the PDFBox command-line interface that python-pdfbox wraps. To access it from Python, you will have to wrap the relevant part of the Java API yourself; see this gist for an example of how to do so. This post may also be helpful.
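
For anyone landing here later, one possible way to do that wrapping is sketched below. It is not the approach from the gist, just an illustration under stated assumptions. In the PDFBox Java API, per-glyph font details are available on the `TextPosition` objects passed to `PDFTextStripper.writeString`, so a small Java subclass can print them. The sketch assumes you have written and compiled such a helper yourself (the class name `FontInfoStripper` is hypothetical) and that it prints one line per text run in the form `font_name<TAB>font_size<TAB>text`. The Python side only runs the helper and parses its output.

```python
import subprocess
from typing import List, NamedTuple


class FontRun(NamedTuple):
    """One run of text together with the font it was drawn in."""
    font_name: str
    font_size: float
    text: str


def parse_font_runs(output: str) -> List[FontRun]:
    """Parse lines of the assumed form 'font_name<TAB>font_size<TAB>text'."""
    runs = []
    for line in output.splitlines():
        # Split on the first two tabs only, so tabs inside the text survive.
        parts = line.split("\t", 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        name, size, text = parts
        try:
            runs.append(FontRun(name, float(size), text))
        except ValueError:
            continue  # skip lines whose size field is not a number
    return runs


def extract_font_runs(pdf_path: str, classpath: str,
                      helper_class: str = "FontInfoStripper") -> List[FontRun]:
    """Run the user-supplied Java helper and parse what it prints.

    ASSUMPTION: 'helper_class' is a hypothetical class you compiled yourself
    (a PDFTextStripper subclass that prints font info); it is not part of
    PDFBox or python-pdfbox. 'classpath' must include it and the PDFBox jar.
    """
    result = subprocess.run(
        ["java", "-cp", classpath, helper_class, pdf_path],
        capture_output=True, text=True, check=True,
    )
    return parse_font_runs(result.stdout)
```

Calling `extract_font_runs("doc.pdf", "pdfbox-app.jar:.")` would then return a list of `FontRun` tuples, assuming the helper exists on that classpath (the jar name and the `:` separator are placeholders; Windows uses `;`). Going through a subprocess keeps the Python side free of a JVM bridge, at the cost of starting a JVM on every call.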