Closed Mankvis closed 1 year ago
Hi @Mankvis, and thanks for your interest in this library. The page you've shared is, unfortunately, an image-based PDF page: it contains a picture of the text but no digitally encoded text that pdfplumber could extract. You can confirm this by trying to select the text on the page and paste it into a text document.
See this comment for an example of what you could do next: https://github.com/jsvine/pdfplumber/discussions/717#discussioncomment-3476384
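If it helps, here is a rough sketch of how the same check could be done in code rather than by selecting text by hand. It relies on the `chars` and `images` properties of a pdfplumber page; the helper names `looks_image_based` and `ocr_page` are made up for illustration and are not part of pdfplumber. The OCR step is only one possible next step, and it assumes that `pytesseract` and a Tesseract install with Chinese language data are available. The `chi_sim` language code (Simplified Chinese) is a guess about your document.

```python
def looks_image_based(page):
    """Heuristic: no text characters on the page, but at least one image.

    `page` is expected to be a pdfplumber page (anything exposing
    `.chars` and `.images` lists works).
    """
    return len(page.chars) == 0 and len(page.images) > 0


def ocr_page(page, lang="chi_sim", resolution=300):
    """Render the page to an image and run OCR on it.

    Assumption: pytesseract and the Tesseract language data for `lang`
    are installed. Imported lazily so the check above works without it.
    """
    import pytesseract

    # to_image() returns a pdfplumber PageImage; .original is the PIL image.
    pil_image = page.to_image(resolution=resolution).original
    return pytesseract.image_to_string(pil_image, lang=lang)


def extract_text_with_fallback(pdf_path, page_number=0):
    """Try extract_text() first and fall back to OCR for image-only pages."""
    import pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        page = pdf.pages[page_number]
        if looks_image_based(page):
            return ocr_page(page)
        return page.extract_text()
```

Calling `extract_text_with_fallback(pdf_path)` would then return OCR output for a scanned page like yours and the normal `extract_text()` result otherwise. OCR quality depends heavily on the scan, so treat the result as approximate.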
Describe the bug
When I use extract_text and extract_words, the output is empty.
Code to reproduce the problem
import pdfplumber

with pdfplumber.open(pdf_path) as pdf:
    first_page = pdf.pages[0]
    print(first_page.extract_text())
PDF file
demo.pdf
Expected behavior
The Chinese text content of the PDF should be output.
Environment