Closed Fabrice-64 closed 1 year ago
Thank you for creating this issue. It looks to me like the issue is that you have two import statements in one line.
Can you please try this:
import spacy
from spacypdfreader import pdf_reader
nlp = spacy.load("en_core_web_sm")
doc = pdf_reader("tests/data/test_pdf_01.pdf", nlp)
# Get the page number of any token.
print(doc[0]._.page_number) # 1
print(doc[-1]._.page_number) # 4
# Get page meta data about the PDF document.
print(doc._.pdf_file_name) # "tests/data/test_pdf_01.pdf"
print(doc._.page_range) # (1, 4)
print(doc._.first_page) # 1
print(doc._.last_page) # 4
# Get all of the text from a specific PDF page.
print(doc._.page(4)) # "able to display the destination page (unless..."
If you continue to have issues, please also share:
python --version
pip freeze
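If it is easier to collect that information from inside the failing environment, a minimal sketch using only the standard library could look like this. The helper names `package_versions` and `environment_report` are made up for illustration, and it assumes the distributions are installed under the names `spacy` and `spacypdfreader`.

```python
import sys
from importlib import metadata


def package_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


def environment_report(names):
    """Build a short text report: interpreter version, then one line per package."""
    lines = [f"python {sys.version.split()[0]}"]
    for name, version in package_versions(names).items():
        lines.append(f"{name} {version or 'not installed'}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Assumed distribution names; adjust if your install uses different ones.
    print(environment_report(["spacy", "spacypdfreader"]))
```

Running it with the same interpreter that raises the error matters, since a venv, Colab and conda can each have different versions installed.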
Problem finally solved.
I created a new venv.
python --version reports Python 3.9.13
I use VS Code on macOS Ventura 13.3.1
I had to install a bunch of additional packages each time an error was thrown. You'll find them in the attached requirements.txt
requirements.txt
In addition, I had to slightly adapt the import, possibly because of a recent change described on the spaCy page for the spacypdfreader project:
from spacypdfreader.spacypdfreader import pdf_reader
Then everything worked.
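For code that has to run against both layouts, one option is to try the two import paths mentioned in this thread in order. This is only a sketch: `import_first` is a hypothetical helper, not part of spacypdfreader, and which path works depends on the installed version.

```python
import importlib


def import_first(attr, module_names):
    """Return `attr` from the first module in `module_names` that provides it."""
    errors = []
    for name in module_names:
        try:
            module = importlib.import_module(name)
            return getattr(module, attr)
        except (ImportError, AttributeError) as exc:
            # Remember why this candidate failed, then try the next one.
            errors.append(f"{name}: {exc}")
    raise ImportError(f"could not import {attr!r}; tried: " + "; ".join(errors))
```

With the package installed, `pdf_reader = import_first("pdf_reader", ["spacypdfreader", "spacypdfreader.spacypdfreader"])` would pick whichever of the two locations exposes the function, and raise an `ImportError` listing both failures otherwise.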
Following the instructions for spacypdfreader
import spacy from spacypdfreader import pdf_reader
I get the following error message:
Traceback (most recent call last):
  File "/Users/my_name/apprendre-dev/pdfreader/spacy.py", line 1, in <module>
    import spacy
  File "/Users/my_name//apprendre-dev/pdfreader/spacy.py", line 2, in <module>
    from spacypdfreader import pdf_reader
ImportError: cannot import name 'pdf_reader' from 'spacypdfreader' (/Users/my_name//apprendre-dev/pdfreader/venv/lib/python3.10/site-packages/spacypdfreader/__init__.py)
I get exactly the same result on Colab and conda. Is there a change in the package that has not been reported in the user guide?
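One thing visible in the traceback is that the script itself is named spacy.py and that `import spacy` on line 1 re-enters that same file. This suggests the local file may be shadowing the installed spaCy package, although the thread does not confirm this as the cause. A minimal check, assuming nothing beyond the standard library (`module_origin` is a hypothetical helper name):

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module name resolves to, or None if not found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # If this prints a path inside your project rather than site-packages,
    # a local file is being imported instead of the installed package.
    print(module_origin("spacy"))
```

If the printed path points at the project directory, renaming the script to something other than spacy.py would be worth trying.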