Open Rammurthy5 opened 4 years ago
This looks like a network connectivity issue. Is your computer connected to the Internet through a corporate firewall? If you start a Python 3 session on your Windows box, does the following code run without raising an exception?
import urllib.request
r = urllib.request.urlopen('https://archive.apache.org/dist/pdfbox/')
data = r.read()
No, it doesn't. It throws the same error. How do I fix it? Can I add a proxy address to it?
You can try setting the environment variable http_proxy or https_proxy (depending on the protocol) to the URI of your proxy before importing pdfbox.
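In case it helps, here is a minimal sketch of setting the proxy from inside Python rather than in the Windows environment settings. The proxy address `http://proxy.example.com:8080` is a made-up placeholder, not a real value; substitute whatever your network administrator gives you. The call to `urllib.request.getproxies()` is only there to confirm that urllib picks the setting up.

```python
import os
import urllib.request

# Hypothetical proxy URI (an assumption for illustration); replace with your own.
PROXY_URI = "http://proxy.example.com:8080"

# urllib reads these variables when it opens a connection,
# so they must be set before the download is attempted.
os.environ["http_proxy"] = PROXY_URI
os.environ["https_proxy"] = PROXY_URI

# Confirm that urllib now sees the proxy for both protocols.
proxies = urllib.request.getproxies()
print(proxies.get("http"), proxies.get("https"))

# Only import pdfbox after the variables are in place:
# import pdfbox
```

Setting the variables in the script keeps the change local to that Python process; setting them system-wide would affect other programs as well.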
Another possibility is to set the user agent to that of a common web browser, as some firewalls block HTTP requests that do not appear to come from the latter; try the following code and see whether it throws an error:
import urllib.request
req = urllib.request.Request(
    url='https://archive.apache.org/dist/pdfbox/',
    data=None,
    headers={
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:79.0) Gecko/20100101 Firefox/79.0'
    }
)
r = urllib.request.urlopen(req)
data = r.read()
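If it is unclear which of the two checks fails and why, a small helper along these lines may make the result easier to report back. This is only a sketch: `build_request` and `describe_fetch` are names invented here and are not part of pdfbox or urllib. It issues the same request as above, optionally with the browser user agent, and returns a one-line summary instead of a traceback.

```python
import urllib.error
import urllib.request

URL = "https://archive.apache.org/dist/pdfbox/"
BROWSER_UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:79.0) "
              "Gecko/20100101 Firefox/79.0")


def build_request(url, user_agent=None):
    """Build a GET request, adding a User-Agent header only if one is given."""
    headers = {"User-Agent": user_agent} if user_agent else {}
    return urllib.request.Request(url, data=None, headers=headers)


def describe_fetch(url, user_agent=None, timeout=10):
    """Try to fetch url and return a short description of what happened."""
    try:
        with urllib.request.urlopen(build_request(url, user_agent),
                                    timeout=timeout) as r:
            return f"OK: HTTP {r.status}, {len(r.read())} bytes"
    except urllib.error.HTTPError as exc:
        # The server (or a filtering proxy) answered with an error status.
        return f"HTTP error {exc.code}: {exc.reason}"
    except urllib.error.URLError as exc:
        # No usable answer at all: DNS failure, refused connection, TLS problem.
        return f"Connection problem: {exc.reason}"
```

Calling `print(describe_fetch(URL))` and then `print(describe_fetch(URL, BROWSER_UA))` and posting both lines here would show whether the user agent makes a difference.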
When I merely import pdfbox and call PDFBox(), it immediately throws the following error message. Please help.