Closed debuggerpk closed 8 years ago
The problem's probably with your class.
It works for me (OS X, Python 2.7.11, Pillow 3.3.1) when calling this:
import io, requests
from PIL import Image
response = requests.get('http://www.worldbank.org/content/dam/wbr/About/Pres/jyk-hs-offical.png')
response.raw.decode_content = True
image = Image.open(io.BytesIO(response.content))
print image.size
And when calling the def:
def _fetch_image_size(image_url):
    size = None
    if '.svg' not in image_url:
        response = requests.get(image_url)
        if response.status_code == 200:
            response.raw.decode_content = True
            try:
                image = Image.open(io.BytesIO(response.content))
                size = image.size
            except (IOError, OSError) as error:
                print error
                print image_url
        response.close()
    return size
_fetch_image_size("http://www.worldbank.org/content/dam/wbr/About/Pres/jyk-hs-offical.png")
But note that's with headers=self._headers removed, and I don't know what you've got in there, and that could be causing problems.
Does it work for you with this def? What's your full class? Strip it down as far as possible to resemble your simpler calls, and see at which point the problem occurs. It'll be easier if you put each in a script and run them from there.
Thank you for your response. I have diagnosed the problem. Pasting the problematic code from my class here.
def _extract_image_urls(self, soup):
    """
    extracts all the <img src=''> tags
    Args:
        soup (obj): the BeautifulSoup object
    Returns:
        url (str): string for url
    """
    for img in soup.findAll("img", src=True):
        yield urlparse.urljoin(self._url, img["src"])
The code above gets me all the URLs I need to pass on to my _fetch_image_size() function. I modified my _fetch_image_size function to include this:
.....
response = requests.get(image_url, headers=self._headers)
print 'Request URL: {url}'.format(url=image_url)
print 'Response URL: {url}'.format(url=response.url)
.....
And here is the output:
s = LinkScraper('http://www.worldbank.org/en/about/president/about-the-office/bio')
Request URL: http://www.worldbank.org/content/dam/wbr/img/mobile-menu-lines.png
Response URL: http://www.worldbank.org/content/dam/wbr/img/mobile-menu-lines.png
Request URL: http://www.worldbank.org/etc/designs/wbr/clientlibs/img/icon-search-black.png
Response URL: http://www.worldbank.org/etc/designs/wbr/clientlibs/img/icon-search-black.png
Request URL: http://www.worldbank.org/content/dam/wbr/About/Pres/jyk-hs-offical.png
Response URL: http://www.worldbank.org/404_response.htm
Error: cannot identify image file <_io.BytesIO object at 0x112414830>
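The data being passed into the PIL.Image function is the HTTP response for a 404 page, not an image: the request was redirected to http://www.worldbank.org/404_response.htm. Nothing is wrong with PIL here. I probably need to sanitize my URLs, perhaps checking for blank spaces.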
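For anyone hitting the same error, here is a minimal sketch of the two checks this diagnosis suggests: tidying the src value before joining it, and making sure the response really is an image before handing it to Image.open(). The helper names clean_src and is_image_response are hypothetical, and the text/html content type for the 404 page is an assumption. Treating any changed URL as a failure is also a simplification, since it would reject harmless redirects too. The stand-in objects use types.SimpleNamespace, which needs Python 3; the two functions themselves should also work on Python 2.7 with real requests responses.

```python
from types import SimpleNamespace


def clean_src(src):
    """Trim stray whitespace from an <img src> value and encode inner spaces."""
    return src.strip().replace(" ", "%20")


def is_image_response(response, requested_url):
    """Return True only if the response plausibly carries image bytes."""
    if response.status_code != 200:
        return False
    # A changed URL means the server redirected us, e.g. to a 404 page.
    if response.url != requested_url:
        return False
    content_type = response.headers.get("Content-Type", "")
    return content_type.startswith("image/")


# Stand-ins for requests.Response objects, so the sketch runs offline.
url = "http://www.worldbank.org/content/dam/wbr/About/Pres/jyk-hs-offical.png"
good = SimpleNamespace(status_code=200, url=url,
                       headers={"Content-Type": "image/png"})
# Assumed shape of the redirected response seen in the output above.
redirected = SimpleNamespace(status_code=200,
                             url="http://www.worldbank.org/404_response.htm",
                             headers={"Content-Type": "text/html"})

ok_good = is_image_response(good, url)
ok_redirected = is_image_response(redirected, url)
cleaned = clean_src("  /content/dam/my image.png \n")
```

In _fetch_image_size, the status_code == 200 test could be replaced by is_image_response(response, image_url), so that a redirect to an HTML page is skipped instead of reaching Image.open().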
Thank you @hugovk for the response.
What did you do?
Called Image.open() inside the class, a piece of code that works when called from the interpreter.
What did you expect to happen?
It should give me the size of the image.
What actually happened?
It raised an error; see below.
What versions of Pillow and Python are you using?
python 2.7, pillow 3.3.1
The problem statement
The function below, part of a bigger class, works fine on all images except this one -
When the above function is called as part of the class object, it raises this error.
However, in the command-line interpreter, when I do this
the output is,
I am unable to figure out why this is happening. Attaching the screenshot.
I am not really sure whether to raise it here or somewhere else.