def get_image_type(url):
    for ending in ['jpg', 'jpeg', '.gif' '.png']:
        if url.endswith(ending):
            return ending
    else:
        try:
            f, temp_file_name = tempfile.mkstemp()
            urllib.urlretrieve(url, temp_file_name)
            image_type = imghdr.what(temp_file_name)
            return image_type
        except IOError:
            return None
This single method in chapter.py has 3 bugs:
It lacks url = url.lower(). Extensions are sometimes uppercase, so the check fails and a redundant HTTP request is made to detect the image type.
'.gif' '.png'] is missing a comma, so the two literals are concatenated into '.gif.png' and neither .gif nor .png is ever matched. '.bmp' is also missing, which imghdr will not recognize.
The check should change to if url.endswith(ending) or ((ending + '?') in url):, or else it misses image URLs with ?parameters. Such a URL can itself be an HTML page containing an inner img src, which imghdr will not recognize, although an ePUB editor and a web browser are able to render it.
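The three fixes listed above could be combined roughly as follows. This is only a sketch of the extension check, not the library's actual code: guess_image_type_from_url and IMAGE_ENDINGS are hypothetical names, the list keeps the original entries as they are and only adds the comma and '.bmp', and the download fallback is left to the caller.

```python
# Hypothetical helper, not the library's real code. It covers only the
# extension check; downloading and inspecting the file is left out.
IMAGE_ENDINGS = ['jpg', 'jpeg', '.gif', '.png', '.bmp']  # comma restored, '.bmp' added


def guess_image_type_from_url(url):
    """Return the matching ending, or None if the URL shows no known extension."""
    lowered = url.lower()  # fix 1: uppercase extensions such as .PNG now match
    for ending in IMAGE_ENDINGS:
        # fix 3: also accept an extension that is followed by a query string
        if lowered.endswith(ending) or (ending + '?') in lowered:
            return ending
    return None  # the caller would fall back to fetching the URL here
```

With this check, a URL ending in .PNG or containing .gif?size=large is resolved without an HTTP request, and only URLs with no recognizable extension need the fallback.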
The second place is constants.py: it seems that neither the 'code' nor the 'pre' tag is included. This causes the sample code in https://security.googleblog.com/2009/03/reducing-xss-by-way-of-automatic.html to be dropped, but sample code is important. Also