wcember / pypub

Python library to programatically create epub files
MIT License
278 stars 44 forks

Bug fix suggestion #9

Closed limkokhole closed 4 years ago

limkokhole commented 6 years ago

chapter.py:

def get_image_type(url):
    for ending in ['jpg', 'jpeg', '.gif' '.png']:
        if url.endswith(ending):
            return ending
    else:
        try:
            f, temp_file_name = tempfile.mkstemp()
            urllib.urlretrieve(url, temp_file_name)
            image_type = imghdr.what(temp_file_name)
            return image_type
        except IOError:
            return None

This single method has 3 bugs:

  1. It lacks url = url.lower(). Extensions are sometimes uppercase, and then the extension check fails and a redundant HTTP request is made to detect the image type.
  2. '.gif' '.png'] is missing a comma, so Python concatenates the two literals into '.gif.png' and neither .gif nor .png is ever matched. '.bmp' is also missing, which imghdr will not recognize.
  3. The check should be changed to if url.endswith(ending) or ((ending + '?') in url):, or else it misses image URLs with ?parameters. Such a URL can itself return an HTML page containing an inner img src, which imghdr will not recognize, although ePUB editors and web browsers are able to render it.
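
A minimal Python 3 sketch of how the three points above could be combined, for the extension check only. The names guess_image_type_from_url and IMAGE_ENDINGS are hypothetical and not part of pypub. Instead of the (ending + '?') in url test suggested above, it swaps in urllib.parse.urlparse to drop the query string. It also assumes the type should be returned without a leading dot, the way imghdr.what reports it. The download fallback from the original function is left out.

```python
from urllib.parse import urlparse

# Hypothetical constant: endings recognized without downloading the image.
# The missing comma is fixed and '.bmp' is added, per points 2 above.
IMAGE_ENDINGS = ('.jpg', '.jpeg', '.gif', '.png', '.bmp')


def guess_image_type_from_url(url):
    """Return the image type implied by the URL path, or None if unknown.

    Hypothetical helper, not pypub's API. The caller would still need a
    fallback (such as downloading and inspecting the file) when this
    returns None.
    """
    # Lowercase first so 'A.PNG' matches (point 1), then keep only the
    # path so '?parameters' and '#fragments' are ignored (point 3).
    path = urlparse(url.lower()).path
    for ending in IMAGE_ENDINGS:
        if path.endswith(ending):
            # Assumption: return the type without the leading dot.
            return ending.lstrip('.')
    return None
```

Matching on the parsed path rather than on the whole URL avoids a false match when an ending followed by '?' only appears somewhere inside the query string.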

The second place is constants.py: it seems that neither the 'code' nor the 'pre' tag is included. This causes the sample code in https://security.googleblog.com/2009/03/reducing-xss-by-way-of-automatic.html to be dropped, but sample code is important. Also