xianhu / PSpider

A simple, easy-to-use Python crawler framework. QQ discussion group: 597510560
https://github.com/xianhu/PSpider
BSD 2-Clause "Simplified" License
1.83k stars, 504 forks

Consider an image fetcher and saver? #24

Closed: Kiris-tingna closed this issue 6 years ago

Kiris-tingna commented 6 years ago

Something like this:

import os
from urllib.parse import urlparse

import requests

def image_fetch(self, url: str):
        # make_random_useragent is assumed to be PSpider's user-agent helper, imported elsewhere
        response = requests.get(url, headers={"User-Agent": make_random_useragent()}, stream=True, timeout=(3.05, 10))
        # split the last path segment into a base name and a lower-cased extension
        _name, _suffix = os.path.splitext(os.path.basename(urlparse(url).path))
        # headers.get avoids a KeyError when Content-Type is missing; drop any "; charset=..." part
        _content_type = response.headers.get('Content-Type', '').split(';')[0].strip().lower()

        if _suffix.lower() == '.gif' or _content_type == 'image/gif':
            _ext = '.gif'
        else:
            # everything else, including .jpg/.jpeg, is named .jpeg, as in the original branches
            _ext = '.jpeg'

        return _name + _ext

Kiris-tingna commented 6 years ago

This could be added to the framework as a utility, @xianhu

xianhu commented 6 years ago

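A crawler framework does not concern itself with what you crawl; it only considers generality, stability, and ease of extension.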
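
Since the framework stays content-agnostic, a helper like the one proposed above could live in user code instead. The following is only a minimal sketch under that assumption: `image_filename` and `save_image` are hypothetical names, not part of PSpider, and the choice to treat everything that is not a GIF as `.jpeg` simply mirrors the snippet in this thread.

```python
import os
from typing import Iterable
from urllib.parse import urlparse


def image_filename(url: str, content_type: str = "") -> str:
    """Derive a local file name from the URL path and the Content-Type header."""
    name, suffix = os.path.splitext(os.path.basename(urlparse(url).path))
    # Ignore parameters such as "; charset=binary" after the media type.
    media_type = content_type.split(";")[0].strip().lower()
    is_gif = suffix.lower() == ".gif" or media_type == "image/gif"
    ext = ".gif" if is_gif else ".jpeg"
    # Fall back to a placeholder name when the URL path has no last segment.
    return (name or "image") + ext


def save_image(chunks: Iterable[bytes], directory: str, filename: str) -> str:
    """Write an iterable of byte chunks to directory/filename and return the path."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, filename)
    with open(path, "wb") as file:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                file.write(chunk)
    return path
```

With `requests`, the chunks could come from a streamed response, for example `save_image(response.iter_content(chunk_size=8192), "images", image_filename(url, response.headers.get("Content-Type", "")))`. Keeping the naming logic free of network calls makes it easy to test on its own.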