Closed Kiris-tingna closed 6 years ago
like this
import requests
from urllib.parse import urlparse

def image_fetch(self, url: str):
    # make_random_useragent() is defined elsewhere (not shown in this snippet)
    response = requests.get(url, headers={"User-Agent": make_random_useragent()},
                            stream=True, timeout=(3.05, 10))
    payload = urlparse(url).path
    # the file name lies between the last '/' and the first '.' after it
    _left_bound_pos = payload.rfind('/')
    _right_bound_pos = payload.find('.', _left_bound_pos)
    if (payload[_right_bound_pos + 1:] == 'jpeg' or payload[_right_bound_pos + 1:] == 'jpg') and \
            response.headers['Content-Type'] == 'image/jpeg':
        _ext = '.jpeg'
    elif payload[_right_bound_pos + 1:] == 'gif' or response.headers['Content-Type'] == 'image/gif':
        _ext = '.gif'
    else:
        # fall back to .jpeg when neither the URL nor the header is conclusive
        _ext = '.jpeg'
    return payload[_left_bound_pos + 1:_right_bound_pos] + _ext
Could this be added to the framework as a utility? @xianhu
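For reference, the file-name logic in the snippet above can be separated from the HTTP request, which makes it easier to test on its own. The following is only a minimal sketch under stated assumptions: the function name `image_filename`, the `CONTENT_TYPE_EXT` table, and the `"image"` fallback stem are hypothetical and not part of the framework, and it splits on the last `.` in the file name (via `posixpath.splitext`) rather than the first one, which differs slightly from the snippet.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical mapping from Content-Type to file extension (an assumption, extend as needed).
CONTENT_TYPE_EXT = {
    "image/jpeg": ".jpeg",
    "image/gif": ".gif",
    "image/png": ".png",
}


def image_filename(url, content_type=None, default_ext=".jpeg"):
    """Build a local file name from the URL path and an optional Content-Type header."""
    # Only the path matters; query strings and fragments are ignored.
    name = posixpath.basename(urlparse(url).path)
    stem, url_ext = posixpath.splitext(name)

    # Drop header parameters such as "; charset=binary" and normalise case.
    mime = (content_type or "").split(";")[0].strip().lower()

    # Prefer the extension implied by the header, if it is a known image type.
    ext = CONTENT_TYPE_EXT.get(mime)
    if ext is None:
        # Otherwise fall back to the extension found in the URL, if recognised.
        url_ext = url_ext.lower()
        if url_ext == ".jpg":
            url_ext = ".jpeg"
        ext = url_ext if url_ext in CONTENT_TYPE_EXT.values() else default_ext

    # Use a placeholder stem when the URL path has no file name at all.
    return (stem or "image") + ext
```

A caller would pass `response.headers.get("Content-Type")` as the second argument, so a missing header does not raise a `KeyError`.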
A crawler framework does not care what you are scraping; it only cares about generality, stability, and extensibility.
like this