TyVik / rating_kinopoisk

Export your own movie ratings from kinopoisk.ru.
GNU General Public License v3.0

ERROR: Spider error processing -- captcha? #1

Open anxieuse opened 2 years ago

anxieuse commented 2 years ago

An error occurred when running the script:

$ scrapy.exe crawl kinopoisk -o ratings.csv -t csv -a user_id=59546852

Full log: https://pastebin.com/SYVBkNh2

...
2022-02-06 17:05:47 [scrapy.core.scraper] ERROR: Spider error processing <GET https://www.kinopoisk.ru/showcaptcha?cc=1&retpath=https%3A//www.kinopoisk.ru/user/59546852/votes/list/ord/date/perpage/200%3F_d174268107deece13b300d8cdf3b6bcd&t=2/1644156373/d31d4f018740bb75b13a38e6df1bd851&u=d1ed6e63-5d950ef7-bd8c86ef-3609cc1a&s=bf5f1752845a3fbfbeab222b13d55f85> (referer: None)
Traceback (most recent call last):
  File "c:\users\leo\appdata\local\programs\python\python39\lib\site-packages\scrapy\utils\defer.py", line 120, in iter_errback
    yield next(it)
  File "c:\users\leo\appdata\local\programs\python\python39\lib\site-packages\scrapy\utils\python.py", line 353, in __next__
    return next(self.data)
  File "c:\users\leo\appdata\local\programs\python\python39\lib\site-packages\scrapy\utils\python.py", line 353, in __next__
    return next(self.data)
  File "c:\users\leo\appdata\local\programs\python\python39\lib\site-packages\scrapy\core\spidermw.py", line 56, in _evaluate_iterable
    for r in iterable:
  File "c:\users\leo\appdata\local\programs\python\python39\lib\site-packages\scrapy\spidermiddlewares\offsite.py", line 29, in process_spider_output
    for x in result:
  File "c:\users\leo\appdata\local\programs\python\python39\lib\site-packages\scrapy\core\spidermw.py", line 56, in _evaluate_iterable
    for r in iterable:
  File "c:\users\leo\appdata\local\programs\python\python39\lib\site-packages\scrapy\spidermiddlewares\referer.py", line 342, in <genexpr>
    return (_set_referer(r) for r in result or ())
  File "c:\users\leo\appdata\local\programs\python\python39\lib\site-packages\scrapy\core\spidermw.py", line 56, in _evaluate_iterable
    for r in iterable:
  File "c:\users\leo\appdata\local\programs\python\python39\lib\site-packages\scrapy\spidermiddlewares\urllength.py", line 40, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "c:\users\leo\appdata\local\programs\python\python39\lib\site-packages\scrapy\core\spidermw.py", line 56, in _evaluate_iterable
    for r in iterable:
  File "c:\users\leo\appdata\local\programs\python\python39\lib\site-packages\scrapy\spidermiddlewares\depth.py", line 58, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "c:\users\leo\appdata\local\programs\python\python39\lib\site-packages\scrapy\core\spidermw.py", line 56, in _evaluate_iterable
    for r in iterable:
  File "C:\Users\leo\kp2imdb\rating_kinopoisk\scraper\scraper\spiders\kinopoisk.py", line 38, in parse_list
    count = int(response.css('table.fontsize10 tr:first-child td')[0].root.text)
  File "c:\users\leo\appdata\local\programs\python\python39\lib\site-packages\parsel\selector.py", line 70, in __getitem__
    o = super(SelectorList, self).__getitem__(pos)
IndexError: list index out of range
2022-02-06 17:05:47 [scrapy.core.engine] INFO: Closing spider (finished)
2022-02-06 17:05:47 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 1652,
 'downloader/request_count': 3,
 'downloader/request_method_count/GET': 3,
 'downloader/response_bytes': 8502,
 'downloader/response_count': 3,
 'downloader/response_status_count/200': 2,
 'downloader/response_status_count/302': 1,
 'elapsed_time_seconds': 0.486238,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2022, 2, 6, 14, 5, 47, 302651),
 'httpcompression/response_bytes': 1277,
 'httpcompression/response_count': 1,
 'log_count/DEBUG': 3,
 'log_count/ERROR': 1,
 'log_count/INFO': 10,
 'response_received_count': 2,
 'robotstxt/request_count': 1,
 'robotstxt/response_count': 1,
 'robotstxt/response_status_count/200': 1,
 'scheduler/dequeued': 2,
 'scheduler/dequeued/memory': 2,
 'scheduler/enqueued': 2,
 'scheduler/enqueued/memory': 2,
 'spider_exceptions/IndexError': 1,
 'start_time': datetime.datetime(2022, 2, 6, 14, 5, 46, 816413)}
2022-02-06 17:05:47 [scrapy.core.engine] INFO: Spider closed (finished)
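The log suggests the request was redirected (302) to a `/showcaptcha` page, so the table selector in `parse_list` matched nothing and indexing `[0]` raised `IndexError`. A minimal sketch of a guard follows, using only the standard library. The helper names `is_captcha_url`, `original_target` and `first_or_none` are hypothetical and not part of the repository; the `/showcaptcha` path and the `retpath` parameter are taken from the URL in the log.

```python
from urllib.parse import parse_qs, urlsplit

CAPTCHA_PATH = "/showcaptcha"  # path of the captcha page seen in the log


def is_captcha_url(url: str) -> bool:
    """Return True if the URL points at the captcha page."""
    return urlsplit(url).path.rstrip("/") == CAPTCHA_PATH


def original_target(url: str):
    """Return the decoded `retpath` query parameter, or None if it is absent."""
    values = parse_qs(urlsplit(url).query).get("retpath")
    return values[0] if values else None


def first_or_none(items):
    """Return the first element of a sequence, or None when it is empty."""
    return items[0] if items else None


# Hypothetical use inside a spider callback such as parse_list:
#     if is_captcha_url(response.url):
#         self.logger.error("Got a captcha instead of %s",
#                           original_target(response.url))
#         return
#     cell = first_or_none(response.css('table.fontsize10 tr:first-child td'))
```

With a check like this the spider would report the captcha explicitly instead of failing with an unrelated-looking `IndexError`. It does not get past the captcha itself.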

TyVik commented 2 years ago

Is the profile cookie set in scraper.spiders.KinopoiskSpider.COOKIES? As far as I know, regular (logged-in) users are not shown the captcha.
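For illustration, one way to turn a `Cookie:` header copied from a logged-in browser session into a dict is sketched below. This assumes `COOKIES` is a plain name-to-value mapping, which is not confirmed in this thread; the helper `cookie_header_to_dict` and the cookie names are placeholders, not the site's real cookies.

```python
from http.cookies import SimpleCookie


def cookie_header_to_dict(header: str) -> dict:
    """Parse a raw `Cookie:` header value into a name -> value dict."""
    jar = SimpleCookie()
    jar.load(header)
    return {name: morsel.value for name, morsel in jar.items()}


# Placeholder header; a real one would be copied from the browser's dev tools.
COOKIES = cookie_header_to_dict("session_id=abc123; user=59546852")

# In Scrapy, a dict like this can be passed with a request:
#     scrapy.Request(url, cookies=COOKIES)
```

Whether the spider expects exactly this shape should be checked against `KinopoiskSpider` in the repository.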