MarcoCaglia / yt-content-analysis

Contains code to source information about YouTube content creators and their videos and visualize it in a Django dashboard.
GNU General Public License v3.0

Critical Bug: Cannot scrape likes anymore #11

Open MarcoCaglia opened 3 years ago

MarcoCaglia commented 3 years ago

```
Traceback (most recent call last):
  File "/home/marco/.cache/pypoetry/virtualenvs/yt-content-analysis-l8OwqWUW-py3.8/lib/python3.8/site-packages/scrapy/utils/defer.py", line 120, in iter_errback
    yield next(it)
  File "/home/marco/.cache/pypoetry/virtualenvs/yt-content-analysis-l8OwqWUW-py3.8/lib/python3.8/site-packages/scrapy/utils/python.py", line 353, in __next__
    return next(self.data)
  File "/home/marco/.cache/pypoetry/virtualenvs/yt-content-analysis-l8OwqWUW-py3.8/lib/python3.8/site-packages/scrapy/utils/python.py", line 353, in __next__
    return next(self.data)
  File "/home/marco/.cache/pypoetry/virtualenvs/yt-content-analysis-l8OwqWUW-py3.8/lib/python3.8/site-packages/scrapy/core/spidermw.py", line 62, in _evaluate_iterable
    for r in iterable:
  File "/home/marco/.cache/pypoetry/virtualenvs/yt-content-analysis-l8OwqWUW-py3.8/lib/python3.8/site-packages/scrapy/spidermiddlewares/offsite.py", line 29, in process_spider_output
    for x in result:
  File "/home/marco/.cache/pypoetry/virtualenvs/yt-content-analysis-l8OwqWUW-py3.8/lib/python3.8/site-packages/scrapy/core/spidermw.py", line 62, in _evaluate_iterable
    for r in iterable:
  File "/home/marco/.cache/pypoetry/virtualenvs/yt-content-analysis-l8OwqWUW-py3.8/lib/python3.8/site-packages/scrapy/spidermiddlewares/referer.py", line 340, in <genexpr>
    return (_set_referer(r) for r in result or ())
  File "/home/marco/.cache/pypoetry/virtualenvs/yt-content-analysis-l8OwqWUW-py3.8/lib/python3.8/site-packages/scrapy/core/spidermw.py", line 62, in _evaluate_iterable
    for r in iterable:
  File "/home/marco/.cache/pypoetry/virtualenvs/yt-content-analysis-l8OwqWUW-py3.8/lib/python3.8/site-packages/scrapy/spidermiddlewares/urllength.py", line 37, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/home/marco/.cache/pypoetry/virtualenvs/yt-content-analysis-l8OwqWUW-py3.8/lib/python3.8/site-packages/scrapy/core/spidermw.py", line 62, in _evaluate_iterable
    for r in iterable:
  File "/home/marco/.cache/pypoetry/virtualenvs/yt-content-analysis-l8OwqWUW-py3.8/lib/python3.8/site-packages/scrapy/spidermiddlewares/depth.py", line 58, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/home/marco/.cache/pypoetry/virtualenvs/yt-content-analysis-l8OwqWUW-py3.8/lib/python3.8/site-packages/scrapy/core/spidermw.py", line 62, in _evaluate_iterable
    for r in iterable:
  File "/home/marco/Documents/repositories/MarcoCaglia/yt-content-analysis/content_sourcing/asmr_scraper/spiders/content_crawler.py", line 89, in get_video_info
    item["likes"] = response.selector.css("yt-formatted-string::attr(aria-label)").extract()[0]
```
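
The traceback is cut off before the exception line, so the actual error is not confirmed. Assuming the selector matched nothing and `extract()[0]` failed on an empty list, a minimal sketch of a more defensive lookup could look like the following. The helper name `first_label` is hypothetical and not part of the repository.

```python
from typing import Optional, Sequence


def first_label(labels: Sequence[str]) -> Optional[str]:
    """Return the first extracted aria-label, or None when nothing matched.

    Hypothetical helper: it avoids indexing into an empty list, which is
    the assumed (unconfirmed) cause of the failure on line 89.
    """
    return labels[0] if labels else None


# Possible use inside the spider callback (sketch, not the author's code):
#   labels = response.selector.css(
#       "yt-formatted-string::attr(aria-label)"
#   ).extract()
#   item["likes"] = first_label(labels)
#
# Scrapy's selector lists also offer .get(), which returns None instead of
# raising when there is no match.
```

With this, a page without a matching element would yield `None` for `likes` rather than aborting the callback, which also makes it easier to log the affected URL.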
MarcoCaglia commented 3 years ago

Cannot reproduce this bug anymore.

MarcoCaglia commented 3 years ago

This can be reproduced by enabling job pausing, even if pausing is never actively used.
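
For anyone trying to reproduce this: in Scrapy, job pausing is switched on through the `JOBDIR` setting. Assuming that is the feature meant here, a reproduction setup might look like the sketch below. The directory name is hypothetical.

```python
# settings.py fragment (sketch, not taken from the repository).
# JOBDIR tells Scrapy where to persist crawl state so that a job can be
# paused and resumed later. The path below is a hypothetical example.
JOBDIR = "crawls/likes-debug-1"
```

The same setting can be passed on the command line with `-s JOBDIR=crawls/likes-debug-1` when starting the crawl. Scrapy keeps persisted state in that directory between runs, so using a fresh directory for each attempt may help rule out leftover state as a factor.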