eubinecto / youtora

Search YouTube videos like you search books

SSLError with scrapers #168

Open eubinecto opened 3 years ago

eubinecto commented 3 years ago

Why does the scraper fail with an SSLError here?

INFO:scrape_multi:=== scraping video_raw:(done=271, skipped=10, total=1528) ===
INFO:_scrape_video_info:loading video_info...
WARNING: video doesn't have subtitles
INFO:_scrape_main_html:loading main_html...
INFO:exec:video_raw saved: #272
INFO:parse:FOUND:auto:ko
INFO:parse:FOUND:auto:ja
INFO:parse:FOUND:auto:en
WARNING:parse:NOT FOUND:no manual nor auto:en-GB
INFO:parse:FOUND:auto:fr
INFO:_scrape_raw_xml:loading raw xml...:https://www.youtube.com/api/timedtext?v=2kE8La9drWA&asr_langs=de%2Cen%2Ces%2Cfr%2Cit%2Cja%2Cko%2Cnl%2Cpt%2Cru&caps=asr&xorp=true&xoaf=5&hl=en&ip=0.0.0.0&ipbits=0&expire=1603968901&sparams=ip%2Cipbits%2Cexpire%2Cv%2Casr_langs%2Ccaps%2Cxorp%2Cxoaf&signature=C8BFC2831F250916841A3ADA8424D27533A780DC.6B4EEB66441D7FE776E8C0DCF79C447DAD55F610&key=yt8&kind=asr&lang=ko&tlang=ko&fmt=srv1
INFO:exec:tracks_raw saved #1
INFO:_scrape_raw_xml:loading raw xml...:https://www.youtube.com/api/timedtext?v=2kE8La9drWA&asr_langs=de%2Cen%2Ces%2Cfr%2Cit%2Cja%2Cko%2Cnl%2Cpt%2Cru&caps=asr&xorp=true&xoaf=5&hl=en&ip=0.0.0.0&ipbits=0&expire=1603968901&sparams=ip%2Cipbits%2Cexpire%2Cv%2Casr_langs%2Ccaps%2Cxorp%2Cxoaf&signature=C8BFC2831F250916841A3ADA8424D27533A780DC.6B4EEB66441D7FE776E8C0DCF79C447DAD55F610&key=yt8&kind=asr&lang=ko&tlang=ja&fmt=srv1
INFO:exec:tracks_raw saved #2
INFO:_scrape_raw_xml:loading raw xml...:https://www.youtube.com/api/timedtext?v=2kE8La9drWA&asr_langs=de%2Cen%2Ces%2Cfr%2Cit%2Cja%2Cko%2Cnl%2Cpt%2Cru&caps=asr&xorp=true&xoaf=5&hl=en&ip=0.0.0.0&ipbits=0&expire=1603968901&sparams=ip%2Cipbits%2Cexpire%2Cv%2Casr_langs%2Ccaps%2Cxorp%2Cxoaf&signature=C8BFC2831F250916841A3ADA8424D27533A780DC.6B4EEB66441D7FE776E8C0DCF79C447DAD55F610&key=yt8&kind=asr&lang=ko&tlang=en&fmt=srv1
Traceback (most recent call last):
  File "/Users/eubin/Desktop/Projects/Big/youtora/ytrenv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 670, in urlopen
    httplib_response = self._make_request(
  File "/Users/eubin/Desktop/Projects/Big/youtora/ytrenv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 381, in _make_request
    self._validate_conn(conn)
  File "/Users/eubin/Desktop/Projects/Big/youtora/ytrenv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 978, in _validate_conn
    conn.connect()
  File "/Users/eubin/Desktop/Projects/Big/youtora/ytrenv/lib/python3.8/site-packages/urllib3/connection.py", line 362, in connect
    self.sock = ssl_wrap_socket(
  File "/Users/eubin/Desktop/Projects/Big/youtora/ytrenv/lib/python3.8/site-packages/urllib3/util/ssl_.py", line 384, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/ssl.py", line 1040, in _create
    self.do_handshake()
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:1108)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/eubin/Desktop/Projects/Big/youtora/ytrenv/lib/python3.8/site-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/Users/eubin/Desktop/Projects/Big/youtora/ytrenv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 726, in urlopen
    retries = retries.increment(
  File "/Users/eubin/Desktop/Projects/Big/youtora/ytrenv/lib/python3.8/site-packages/urllib3/util/retry.py", line 439, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.youtube.com', port=443): Max retries exceeded with url: /api/timedtext?v=2kE8La9drWA&asr_langs=de%2Cen%2Ces%2Cfr%2Cit%2Cja%2Cko%2Cnl%2Cpt%2Cru&caps=asr&xorp=true&xoaf=5&hl=en&ip=0.0.0.0&ipbits=0&expire=1603968901&sparams=ip%2Cipbits%2Cexpire%2Cv%2Casr_langs%2Ccaps%2Cxorp%2Cxoaf&signature=C8BFC2831F250916841A3ADA8424D27533A780DC.6B4EEB66441D7FE776E8C0DCF79C447DAD55F610&key=yt8&kind=asr&lang=ko&tlang=en&fmt=srv1 (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1108)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "manage.py", line 22, in <module>
    main()
  File "manage.py", line 18, in main
    execute_from_command_line(sys.argv)
  File "/Users/eubin/Desktop/Projects/Big/youtora/ytrenv/lib/python3.8/site-packages/django/core/management/__init__.py", line 401, in execute_from_command_line
    utility.execute()
  File "/Users/eubin/Desktop/Projects/Big/youtora/ytrenv/lib/python3.8/site-packages/django/core/management/__init__.py", line 395, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/Users/eubin/Desktop/Projects/Big/youtora/ytrenv/lib/python3.8/site-packages/django/core/management/base.py", line 328, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/Users/eubin/Desktop/Projects/Big/youtora/ytrenv/lib/python3.8/site-packages/django/core/management/base.py", line 369, in execute
    output = self.handle(*args, **options)
  File "/Users/eubin/Desktop/Projects/Big/youtora/youtora/collect/management/commands/scrape.py", line 31, in handle
    ScrapeYouTubeRaws.exec(channel_id, lang_code)
  File "/Users/eubin/Desktop/Projects/Big/youtora/youtora/collect/facades.py", line 66, in exec
    for track_idx, tracks_raw in enumerate(tracks_raw_gen):
  File "/Users/eubin/Desktop/Projects/Big/youtora/youtora/collect/scrapers.py", line 105, in <genexpr>
    cls.scrape(caption)
  File "/Users/eubin/Desktop/Projects/Big/youtora/youtora/collect/scrapers.py", line 94, in scrape
    raw_xml = cls._scrape_raw_xml(caption.url)
  File "/Users/eubin/Desktop/Projects/Big/youtora/youtora/collect/scrapers.py", line 113, in _scrape_raw_xml
    response = requests.get(caption_url)  # first, get the response (download)
  File "/Users/eubin/Desktop/Projects/Big/youtora/ytrenv/lib/python3.8/site-packages/requests/api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "/Users/eubin/Desktop/Projects/Big/youtora/ytrenv/lib/python3.8/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/Users/eubin/Desktop/Projects/Big/youtora/ytrenv/lib/python3.8/site-packages/requests/sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "/Users/eubin/Desktop/Projects/Big/youtora/ytrenv/lib/python3.8/site-packages/requests/sessions.py", line 643, in send
    r = adapter.send(request, **kwargs)
  File "/Users/eubin/Desktop/Projects/Big/youtora/ytrenv/lib/python3.8/site-packages/requests/adapters.py", line 514, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='www.youtube.com', port=443): Max retries exceeded with url: /api/timedtext?v=2kE8La9drWA&asr_langs=de%2Cen%2Ces%2Cfr%2Cit%2Cja%2Cko%2Cnl%2Cpt%2Cru&caps=asr&xorp=true&xoaf=5&hl=en&ip=0.0.0.0&ipbits=0&expire=1603968901&sparams=ip%2Cipbits%2Cexpire%2Cv%2Casr_langs%2Ccaps%2Cxorp%2Cxoaf&signature=C8BFC2831F250916841A3ADA8424D27533A780DC.6B4EEB66441D7FE776E8C0DCF79C447DAD55F610&key=yt8&kind=asr&lang=ko&tlang=en&fmt=srv1 (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1108)')))
INFO:scrape_multi:=== scraping video_raw:(done=272, skipped=10, total=1528) ===

eubinecto commented 3 years ago

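Possible cause of this error?

Probably an unstable internet connection: the SSLEOFError in the traceback is raised from do_handshake(), i.e. the connection was closed before the TLS handshake completed.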
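
If the failure really is transient, one option is to retry the download a few times before giving up. Below is a minimal sketch of that idea, not code from youtora: `call_with_retry` and its parameters are hypothetical names introduced for illustration. It retries on `OSError` by default, which covers both `ssl.SSLError` and `requests.exceptions.SSLError`, since both derive from it.

```python
import time


def call_with_retry(func, *args, attempts=3, base_delay=1.0,
                    exceptions=(OSError,), sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying when a transient error is raised.

    Hypothetical helper (not part of youtora or requests).
    attempts:   total number of calls to make before giving up.
    base_delay: seconds to wait after the first failure; doubled each retry.
    exceptions: exception types treated as transient.
    sleep:      injectable so the wait can be skipped in tests.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func(*args, **kwargs)
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: surface the last error unchanged
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Assuming `_scrape_raw_xml` still fetches the track with `requests.get(caption_url)` as in the traceback, the call could become `response = call_with_retry(requests.get, caption_url, timeout=10)`. Whether to retry or to skip the failed track and move on to the next one is a design choice; retrying only helps if the connection recovers within a few seconds.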