xenova / chat-downloader

A simple tool used to retrieve chat messages from livestreams, videos, clips and past broadcasts. No authentication needed!
https://chat-downloader.readthedocs.io/
MIT License
902 stars 127 forks

[BUG] YouTube changed something. Unable to parse initial video data. #219

Closed pboettcher closed 1 year ago

pboettcher commented 1 year ago

Basic information

Describe the bug

Doesn't download. Unable to parse initial video data.

Command/Code used

./chat_downloader https://www.youtube.com/watch?v=xDf2mq_v7II

If running from the command line, provide the following:

  1. The command used (including the verbose tag, -v):
    /usr/local/bin/chat_downloader -v https://www.youtube.com/watch?v=xDf2mq_v7II > chd.txt 2>&1
  2. Output from the above command:
    
    [DEBUG] Python version: 3.11.0a4 (main, Aug 19 2022, 10:37:08) [GCC 4.8.5 20150623 (Red Hat 4.8.5-44)]
    [DEBUG] Program version: 0.2.7
    [DEBUG] Initialisation parameters: {'headers': None, 'cookies': None, 'proxy': None}
    [DEBUG] Created YouTubeChatDownloader session.
    [INFO] Site: youtube.com
    [DEBUG] Program parameters: {'url': 'https://www.youtube.com/watch?v=xDf2mq_v7II', 'start_time': None, 'end_time': None, 'max_attempts': 15, 'retry_timeout': None, 'interruptible_retry': True, 'timeout': None, 'inactivity_timeout': None, 'max_messages': None, 'message_groups': ['messages'], 'message_types': None, 'output': None, 'overwrite': True, 'sort_keys': True, 'indent': 4, 'format': 'youtube', 'format_file': None, 'chat_type': 'live', 'ignore': None, 'message_receive_timeout': 0.1, 'buffer_size': 4096}
    [DEBUG] Starting new HTTPS connection (1): www.youtube.com:443
    [DEBUG] https://www.youtube.com:443 "GET /watch?v=xDf2mq_v7II HTTP/1.1" 302 0
    [DEBUG] Starting new HTTPS connection (1): consent.youtube.com:443
    [DEBUG] https://consent.youtube.com:443 "GET /m?continue=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DxDf2mq_v7II%26cbrd%3D1&gl=DE&m=0&pc=yt&cm=4&hl=en&src=1 HTTP/1.1" 200 None
    [DEBUG] <!doctype html><html lang="en" dir="ltr"><head><base href="https://consent.youtube.com/"><meta name="referrer" content="origin"><link rel="canonical" href="https://consent.youtube.com/m"><meta name="viewport" content="initial-scale=1,minimum-scale=1,maximum-scale=5,width=device-width"><link rel="shortcut icon" href="//www.google.com/favicon.ico"><script data-id="_gd" nonce="XGV6Ko3fPZ71Pc5KTU8cNw">window.WIZ_global_data = {"Aocu9c":false,"DndLYb":"","DpimGf":false,"EP1ykd":["/_/*","/signedin","/signedin/*"],"FdrFJe":"-565059968392955702","FoW6je":false,"GVlsxf":"www.google.com","Im6cmf":"/_/ConsentUi","LVIXXb":1,"LoQv7e":false,"MT7f9b":[],"MZbqOb":"//www.google.com/setprefs?cs\u003d2\u0026sig\u003d0_YSBC3pon58bCdLhAosNVfbN2s6Q%3D\u0026prev\u003dhttps://consent.youtube.com/m?continue%3Dhttps://www.youtube.com/watch?v%253DxDf2mq_v7II%2526cbrd%253D1%26gl%3DDE%26m%3D0%26pc%3Dyt%26cm%3D4%26hl%3Den%26src%3D1\u0026hl\u003den","MuJWjd":false,"Mypbod":"https://www.googleapis.com/reauth","PYFuDc":"DUMMY_X_CLIENT_DATA_WIZ_GLOBAL_KEY_DO_NOT_USE","QrtxK":"","S06Grb":"","S6lZl":128566913,"TSDtV":"%.@.[[null,null,\"CAMSCR0A1/KTEIb5BA\\u003d\\u003d\"]]]","TTHqvb":"https://kidsmanagement-pa.googleapis.com","UDh4De":"//www.google.com/setprefs?cs\u003d0\u0026sig\u003d0_YSBC3pon58bCdLhAosNVfbN2s6Q%3D\u0026prev\u003dhttps://consent.youtube.com/m?continue%3Dhttps://www.youtube.com/watch?v%253DxDf2mq_v7II%2526cbrd%253D1%26gl%3DDE%26m%3D0%26pc%3Dyt%26cm%3D4%26hl%3Den%26src%3D1\u0026hl\u003den","Vvafkd":false,"Y12zTb":"//www.google.com/setprefs?cs\u003d1\u0026sig\u003d0_YSBC3pon58bCdLhAosNVfbN2s6Q%3D\u0026prev\u003dhttps://consent.youtube.com/m?continue%3Dhttps://www.youtube.com/watch?v%253DxDf2mq_v7II%2526cbrd%253D1%26gl%3DDE%26m%3D0%26pc%3Dyt%26cm%3D4%26hl%3Den%26src%3D1\u0026hl\u003den","Yllh3e":"

............... Output is too long. GitHub refuses to create a post with such a long body, so I have truncated it. ...............

¬їа¬†","",false],["ta","தமிழ்","",false],["te","తెలుగు","",false],["kn","ಕನ್ನಡ","",false],["ml","മലയാളം","",false],["si","සිංහල","",false],["th","ไทย","",false],["lo","ລາວ","",false],["my","မြန်မာ","",false],["km","ខ្មែរ","",false],["ko","한국어","",false],["ja","日本語","",false],["zh-CN","简体中文","",false],["zh-TW","繁體中文","",false],["zh-HK","繁體中文","香港",false]]], sideChannel: {}});
[ERROR] Unable to parse initial video data. Please report this at https://github.com/xenova/chat-downloader/issues/new/choose
[DEBUG] Session closed.



Expected behavior

Chat download starts.

pboettcher commented 1 year ago

Now that the stream is no longer live and has been processed by YouTube, I can download the chat replay without any problems using the same command line. But live chat has its own value, because it contains deleted messages and messages from before and after the stream. And of course, the author may restrict access to the stream after broadcasting, in which case there will be no downloadable chat replay available.

ollydev commented 1 year ago

Same issue here on every YouTube stream. I attached the full log, as OP didn't: chd.txt

Edit: It's some kind of confirmation screen; using cookies from my desktop works. https://github.com/ytdl-org/youtube-dl/blob/master/README.md#how-do-i-pass-cookies-to-youtube-dl

    from chat_downloader import ChatDownloader
    chat = ChatDownloader(cookies='cookies.txt').get_chat('https://www.youtube.com/watch?v=xDf2mq_v7II')
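
For anyone checking whether they are hitting the same screen: the verbose log in the original post shows the watch URL being redirected to consent.youtube.com. Below is a minimal sketch, using only the standard library, that recognises such a redirect target and recovers the original watch URL from its `continue` parameter. The helper names are made up for illustration and are not part of chat-downloader.

```python
from urllib.parse import urlsplit, parse_qs


def is_consent_redirect(url):
    """Return True if the URL points at YouTube's consent host (as seen in the log)."""
    return urlsplit(url).hostname == 'consent.youtube.com'


def original_target(url):
    """Return the decoded 'continue' query parameter, or None if it is absent."""
    query = parse_qs(urlsplit(url).query)
    return query.get('continue', [None])[0]


# Redirect target taken from the [DEBUG] lines in the original post
redirect = ('https://consent.youtube.com/m?continue=https%3A%2F%2Fwww.youtube.com'
            '%2Fwatch%3Fv%3DxDf2mq_v7II%26cbrd%3D1&gl=DE&m=0&pc=yt&cm=4&hl=en&src=1')

if is_consent_redirect(redirect):
    print('Consent page instead of the video page; wanted:', original_target(redirect))
```
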
mascarell commented 1 year ago

Same issue here.

I personally would rather not use a cookies file. Happy to donate if anyone can look into it, as I use the library for a bot while streaming.

armislv commented 1 year ago

The consent cookie has changed. I changed _initialize_consent() in youtube.py according to this commit: https://github.com/yt-dlp/yt-dlp/commit/378ae9f9fb8e8c86e6ac89c4c5b815b48ce93620 and it works now.

HSDRC commented 1 year ago

Confirmed. In youtube.py, _initialize_consent should look like this:

    def _initialize_consent(self):
        if self.get_cookie_value('__Secure-3PSID'):
            return
        socs = self.get_cookie_value('SOCS')
        if socs and not socs.value.startswith('CAA'):  # not consented
            return
        self.set_cookie_value('.youtube.com', 'SOCS', 'CAI', secure=True)  # accept all (required for mixes)

Then everything works again.
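
To make the decision logic in that method easier to follow outside the class, here is a rough standalone sketch. It assumes the cookies are held in a plain dict mapping name to value string (so there is no `.value` attribute), and both function names are hypothetical, not chat-downloader's API.

```python
def needs_consent_cookie(cookies):
    """Decide whether a default 'SOCS' cookie should be set.

    `cookies` is assumed to be a plain dict of cookie name -> value string.
    """
    # First early return in the snippet: this cookie is already present
    if cookies.get('__Secure-3PSID'):
        return False
    socs = cookies.get('SOCS')
    # Second early return: SOCS exists and does not start with 'CAA'
    if socs and not socs.startswith('CAA'):
        return False
    return True


def with_consent(cookies):
    """Return a copy of `cookies` with SOCS set to 'CAI' when it is needed."""
    updated = dict(cookies)
    if needs_consent_cookie(updated):
        updated['SOCS'] = 'CAI'  # same value the snippet above sets
    return updated


print(with_consent({}))
```

With an empty cookie dict this adds the SOCS cookie; a dict that already holds a SOCS value not starting with 'CAA' is returned unchanged.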

pboettcher commented 1 year ago

Thank you. I have changed this directly in Python37-32\Lib\site-packages\chat_downloader\sites and now it works. Waiting for a release.

xenova commented 1 year ago

Thanks for identifying the problem! @armislv, @HSDRC, or @pboettcher, would one of you like to submit a PR for this?

pboettcher commented 1 year ago

> Thanks for identifying the problem! @armislv, @HSDRC, or @pboettcher, would one of you like to submit a PR for this?

Done. Please don't judge it too strictly; it is my first pull request.

mirabilos commented 1 year ago

I can confirm that pboettcher’s diff fixes the issue for me. Thanks <3

pboettcher commented 1 year ago

> I can confirm that pboettcher’s diff fixes the issue for me. Thanks <3

It is actually @HSDRC's code in there. I was just practicing making a pull request.