xenova / chat-downloader

A simple tool used to retrieve chat messages from livestreams, videos, clips and past broadcasts. No authentication needed!
https://chat-downloader.readthedocs.io/
MIT License
952 stars, 133 forks

[BUG] chat_downloader fails with complaint about emoji characters in the chat #121

Closed: delovelady closed this issue 3 years ago

delovelady commented 3 years ago

Basic information

Describe the bug

When I try to download recent chats from YouTube, I get consistent failures. From my extreme novice stance, it looks like some sort of emoticon is causing the issue.

Command/Code used

Any of these examples will fail:

/usr/local/bin/chat_downloader --format json --output /path/yW5-gMRjzd4.json https://youtu.be/yW5-gMRjzd4
/usr/local/bin/chat_downloader --format json --output /path/yW5-gMRjzd4.json https://youtu.be/Wz0cd2F0B-Y
/usr/local/bin/chat_downloader --format json --output /path/yW5-gMRjzd4.json https://youtu.be/ZlrMa3tSEJc

I don't know enough about Python to be able to diagnose at all.

If running from the command line, provide the following:

  1. The command used (including the verbose tag, -v):
    /usr/local/bin/chat_downloader -v --format json --output /path/yW5-gMRjzd4.json https://youtu.be/yW5-gMRjzd4
  2. Output from the above command:
    [INFO] Site: youtube.com
    [INFO] Retrieving chat for "Friday Sunrise with Wendell Live from Casco, Maine (or thereabouts)".
    0:18 | (Moderator) Michael Blade Fpv: Wowsers that’s beautiful
    0:37 | (Moderator) Drone Bum: Buffering a bit...quite a bit
    0:45 | (Moderator) Wendell Live! - Vegas Tribute Artist: Hi Susan
    Traceback (most recent call last):
    File "/usr/local/bin/chat_downloader", line 8, in <module>
    sys.exit(main())
    File "/home/dennis/.local/lib/python3.8/site-packages/chat_downloader/cli.py", line 170, in main
    run(**args.__dict__)
    File "/home/dennis/.local/lib/python3.8/site-packages/chat_downloader/chat_downloader.py", line 297, in run
    for message in chat:
    File "/home/dennis/.local/lib/python3.8/site-packages/chat_downloader/sites/common.py", line 60, in __iter__
    yield from self.chat
    File "/home/dennis/.local/lib/python3.8/site-packages/chat_downloader/sites/youtube.py", line 1141, in _get_chat_messages
    data = self._parse_item(original_item, data)
    File "/home/dennis/.local/lib/python3.8/site-packages/chat_downloader/sites/youtube.py", line 412, in _parse_item
    BaseChatDownloader.remap(
    File "/home/dennis/.local/lib/python3.8/site-packages/chat_downloader/sites/common.py", line 280, in remap
    new_value = remap.remap_function(remap_input)
    File "/home/dennis/.local/lib/python3.8/site-packages/chat_downloader/sites/youtube.py", line 386, in parse_runs
    'is_custom_emoji': emoji['isCustomEmoji']
    KeyError: 'isCustomEmoji'
    dennis@hpmicro1:/usr/local/bin   09/06 20:41:25
    > /usr/local/bin/chat_downloader -v --format json --output /path/yW5-gMRjzd4.json https://youtu.be/yW5-gMRjzd4
    [DEBUG] Python version: 3.8.10 (default, Jun  2 2021, 10:49:15)
    [GCC 9.4.0]
    [DEBUG] Program version: 0.0.8
    [DEBUG] Initialisation parameters: {'cookies': None, 'proxy': None, 'headers': None}
    [INFO] Site: youtube.com
    [DEBUG] Program parameters: {'url': 'https://youtu.be/yW5-gMRjzd4', 'start_time': None, 'end_time': None, 'max_attempts': 15, 'retry_timeout': None, 'timeout': None, 'max_messages': None, 'logging': 'debug', 'pause_on_debug': False, 'exit_on_debug': False, 'message_groups': ['messages'], 'message_types': None, 'format': 'json', 'format_file': None, 'chat_type': 'live', 'message_receive_timeout': 0.1, 'buffer_size': 4096, 'inactivity_timeout': None}
    [DEBUG] Starting new HTTPS connection (1): www.youtube.com:443
    [DEBUG] https://www.youtube.com:443 "GET /watch?v=yW5-gMRjzd4 HTTP/1.1" 200 None
    [DEBUG] Chat information: {'chat': <generator object YouTubeChatDownloader._get_chat_messages at 0x7fd46310bf90>, 'title': 'Friday Sunrise with Wendell Live from Casco, Maine (or thereabouts)', 'duration': 1758, 'is_live': False, 'start_time': 1630102790059414.0, 'site': <chat_downloader.sites.youtube.YouTubeChatDownloader object at 0x7fd4631c9c40>, 'format': <function ChatDownloader.get_chat.<locals>.<lambda> at 0x7fd46314da60>}
    [INFO] Retrieving chat for "Friday Sunrise with Wendell Live from Casco, Maine (or thereabouts)".
    [DEBUG] Getting Live chat (Live chat replay).
    [DEBUG] https://www.youtube.com:443 "GET /live_chat_replay?continuation=op2w0wRgGlhDaWtxSndvWVZVTnZXVGxmUVRaMFdsWk1ZVVUwWkVwaFZWVmpWMUZCRWd0NVZ6VXRaMDFTYW5wa05Cb1Q2cWpkdVFFTkNndDVWelV0WjAxU2FucGtOQ0FCQAFyAggB HTTP/1.1" 200 None
    [DEBUG] Session closed.
    0:18 | (Moderator) Michael Blade Fpv: Wowsers that’s beautiful
    0:37 | (Moderator) Drone Bum: Buffering a bit...quite a bit
    0:45 | (Moderator) Wendell Live! - Vegas Tribute Artist: Hi Susan
    Traceback (most recent call last):
    File "/usr/local/bin/chat_downloader", line 8, in <module>
    sys.exit(main())
    File "/home/dennis/.local/lib/python3.8/site-packages/chat_downloader/cli.py", line 170, in main
    run(**args.__dict__)
    File "/home/dennis/.local/lib/python3.8/site-packages/chat_downloader/chat_downloader.py", line 297, in run
    for message in chat:
    File "/home/dennis/.local/lib/python3.8/site-packages/chat_downloader/sites/common.py", line 60, in __iter__
    yield from self.chat
    File "/home/dennis/.local/lib/python3.8/site-packages/chat_downloader/sites/youtube.py", line 1141, in _get_chat_messages
    data = self._parse_item(original_item, data)
    File "/home/dennis/.local/lib/python3.8/site-packages/chat_downloader/sites/youtube.py", line 412, in _parse_item
    BaseChatDownloader.remap(
    File "/home/dennis/.local/lib/python3.8/site-packages/chat_downloader/sites/common.py", line 280, in remap
    new_value = remap.remap_function(remap_input)
    File "/home/dennis/.local/lib/python3.8/site-packages/chat_downloader/sites/youtube.py", line 386, in parse_runs
    'is_custom_emoji': emoji['isCustomEmoji']
    KeyError: 'isCustomEmoji'

    If the output is too long, you can attach a text file or remove output which does not contribute to the problem.

Otherwise, if using the python module, provide the following:

  1. A minimal reproducible example:
    # python code
  2. Output, traceback or other information relating to the bug:
    [output or description]

Expected behavior

A clear and concise description of what you expected to happen.

Expected two outputs:

  1. stdout should contain the clear-text chat.
  2. The JSON file (/path/yW5-gMRjzd4.json) should contain a JSON representation of the same chat information (but with more detail).

Screenshots

If applicable, add screenshots to help explain your problem.

Additional context/information

Add any other context or information about the problem here.

I have an "outer" script that is kicked off when a new chat livestream is detected on my YouTube channel. The script, among other things, calls chat_downloader with the above options, to give a searchable record of the chat for nostalgic and research reasons.
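For illustration, a wrapper of this kind might look roughly like the sketch below. It only reuses the flags shown in the commands above (`--format json`, `--output`); the function names `build_command` and `download_chat` and the output directory layout are hypothetical, not the actual outer script.

```python
import subprocess
from pathlib import Path


def build_command(video_id, output_dir, executable="chat_downloader"):
    """Build the chat_downloader command line for one video (hypothetical helper)."""
    output_path = Path(output_dir) / f"{video_id}.json"
    return [
        executable,
        "--format", "json",
        "--output", str(output_path),
        f"https://youtu.be/{video_id}",
    ]


def download_chat(video_id, output_dir):
    """Run chat_downloader; return True on success, False on a non-zero exit."""
    result = subprocess.run(
        build_command(video_id, output_dir),
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Keep stderr (the traceback) so it can be pasted into a bug report.
        print(f"chat download failed for {video_id}:\n{result.stderr}")
    return result.returncode == 0
```

Checking the return code lets the outer script carry on with its other tasks when a download fails, rather than silently losing the chat record.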

This application (chat_downloader) had been working great for me (as had chat-replay-downloader, which seems to be almost the same program), but recently, over the last three weeks or so, it has consistently failed. I suspect that one or more of my viewers have discovered some emoji feature not encountered before.
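The traceback ends in `KeyError: 'isCustomEmoji'`, which means the emoji dictionary being parsed did not contain that key. The sketch below shows the general difference between a strict lookup and a tolerant one. It is not the library's code or its actual fix: `parse_emoji`, the `emojiId` key, and the sample dictionaries are made up for illustration, and only the `isCustomEmoji` and `is_custom_emoji` names come from the traceback.

```python
def parse_emoji(emoji):
    """Extract fields from an emoji dict without assuming every key exists."""
    return {
        # 'emojiId' is a hypothetical key used only for this example.
        "id": emoji.get("emojiId"),
        # emoji['isCustomEmoji'] raises KeyError when the key is absent;
        # .get() with a default treats a missing key as "not custom".
        "is_custom_emoji": emoji.get("isCustomEmoji", False),
    }


# Hypothetical payloads: one without the key, one with it.
standard = {"emojiId": "smile"}
custom = {"emojiId": "abc/123", "isCustomEmoji": True}

print(parse_emoji(standard))
print(parse_emoji(custom))
```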

xenova commented 3 years ago

Program version: 0.0.8

Please upgrade to the latest version (0.1.9). This was fixed a few months ago in https://github.com/xenova/chat-downloader/commit/1941c9cdb6feddcbea9a7a6b7643dcab429aa738

You can upgrade using

pip3 install chat-downloader --upgrade

delovelady commented 3 years ago

Thanks so much @xenova ! Solved it!
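For anyone landing here with the same traceback, one way to confirm which version is installed is sketched below. It uses the standard-library `importlib.metadata` (Python 3.8+). The helper names are hypothetical, and the comparison assumes plain dotted numeric versions such as 0.0.8 or 0.1.9, so it would not handle suffixes like `rc1`.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn a simple dotted version such as '0.1.9' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def needs_upgrade(installed, minimum="0.1.9"):
    """Return True if the installed version is older than the minimum."""
    return parse_version(installed) < parse_version(minimum)


def installed_version(package="chat-downloader"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    current = installed_version()
    if current is None:
        print("chat-downloader is not installed")
    elif needs_upgrade(current):
        print(f"chat-downloader {current} is older than 0.1.9; consider upgrading")
    else:
        print(f"chat-downloader {current} is up to date relative to 0.1.9")
```

Comparing tuples of integers avoids the string-comparison trap where "0.1.10" would sort before "0.1.9".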