xenova / chat-downloader

A simple tool used to retrieve chat messages from livestreams, videos, clips and past broadcasts. No authentication needed!
https://chat-downloader.readthedocs.io/
MIT License
948 stars, 132 forks

Active YouTube livestreams result in error (sometimes) #18

jasonkhanlar closed this issue 4 years ago

jasonkhanlar commented 4 years ago

"Video does not have a chat replay."

I noticed that this message appears after a livestream has finished but before the live chat is available.

However, the problem also affects active livestreams that are still in progress. After the first occurrence, I tested a second active livestream and couldn't reproduce the error, because the script printed the same message as above. Trying a few more livestreams from https://www.youtube.com/results?search_query=live&sp=EgJAAQ%253D%253D, though, I found others that result in the same error. Because this is time-sensitive, I can't provide specific examples that will still work in the future, but I'm sure you can find one that reproduces it. The script fails with:

Traceback (most recent call last):
  File "chat_replay_downloader.py", line 449, in <module>
    chat_messages = chat_downloader.get_chat_replay(
  File "chat_replay_downloader.py", line 404, in get_chat_replay
    return self.get_youtube_messages(match.group(1), start_time, end_time, message_type)
  File "chat_replay_downloader.py", line 332, in get_youtube_messages
    data = self.__parse_item(item)
  File "chat_replay_downloader.py", line 235, in __parse_item
    self.__time_to_seconds(data['time_text']))
KeyError: 'time_text'

Note: The line numbers might be off by 1 because I edited the script per my last issue report

The key data['time_text'] doesn't exist, but some of the data is available. For example, this was the value of data:

{'message': 'Oh ty', 'author': '• Lacy •', 'timestamp': '1603504526543501'}

Perhaps, for active livestreams that are in progress, the Python script could calculate these time values itself instead of relying on the already post-processed ones, or otherwise leave them out?

I can't find a way to get the unix epoch timestamp for when a livestream started, so that it can be used to calculate time elapsed for each message in data that doesn't contain 'time_text'

xenova commented 4 years ago

Thanks for posting! Let me look into it. 😃

xenova commented 4 years ago

> I can't find a way to get the unix epoch timestamp for when a livestream started, so that it can be used to calculate time elapsed for each message in data that doesn't contain 'time_text'

This was a big issue I had early in the development of the tool. Without calling the API (I want to ensure this requires no authentication and can be used by anyone), there is no proper way to get the start time. One way I thought of solving this was to work backwards: each message comes with a time stamp, so I can calculate the start time by subtracting it from the epoch time at which the message was sent (e.g. subtracting 0:10 from the epoch time). However, this yielded inaccurate times.
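For illustration, the working-backwards idea could be sketched roughly as below. This is not the code from the script: the function names are hypothetical, and it assumes that 'timestamp' is a Unix epoch value in microseconds (which is what the sample value earlier in this thread looks like) and that the time text has the form M:SS or H:MM:SS.

```python
def time_to_seconds(time_text):
    """Convert a time text such as '0:10' or '1:02:03' to seconds.

    A leading '-' (as might appear for messages sent before the start)
    is treated as a negative offset. This format is an assumption.
    """
    sign = -1 if time_text.startswith('-') else 1
    seconds = 0
    for part in time_text.lstrip('-').split(':'):
        seconds = seconds * 60 + int(part)
    return sign * seconds


def estimate_start_time(timestamp_usec, time_text):
    """Estimate the stream start (Unix seconds) from a single message.

    timestamp_usec is assumed to be the epoch time in microseconds,
    given as a string or an int.
    """
    sent_at = int(timestamp_usec) / 1_000_000
    return sent_at - time_to_seconds(time_text)


# A message sent at the sample timestamp, 10 seconds into the stream.
print(estimate_start_time('1603504526543501', '0:10'))
```

The result is only an estimate: as noted above, times calculated this way turned out to be inaccurate, and estimates taken from different messages need not agree with each other.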

xenova commented 4 years ago

Let me know if that works! 😃 I made it print out the timestamp instead of the time text (which doesn't exist). There were also some other issues to fix with other live streams.
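As a rough sketch of that kind of fallback (not the actual change made to the script): use 'time_text' when it is present, and fall back to the message's 'timestamp' otherwise. The helper name format_message_time is hypothetical, the timestamp is assumed to be in microseconds since the Unix epoch, and rendering it as a UTC date is just one possible choice.

```python
from datetime import datetime, timezone


def format_message_time(data):
    """Return 'time_text' if present, else a UTC time built from 'timestamp'."""
    if 'time_text' in data:
        return data['time_text']
    # Assumption: 'timestamp' is microseconds since the Unix epoch.
    seconds = int(data['timestamp']) // 1_000_000
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime('%Y-%m-%d %H:%M:%S')


# The message from the report above, which has no 'time_text' key.
data = {'message': 'Oh ty', 'author': '• Lacy •', 'timestamp': '1603504526543501'}
print(format_message_time(data))
```

Checking for the key first avoids the KeyError from the traceback while leaving messages that do carry a time text unchanged.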

As you mentioned above, it is difficult to give examples (due to time sensitivity), so we more or less need to solve errors on a case-by-case basis. If you find any errors, I'll do my best to fix them ASAP. 👍