xenova / chat-downloader

A simple tool used to retrieve chat messages from livestreams, videos, clips and past broadcasts. No authentication needed!
https://chat-downloader.readthedocs.io/
MIT License

[BUG] Wrong parsing of Twitch timestamps if running lib not in UTC timezone #252

Open nevmerzhitsky opened 1 month ago

Basic information

Describe the bug

The chat_downloader.utils.core.timestamp_to_microseconds function, which is used only in sites/twitch.py, doesn't parse the TZ marker (Z) properly. As a result, the returned microseconds value is offset from the correct UTC value by the UTC offset of the current Python process's time zone.
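A minimal sketch of the suspected mechanism, assuming (not verified against the library source here) that the function ends up calling .timestamp() on a naive datetime. Python treats a naive datetime as local time, so the result moves with the process time zone, while an aware datetime parsed with %z does not:

```python
import datetime

stamp = "2024-06-30T00:35:54Z"

# Naive parse: the trailing "Z" is matched as a literal character,
# so the resulting datetime carries no tzinfo.
naive = datetime.datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")

# Aware parse: %z interprets "Z" as UTC (supported since Python 3.7).
aware = datetime.datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S%z")

# .timestamp() assumes local time for naive datetimes, so the two results
# differ by the local UTC offset (zero only when the process runs in UTC).
shift_seconds = naive.timestamp() - aware.timestamp()
local_offset = naive.astimezone().utcoffset().total_seconds()

print(f"naive.tzinfo={naive.tzinfo}, aware.tzinfo={aware.tzinfo}")
print(f"shift caused by the local time zone: {shift_seconds} s")
```

In a process running with TZ=UTC the shift is zero, which would explain why the problem goes unnoticed there.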

Command/Code used

Reproduction script:

import datetime, time, os
from chat_downloader.utils.core import timestamp_to_microseconds

now = datetime.datetime.now()

test_timestamp = "2024-06-30T00:35:54.54Z"
wrong = timestamp_to_microseconds(test_timestamp)

# Fix variant #1
import re
info = list(filter(None, re.split(r'[\.|Z]{1}', test_timestamp))) + [0]
fixed1 = round((datetime.datetime.strptime(f'{info[0]}Z', '%Y-%m-%dT%H:%M:%S%z').timestamp() + float(f'0.{info[1]}')) * 1e6)
#                                                                           ^^ fix

# Fix variant #2 (parses the timestamp as ISO 8601 instead of RFC 3339;
# fromisoformat() accepts a trailing Z only on Python 3.11+)
fixed2 = int(datetime.datetime.fromisoformat(test_timestamp).timestamp() * 1e6)

print(
    f"{os.environ['TZ']=}\n"
    f"{time.tzname=}\n"
    f"----\n"
    f"{now=}\n"
    f"{now.tzinfo=}\n"
    f"{now.tzname()=}\n"
    f"{now.timetz()=}\n"
    f"{now.astimezone()=}\n"
    f"----\n"
    f"{test_timestamp=}\n"
    f" {wrong=}\n"
    f"{fixed1=}\n"
    f"{fixed2=}\n",
    flush=True
)

My local output:

os.environ['TZ']='America/Chicago'
time.tzname=('CST', 'CDT')
----
now=datetime.datetime(2024, 7, 22, 18, 49, 55, 724731)
now.tzinfo=None
now.tzname()=None
now.timetz()=datetime.time(18, 49, 55, 724731)
now.astimezone()=datetime.datetime(2024, 7, 22, 18, 49, 55, 724731, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=68400), 'CDT'))
----
test_timestamp='2024-06-30T00:35:54.54Z'
 wrong=1719725754540000
fixed1=1719707754540000
fixed2=1719707754540000

Expected behavior

The value must be 1719707754540000.
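As a quick sanity check, the gap between the wrong value and the expected one in the output above can be converted back into a duration. It comes out to exactly five hours, which matches the CDT offset (UTC-5) of the America/Chicago process:

```python
import datetime

wrong = 1719725754540000     # value reported by timestamp_to_microseconds
expected = 1719707754540000  # value reported by both fix variants

# The difference in microseconds, expressed as a duration.
shift = datetime.timedelta(microseconds=wrong - expected)
print(shift)  # 5:00:00
```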

Additional context/information

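The fix requires Python 3.7 or later, because both the %z handling of Z in strptime() and datetime.datetime.fromisoformat() first appeared in 3.7. Note that fromisoformat() accepts a trailing Z only from Python 3.11, so fix variant #2 needs 3.11+.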
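If the Python version constraint is a concern, one possible alternative is to avoid both %z and fromisoformat() and attach UTC explicitly. The sketch below is only an illustration: utc_timestamp_to_microseconds is a hypothetical helper, not part of chat_downloader, and it assumes the input always ends in Z with an optional fractional part. It also uses integer arithmetic, so no float rounding is involved:

```python
import datetime

EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def utc_timestamp_to_microseconds(timestamp: str) -> int:
    """Hypothetical helper: convert 'YYYY-MM-DDTHH:MM:SS[.fff]Z' to epoch microseconds."""
    # Split off the optional fractional seconds; the Z marker is dropped
    # because UTC is attached explicitly below.
    main, _, fraction = timestamp.rstrip("Z").partition(".")

    parsed = datetime.datetime.strptime(main, "%Y-%m-%dT%H:%M:%S")
    parsed = parsed.replace(tzinfo=datetime.timezone.utc)

    # Whole seconds since the epoch, as an integer number of microseconds.
    whole = (parsed - EPOCH) // datetime.timedelta(microseconds=1)

    # Pad or truncate the fraction to exactly six digits (microseconds).
    micros = int((fraction + "000000")[:6])
    return whole + micros


print(utc_timestamp_to_microseconds("2024-06-30T00:35:54.54Z"))  # 1719707754540000
```

Because the datetime is made timezone-aware before any arithmetic, the result does not depend on the TZ of the running process.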