mbakholdina / lib-tcpdump-processing

Library designed to process .pcap(ng) tcpdump or Wireshark trace files and extract SRT packets of interest for further analysis

Fix in CEST timezone bug #25

Closed Sorkanius closed 3 years ago

Sorkanius commented 4 years ago

I ran into the same problem mentioned in Issue #22; this change might fix it.

Example of the outputs and difference with previous method:

date_string_cet = 'Apr 01, 2020 16:29:53.150479 CET'
date_string_cest = 'Apr 01, 2020 16:29:53.150479 CEST'

The previous command specified the format: pd.to_datetime(date_string_cet, format='%b %d, %Y %H:%M:%S.%f %Z')

Which gave an output of: Timestamp('2020-04-01 16:29:53.150479+0200', tz='CET')

Now by not specifying the format, using pd.to_datetime(date_string_cet) outputs: Timestamp('2020-04-01 16:29:53.150479+0200', tz='pytz.FixedOffset(120)')

And also works with pd.to_datetime(date_string_cest): Timestamp('2020-04-01 16:29:53.150479+0200', tz='pytz.FixedOffset(120)')

Although the timezone name is lost, the offset from UTC is kept, which might be enough to avoid losing any functionality.
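
To illustrate why keeping only the offset may be sufficient, here is a minimal standard-library sketch (not the library's code, and the variable names are illustrative). It assumes the timestamp carries a fixed +02:00 offset, as in the outputs above, and shows that conversion to UTC is still unambiguous without the zone name.

```python
from datetime import datetime, timedelta, timezone

# Fixed +02:00 offset: the stdlib counterpart of pytz.FixedOffset(120).
fixed_offset = timezone(timedelta(minutes=120))

# Same wall-clock time as in the example strings above.
local_ts = datetime(2020, 4, 1, 16, 29, 53, 150479, tzinfo=fixed_offset)

# Converting to UTC needs only the offset, not the zone name.
utc_ts = local_ts.astimezone(timezone.utc)
print(utc_ts.isoformat())  # 2020-04-01T14:29:53.150479+00:00
```

What is lost is the ability to apply daylight-saving rules to other dates, which should not matter if each timestamp is only compared or converted as is.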

mbakholdina commented 4 years ago

@Sorkanius Thanks for your PR! At the very beginning, I had no date format specified, until I started merging dataframes from tshark and SRT and figured out that tshark gives me mm:dd and srt-xtransmit dd:mm, or vice versa (I do not remember exactly). That's why I added the format when parsing dates, and it worked with CET until the time switched to CEST. For whatever reason, it does not like CEST. It seems I need to look into this further.
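
The day/month ambiguity described here can be sketched with the standard library. The date string below is hypothetical (the thread does not give the actual tshark or srt-xtransmit output), but it shows how the same text yields two different dates depending on the assumed field order, which is why an explicit format is the safer choice.

```python
from datetime import datetime

# Hypothetical ambiguous string; not actual tshark or srt-xtransmit output.
ambiguous = '01/04/2020 16:29:53'

# Interpreted as day first: 1 April 2020.
day_first = datetime.strptime(ambiguous, '%d/%m/%Y %H:%M:%S')

# Interpreted as month first: 4 January 2020.
month_first = datetime.strptime(ambiguous, '%m/%d/%Y %H:%M:%S')

print(day_first.date())    # 2020-04-01
print(month_first.date())  # 2020-01-04
```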

Sorkanius commented 4 years ago

I see. Maybe a simple replacement might work then, something like:

date_string_cest.replace('CEST', 'CET')

I can update the PR if that solution sounds suitable.
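
A slightly more defensive variant of that replacement might look like the sketch below. The helper name `normalize_tz_abbreviation` is hypothetical and not part of the library. It rewrites only a trailing `CEST` token, so other parts of the string are left untouched.

```python
def normalize_tz_abbreviation(date_string: str) -> str:
    """Replace a trailing 'CEST' abbreviation with 'CET'.

    Hypothetical helper for illustration only. It changes just the
    final token, so strings ending in 'CET' (or anything else) are
    returned unchanged.
    """
    suffix = ' CEST'
    if date_string.endswith(suffix):
        return date_string[:-len(suffix)] + ' CET'
    return date_string


date_string_cest = 'Apr 01, 2020 16:29:53.150479 CEST'
print(normalize_tz_abbreviation(date_string_cest))
# Apr 01, 2020 16:29:53.150479 CET
```

The normalized string could then be passed to pd.to_datetime with the explicit format '%b %d, %Y %H:%M:%S.%f %Z' mentioned earlier in the thread, which keeps the unambiguous date parsing.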

mbakholdina commented 3 years ago

Closing as a duplicate of #35. I couldn't quickly find a proper solution, so I decided to merge this as a workaround until the proper fix for #22 is done.