certtools / intelmq

IntelMQ is a solution for IT security teams for collecting and processing security feeds using a message queuing protocol.
https://docs.intelmq.org/latest/
GNU Affero General Public License v3.0

generic csv parser: timezone correction #461

Open sebix opened 8 years ago

sebix commented 8 years ago

Additional option for generic csv parser: timezone correction.

The timezone offset is often not given in the time-column, so it should be defined manually.

Possible configuration format: +10:00, -8 etc. No abbreviations, they are not unique in most cases.
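A minimal sketch of what such an option could do, assuming the configured value is a numeric offset such as `+10:00` or `-8`. The function names `parse_utc_offset` and `to_utc` are hypothetical and are not part of IntelMQ; only the standard library is used.

```python
import re
from datetime import datetime, timedelta, timezone

# Accepts "+10:00", "-8", "+5:30"; deliberately rejects abbreviations like "CET".
_OFFSET_RE = re.compile(r"^([+-])(\d{1,2})(?::(\d{2}))?$")


def parse_utc_offset(value: str) -> timezone:
    """Turn a numeric offset string into a fixed-offset tzinfo (hypothetical helper)."""
    match = _OFFSET_RE.match(value.strip())
    if match is None:
        raise ValueError(f"Not a numeric UTC offset: {value!r}")
    sign, hours, minutes = match.groups()
    minutes = int(minutes or 0)
    if int(hours) > 23 or minutes > 59:
        raise ValueError(f"UTC offset out of range: {value!r}")
    delta = timedelta(hours=int(hours), minutes=minutes)
    return timezone(-delta if sign == "-" else delta)


def to_utc(naive: datetime, offset: str) -> datetime:
    """Interpret a naive timestamp as being in `offset` and convert it to UTC."""
    return naive.replace(tzinfo=parse_utc_offset(offset)).astimezone(timezone.utc)


if __name__ == "__main__":
    print(to_utc(datetime(2019, 3, 31, 12, 0), "+10:00").isoformat())
    print(to_utc(datetime(2019, 3, 31, 12, 0), "-8").isoformat())
```

Rejecting anything that is not a signed number keeps ambiguous abbreviations out of the configuration.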

kalyparker commented 5 years ago

I just discovered that the generic parser converts the date by itself. My source is in UTC, and my server is in local time (+2 or +1, depending on the season).

Question: is there a reason why this information is changed? I don't want that. UTC is fine for storage.

A remark on this topic: a fixed format like +10:00 or -8 does not work for most European countries, where the offset changes between summer and winter time.

ghost commented 5 years ago

Yeah, depends on your local time zone. Does setting the env variable TZ=UTC help?

> A remark on this topic: a fixed format like +10:00 or -8 does not work for most European countries, where the offset changes between summer and winter time.

It should also be possible to use names like CET. But if your data's timezone depends on the time of year, that's tricky anyway, and IMO that's your data's (or its creator's) fault. What should IntelMQ guess then? Is 2019-03-31T02:30:00 CET or CEST?
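To make the ambiguity concrete, here is a small sketch that reads the same naive timestamp once as CET (+01:00) and once as CEST (+02:00). The two fixed-offset objects are defined here for illustration only; the code uses just the standard library.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets standing in for the two candidate interpretations.
CET = timezone(timedelta(hours=1), "CET")
CEST = timezone(timedelta(hours=2), "CEST")

naive = datetime.fromisoformat("2019-03-31T02:30:00")

# The same wall-clock value maps to two different instants in UTC.
as_cet = naive.replace(tzinfo=CET).astimezone(timezone.utc)
as_cest = naive.replace(tzinfo=CEST).astimezone(timezone.utc)

if __name__ == "__main__":
    print("read as CET: ", as_cet.isoformat())
    print("read as CEST:", as_cest.isoformat())
    print("difference:  ", as_cet - as_cest)
```

The two readings are one hour apart, so a parser that has to guess will be wrong for part of the year.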

kalyparker commented 5 years ago

Yes, it is better with the env variable. I realize that all the data collected until now is not in UTC :( A parameter in the intelmq conf would be great for avoiding this, don't you think? Let the user decide ;)

Anyway, if I understand this issue correctly, we simply add another parameter compatible with tzinfo. And just to be sure: the parameter describes the timezone of the source, not the destination, right? The destination is determined by the env variable.
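As a rough illustration of why the `TZ` variable matters, the sketch below shows how a naive conversion in Python follows the process timezone. It is not taken from the IntelMQ code base; `naive_local_time` is a hypothetical helper, and `time.tzset()` is only available on Unix.

```python
import os
import time
from datetime import datetime


def naive_local_time(epoch_seconds: float, tz_value: str) -> datetime:
    """Return the naive local time for an epoch value under a given TZ setting."""
    previous = os.environ.get("TZ")
    os.environ["TZ"] = tz_value
    time.tzset()  # Unix only: re-read TZ from the environment
    try:
        # Without an explicit tz argument, the result uses the process timezone.
        return datetime.fromtimestamp(epoch_seconds)
    finally:
        # Restore the previous setting so the demo has no side effects.
        if previous is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = previous
        time.tzset()


if __name__ == "__main__":
    print(naive_local_time(0, "UTC"))
    # POSIX-style rule for central European time, written out in full.
    print(naive_local_time(0, "CET-1CEST,M3.5.0,M10.5.0/3"))
```

The same epoch value yields different naive datetimes depending on `TZ`, which is consistent with stored values changing once `TZ=UTC` is set.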

kalyparker commented 5 years ago

After some tests, it appears the timezone is fine. My problem comes from the line below, because I had another field using Datetime in the harmonization file:

if key in ["time.source", "time.destination"]:

I replaced it with:

if key.startswith("time."):
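The difference between the two checks can be sketched as follows. The key `time.first_seen` is a made-up example of an extra Datetime field in a custom harmonization file, and both function names are hypothetical.

```python
# Keys matched by the original, hard-coded check.
KNOWN_TIME_KEYS = ["time.source", "time.destination"]


def is_time_key_exact(key: str) -> bool:
    """Original behaviour: only the two listed keys count as time fields."""
    return key in KNOWN_TIME_KEYS


def is_time_key_prefix(key: str) -> bool:
    """Changed behaviour: every key in the "time." namespace counts."""
    return key.startswith("time.")


if __name__ == "__main__":
    for key in ["time.source", "time.first_seen", "source.ip"]:
        print(key, is_time_key_exact(key), is_time_key_prefix(key))
```

Note that the prefix check assumes every `time.*` field holds a datetime value, which may not be true for every harmonization file.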

Back to the subject: I tried to add a parameter using tzstr and/or pytz, and it is a nightmare. Which one do you want? And should this parameter apply to all four options (timestamp, windows_nt, epoch_millis and None)? Any idea where to start?

ghost commented 5 years ago

Yes, timezones are a real nightmare, especially in Python.

Is there any circumstance where input data lacking timezone information is not rejected by IntelMQ? It should always be rejected (except in parsers where timezone information is given as a fallback value). If so, could you please describe it?