akoumjian / datefinder

Find dates inside text using Python and get back datetime objects
http://datefinder.readthedocs.org/en/latest/
MIT License
635 stars, 167 forks

Random datetimes extracted from nowhere #145

Open hkristof03 opened 3 years ago

hkristof03 commented 3 years ago

Hi!

I found this library on Stack Overflow: https://stackoverflow.com/questions/53759224/how-to-write-a-regex-to-validate-date-format-of-type-day-month-dd-yyyy

Although it successfully extracts the timestamps that appear in the following log line, it also returns spurious ones that are not in the text:

s = 'Jan 3 21:02:15 dp214a-c11-35-eor.net.something.intra Jan 3 2021 21:02:15+01:00 dp214a-c11-35-eor '

list(datefinder.find_dates(s)) ->

[datetime.datetime(2021, 1, 3, 21, 2, 15), datetime.datetime(2035, 11, 9, 0, 0), datetime.datetime(2021, 1, 3, 21, 2, 15, tzinfo=tzoffset(None, 3600)), datetime.datetime(2035, 11, 9, 0, 0)]

Could you suggest how I can solve this issue?

Greetings

sahild11 commented 3 years ago

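Pass source=True to get the matched text alongside each datetime, then keep only the matches whose text is longer than 5 characters:

list(filter(lambda x: len(x[1]) > 5, datefinder.find_dates(s, source=True)))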

[(datetime.datetime(2021, 1, 3, 21, 2, 15), 'Jan 3 21:02:15'), (datetime.datetime(2021, 1, 3, 21, 2, 15, tzinfo=tzoffset(None, 3600)), 'Jan 3 2021 21:02:15+01:00')]
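The same idea can be wrapped in a small reusable helper. This is only a sketch: the names `drop_short_matches`, `find_plausible_dates` and `MIN_SOURCE_LEN` are made up for illustration and are not part of datefinder. The threshold of 6 characters simply mirrors the `len(x[1]) > 5` check above. The spurious 2035 results presumably come from a short fragment of the hostname (something like `11-35`), but that is an assumption; inspect the `source=True` output on your own logs before picking a cut-off.

```python
from datetime import datetime

# Hypothetical threshold: same cut-off as the len(x[1]) > 5 filter above.
MIN_SOURCE_LEN = 6


def drop_short_matches(pairs, min_len=MIN_SOURCE_LEN):
    """Keep only (datetime, source_text) pairs whose matched text
    is at least min_len characters long."""
    return [(dt, src) for dt, src in pairs if len(src) >= min_len]


def find_plausible_dates(text, min_len=MIN_SOURCE_LEN):
    """Run datefinder on text and discard matches with very short source text."""
    # Imported here so drop_short_matches stays usable without datefinder.
    import datefinder

    return drop_short_matches(datefinder.find_dates(text, source=True), min_len)


if __name__ == "__main__":
    s = ('Jan 3 21:02:15 dp214a-c11-35-eor.net.something.intra '
         'Jan 3 2021 21:02:15+01:00 dp214a-c11-35-eor ')
    for dt, src in find_plausible_dates(s):
        print(repr(src), '->', dt)
```

A length filter is a heuristic, not a fix: a short but genuine date would be dropped too, so choose the threshold based on the timestamp formats that actually occur in your logs.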