Closed: archerne closed this issue 2 years ago
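Your `9` is coming from the date on which you ran the code. It didn't recognize the `10` as a day of the month, so it is falling back to the default date that dateutil fills in (datefinder's `base_date`, which is today when you don't pass one).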
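To make the fallback concrete, here is a simplified model of it. This is not dateutil's actual code: `fill_missing_fields` and the run date are made up for illustration, and the sketch only assumes that whatever the parser did not find is taken from the default date.

```python
from datetime import datetime


def fill_missing_fields(found, default):
    """Return `default` with every field in `found` overwritten.

    Hypothetical helper: a rough model of "fields the parser did not
    find come from the default date", not the library's implementation.
    """
    return default.replace(**found)


# Hypothetical run date. Only the day of the month (9) matters here.
run_date = datetime(2021, 6, 9)

# Fields the parser ended up with for "2020 FEB 10 PM 4:52":
# year, month and the 4:52 time, but no day of the month.
found = {"year": 2020, "month": 2, "hour": 4, "minute": 52}

result = fill_missing_fields(found, run_date)
print(result)  # 2020-02-09 04:52:00
```

The day is the only field missing from `found`, so it is the only one that leaks through from the run date.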
The position of the `PM` is throwing it off. It is highly unusual to place `PM` both before the hours/minutes and right after a date. If you run it without the time at the end:
In [15]: text = "2020 FEB 10 PM"
In [16]: print(next(datefinder.find_dates(text, source=True)))
(datetime.datetime(2020, 2, 29, 22, 0), '2020 FEB 10 PM')
You can see that it saw `10 PM` as it made its way through the text and rightfully cast it as a time. It didn't see anything that looked like a day of the month, so it defaulted to today.
Then, when the regex finds `4:52`, it says "wait, this is a time!" and uses that time instead, either because it is more verbose or because it is found later and overwrites the earlier one. So your `10` is getting dropped altogether.
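If the layout of the input is known in advance, one possible workaround is to skip the fuzzy search and parse with an explicit format. The sketch below uses only the standard library and assumes the intended reading is 10 February 2020 at 4:52 PM. The helper name `parse_pm_first` is made up for illustration, and matching `FEB` and `PM` assumes an English (or C) locale.

```python
from datetime import datetime

# Layout assumed here: year, month abbreviation, day, AM/PM marker, then h:mm.
PM_FIRST_FORMAT = "%Y %b %d %p %I:%M"


def parse_pm_first(text):
    """Parse strings such as '2020 FEB 10 PM 4:52' (hypothetical helper).

    %p only affects the result when the hour is given with %I (12-hour
    clock). strptime does not require the AM/PM marker to come after
    the hour.
    """
    return datetime.strptime(text, PM_FIRST_FORMAT)


parsed = parse_pm_first("2020 FEB 10 PM 4:52")
print(parsed)  # 2020-02-10 16:52:00
```

This only helps when every input follows the same layout. For free-form text you would still need datefinder or some preprocessing of the string.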
Any idea on why `2020 FEB 10 PM 4:52` parses as `2020-02-09 04:52:00`?