scrapinghub / dateparser

python parser for human readable dates
BSD 3-Clause "New" or "Revised" License

v1.2.0 returns wrong month with future dates #1202

Open realtimeprojects opened 7 months ago

realtimeprojects commented 7 months ago

tested on 2023-11-27, 15:33 UTC:

v1.1.8:

>>> import dateparser
>>> dateparser.parse("Friday 15:53", settings={'PREFER_DATES_FROM': 'future'})
datetime.datetime(2023, 12, 1, 15, 53)

v1.2.0:

>>> import dateparser
>>> dateparser.parse("Friday 15:53", settings={'PREFER_DATES_FROM': 'future'})
datetime.datetime(2023, 11, 1, 15, 53)
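
For reference, the expected "future" result can be checked with the standard library alone. This is a minimal sketch, not dateparser's actual algorithm: `next_weekday` is a hypothetical helper written for this illustration, and it assumes "future" means the first occurrence of that weekday and time strictly after the base.

```python
from datetime import datetime, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def next_weekday(base, weekday_name, hour, minute):
    """Return the first occurrence of weekday_name at hour:minute after base.

    Hypothetical helper for illustration only; not part of dateparser.
    """
    target = WEEKDAYS.index(weekday_name.lower())
    # Days until the target weekday, 0 if base already falls on it.
    days_ahead = (target - base.weekday()) % 7
    candidate = (base + timedelta(days=days_ahead)).replace(
        hour=hour, minute=minute, second=0, microsecond=0
    )
    # Same weekday but the time has already passed: move one week ahead.
    if candidate <= base:
        candidate += timedelta(days=7)
    return candidate


# Base taken from the report: Monday 2023-11-27, 15:33.
print(next_weekday(datetime(2023, 11, 27, 15, 33), "Friday", 15, 53))
```

Under that assumption the answer is 2023-12-01 15:53, which matches the v1.1.8 output; the v1.2.0 output has the right day of the month but the base's month.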

jgayfer commented 5 months ago

A workaround is to set the relative base with an explicit time zone (tested on Wednesday January 31st).

>>> import dateparser
>>> from datetime import datetime, timezone
>>> dateparser.parse("Thursday at 6pm")
datetime.datetime(2024, 1, 1, 18, 0)
>>> dateparser.parse("Thursday at 6pm", settings={"RELATIVE_BASE": datetime.now(timezone.utc)})
datetime.datetime(2024, 2, 1, 18, 0)

EDIT: This may only have worked because it's late enough in January that changing the time zone pushed the base date over into February.
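
To take the wall clock out of the comparison, the base can be pinned instead of using `datetime.now(...)`. The following is a rough sketch, assuming dateparser is installed: `PREFER_DATES_FROM` and `RELATIVE_BASE` are the settings already used in this thread, while `future_settings` and the chosen base time (noon on 2024-01-31) are assumptions made for illustration. No output is shown because it depends on the installed version.

```python
from datetime import datetime, timezone


def future_settings(base):
    """Build a settings dict with a pinned base (hypothetical helper)."""
    return {"PREFER_DATES_FROM": "future", "RELATIVE_BASE": base}


# Same calendar moment, once naive and once timezone-aware.
naive_base = datetime(2024, 1, 31, 12, 0)
aware_base = naive_base.replace(tzinfo=timezone.utc)

try:
    import dateparser
except ImportError:
    # dateparser is a third-party package; skip the comparison without it.
    dateparser = None

if dateparser is not None:
    for base in (naive_base, aware_base):
        parsed = dateparser.parse("Thursday at 6pm", settings=future_settings(base))
        print(base.tzinfo, "->", parsed)
```

If both bases yield the same month, the time zone of the base is probably not what fixed the earlier result.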

jgayfer commented 5 months ago

I have a hunch this change is what caused the issue: https://github.com/scrapinghub/dateparser/pull/1179

There is a change from datetime.datetime.utcnow() to datetime.datetime.now(datetime.timezone.utc), which moves us from a naive timestamp to a timezone-aware one. Most notably here.

However, this is purely a hunch.
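
For anyone unfamiliar with the distinction, here is a small standard-library sketch of how naive and aware datetimes differ. It uses fixed values instead of the current time so it is reproducible; `can_order` is a hypothetical helper, and nothing here shows that this difference is actually what breaks the month.

```python
from datetime import datetime, timezone

# utcnow() returns the naive kind (tzinfo is None);
# now(timezone.utc) returns the aware kind.
naive = datetime(2023, 11, 27, 15, 33)
aware = datetime(2023, 11, 27, 15, 33, tzinfo=timezone.utc)


def can_order(a, b):
    """Return True if a and b can be ordered with < (hypothetical helper)."""
    try:
        a < b
        return True
    except TypeError:
        # Python refuses to order a naive datetime against an aware one.
        return False


print(naive.tzinfo, aware.tzinfo)
print(can_order(naive, aware))  # mixing the two kinds fails
print(naive == aware)           # equality does not raise, it is just False
```

Code that mixes the two kinds either raises on ordering or quietly treats equal-looking values as different, so a change like this is a plausible place to look.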