scrapinghub / dateparser

python parser for human readable dates
BSD 3-Clause "New" or "Revised" License

Parsing "3pm" fails if RELATIVE_BASE is timezone-aware #1213

Open drefrome opened 8 months ago

drefrome commented 8 months ago

In version 1.2.0 this works fine:

import datetime
import dateparser
import pytz

dateparser.parse(
    "3pm", 
    settings={
        "TIMEZONE": "America/Los_Angeles", 
        "TO_TIMEZONE": "America/Los_Angeles", 
        "RETURN_AS_TIMEZONE_AWARE": True, 
        "RELATIVE_BASE": datetime.datetime.now(), 
        "PREFER_DATES_FROM": "future"})

but this returns None:

dateparser.parse("3pm", 
    settings={
        "TIMEZONE": "America/Los_Angeles", 
        "TO_TIMEZONE": "America/Los_Angeles", 
        "RETURN_AS_TIMEZONE_AWARE": True, 
        "RELATIVE_BASE": datetime.datetime.now(tz=pytz.timezone("America/Los_Angeles")), 
        "PREFER_DATES_FROM": "future"})

In an earlier version of dateparser, the returned date was sometimes not as expected because it would use now in UTC by default for RELATIVE_BASE, so I started passing the timezone-aware RELATIVE_BASE setting. With the upgrade to 1.2.0, the same code stopped working.

drefrome commented 8 months ago

This appears to have been a problem since 1.1.8. AFAICT, I can't get the behavior I want with a naive datetime for RELATIVE_BASE for a certain set of parse() calls, so I've downgraded back to 1.1.7.

If my local time in PST is 8:15am and ...

a) I call 1.1.7 parse("9am") with RELATIVE_BASE = timezone-aware now and the above settings, I get back 9am today PST
b) I call 1.1.7 parse("9am") with RELATIVE_BASE = naive now and the above settings, I get 9am the following day PST
c) I call 1.1.7 parse("9am") with no RELATIVE_BASE, I get the same result as (b) because that's the library default

With these settings, when it's 8:15am and I pass "9am" to parse, I expect to get back "9am" today in my current timezone, which is what (a) returns to me.

Starting with 1.1.8, (a) doesn't run anymore -- it will always return None.
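
To make the expected result in (a) concrete, here is a minimal standard-library sketch of "the next future occurrence of a clock time relative to a timezone-aware base". It does not use dateparser and is not how the library resolves times internally. The helper name `next_occurrence`, the fixed UTC-8 offset standing in for PST, and the sample date are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-8 offset stands in for PST; real code would use
# a tz database zone such as "America/Los_Angeles".
PST = timezone(timedelta(hours=-8), "PST")


def next_occurrence(base, hour, minute=0):
    """Return the first hour:minute strictly after `base`, in base's timezone.

    Hypothetical helper that mirrors the behavior expected from
    PREFER_DATES_FROM="future" with a timezone-aware RELATIVE_BASE.
    """
    candidate = base.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= base:
        # The clock time has already passed today, so roll over to tomorrow.
        candidate += timedelta(days=1)
    return candidate


# 8:15am local time, as in the scenario above (the date itself is arbitrary).
base = datetime(2024, 1, 15, 8, 15, tzinfo=PST)

print(next_occurrence(base, 9).isoformat())  # 2024-01-15T09:00:00-08:00 (today)
print(next_occurrence(base, 7).isoformat())  # 2024-01-16T07:00:00-08:00 (tomorrow)
```

With a base of 8:15am, "9am" resolves to the same day, which matches what (a) returns in 1.1.7.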