scrapinghub / dateparser

python parser for human readable dates
BSD 3-Clause "New" or "Revised" License

ISO 8601 YYYY-MM-DD parsing depends on locale #1125

Open mauvilsa opened 1 year ago

mauvilsa commented 1 year ago

ISO 8601 dates with dashes (YYYY-MM-DD) are quite common, and I would think there is little chance of confusing them with other formats. However, the parsing depends on the locale, which leads to incorrectly parsed dates. Examples:

>>> dateparser.parse('1991-05-11')
datetime.datetime(1991, 5, 11, 0, 0)  # correct
>>> dateparser.parse('1991-05-11', locales=["en"])
datetime.datetime(1991, 5, 11, 0, 0)  # correct
>>> dateparser.parse('1991-05-11', locales=["de"])
datetime.datetime(1991, 11, 5, 0, 0)  # wrong!
>>> dateparser.parse('1991-05-11', locales=["es"])
datetime.datetime(1991, 11, 5, 0, 0)  # wrong!
>>> print(dateparser.parse('1991-05-17', locales=["de"]))
None  # wrong!

Is this the expected behavior, or is it just a bug?

Note that the input can be in many formats, including ISO, which is why I want to pass locales. But I do need ISO dates to be parsed correctly.
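
A possible workaround, sketched here only as an illustration and not as dateparser's intended behavior: check for a strict YYYY-MM-DD string first and parse it with the standard library, and only hand everything else to the locale-aware parser. The names `parse_iso_first` and `fallback` are hypothetical and not part of dateparser.

```python
import re
from datetime import datetime

# Exactly four digits, dash, two digits, dash, two digits.
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def parse_iso_first(text, fallback=None):
    """Parse strict YYYY-MM-DD input as year-month-day, regardless of locale.

    `fallback` is a hypothetical hook: any callable that takes the string
    and returns a datetime or None. It is only called for non-ISO input
    (or for ISO-shaped input that is not a valid calendar date).
    """
    text = text.strip()
    if ISO_DATE.fullmatch(text):
        try:
            return datetime.strptime(text, "%Y-%m-%d")
        except ValueError:
            pass  # ISO-shaped but invalid, e.g. '1991-13-40'
    return fallback(text) if fallback is not None else None


print(parse_iso_first("1991-05-11"))  # 1991-05-11 00:00:00
print(parse_iso_first("1991-05-17"))  # 1991-05-17 00:00:00
```

With dateparser installed, the fallback could be wired up as `parse_iso_first(s, fallback=lambda t: dateparser.parse(t, locales=["de"]))`, so that ISO input never reaches the locale-dependent code path while other formats still do.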