scrapinghub / dateparser

Python parser for human-readable dates
BSD 3-Clause "New" or "Revised" License

Bare separator parsed as current date #568

Closed (lopuhin closed this issue 4 years ago)

lopuhin commented 5 years ago

In [104]: dateparser.parse('//')
Out[104]: datetime.datetime(2019, 9, 20, 0, 0)

In [105]: dateparser.parse('/')
Out[105]: datetime.datetime(2019, 9, 20, 0, 0)

In [106]: dateparser.parse('.')
Out[106]: datetime.datetime(2019, 9, 20, 0, 0)

In [108]: dateparser.parse('..')
Out[108]: datetime.datetime(2019, 9, 20, 0, 0)

In [109]: dateparser.parse('--')
Out[109]: datetime.datetime(2019, 9, 20, 0, 0)

I would expect None to be returned.

lopuhin commented 5 years ago

Some other common inputs that I think should also return None:

In [151]: dateparser.parse('of a')
Out[151]: datetime.datetime(2019, 1, 20, 0, 0)

In [152]: dateparser.parse('of an')
Out[152]: datetime.datetime(2019, 1, 20, 0, 0)

In [153]: dateparser.parse('an')
Out[153]: datetime.datetime(2019, 1, 20, 0, 0)

In [154]: dateparser.parse('a')
Out[154]: datetime.datetime(2019, 1, 20, 0, 0)

In [159]: dateparser.parse('a', settings={'STRICT_PARSING': True}, languages=['en'])
Out[159]: datetime.datetime(1900, 1, 1, 1, 0)

In [160]: dateparser.parse('an', settings={'STRICT_PARSING': True}, languages=['en'])
Out[160]: datetime.datetime(1900, 1, 1, 1, 0)

rennerocha commented 5 years ago

dateparser treats the input as valid when all of its tokens are separators, and in that case the result defaults to the current date.
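
For anyone hitting this on an affected version, a caller-side guard can reject separator-only input before it reaches the parser. The following is only a minimal sketch: `parse_or_none` is a hypothetical helper, not part of dateparser, and it does not reflect how the library itself handles the case. It assumes that any string without at least one letter or digit cannot be a date.

```python
from datetime import datetime
from typing import Callable, Optional


def parse_or_none(
    text: str,
    parser: Callable[[str], Optional[datetime]],
) -> Optional[datetime]:
    """Hypothetical guard: skip parsing when the input has no letters or digits.

    `parser` is any callable that takes a string and returns a datetime or
    None, for example `dateparser.parse`.
    """
    # Strings such as '/', '..' or '--' contain only separators, so there is
    # nothing a date could be built from. Return None instead of parsing.
    if not any(ch.isalnum() for ch in text):
        return None
    return parser(text)
```

Usage would look like `parse_or_none('//', dateparser.parse)`, which returns None without calling the parser. Note that this guard does not help with inputs like 'a' or 'of an' from the second comment, since those contain letters.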

noviluni commented 4 years ago

Hi @lopuhin. As the original ticket was fixed by @rennerocha, I created a new ticket containing your second comment, to avoid confusion: https://github.com/scrapinghub/dateparser/issues/655 :)

lopuhin commented 4 years ago

Thank you @noviluni