scrapinghub / dateparser

Python parser for human-readable dates
BSD 3-Clause "New" or "Revised" License
2.55k stars 465 forks

Written Number Parsing Not as Expected #1236

Open darakelian opened 1 month ago

darakelian commented 1 month ago

Based on the description of the library and the various examples, I had assumed that spelled-out English numbers such as "twenty", "thirty", "forty", etc. would be parsed by this library. After looking at the code, it seems that for English the library only parses the spelled-out numbers 1-12: https://github.com/scrapinghub/dateparser/blob/master/dateparser/data/date_translation_data/en.py#L789, which honestly was a surprise. Is there any way to support parsing something like "in twenty minutes"? Would you guys be open to a PR adding this extra support?

gutsytechster commented 3 days ago

Hi @darakelian

Could you please provide an example where this doesn't work? It works well for me:

>>> import dateparser
>>> from datetime import datetime
>>> datetime.now()
datetime.datetime(2024, 10, 27, 2, 10, 38, 594131)
>>> dateparser.parse('in 20 mins')
datetime.datetime(2024, 10, 27, 2, 30, 41, 701795)
>>> dateparser.parse('in 40 mins')
datetime.datetime(2024, 10, 27, 2, 50, 47, 316997)

darakelian commented 3 days ago

Hi, as I mentioned in the issue description, I am specifically seeing issues with the spelled-out versions (e.g. "twenty" instead of "20"), as can be seen here:

>>> import dateparser
>>> from datetime import datetime
>>> datetime.now()
datetime.datetime(2024, 10, 26, 20, 12, 53, 898538)
>>> dateparser.parse("in 20 minutes")
datetime.datetime(2024, 10, 26, 20, 33, 3, 641682)
>>> dateparser.parse("in twenty minutes")
>>>
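
Until spelled-out numbers are supported, one possible workaround is to normalize number words to digits before handing the string to dateparser. The following is only a minimal sketch: `words_to_digits` is a hypothetical helper, not part of dateparser, and it assumes English input with numbers from 0 to 99 written as single words or as tens plus ones ("forty-five", "forty five").

```python
import re

# Hypothetical helper, NOT part of dateparser. Covers English 0-99 only.
_UNITS = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
_TENS = ["twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"]

_UNIT_VALUES = {word: value for value, word in enumerate(_UNITS)}
_TENS_VALUES = {word: 20 + 10 * index for index, word in enumerate(_TENS)}

# Match either "<tens>[-| ]<one..nine>" / "<tens>" or a standalone 0-19 word.
_NUMBER_RE = re.compile(
    r"\b(?:(?P<tens>{tens})(?:[-\s](?P<ones>{ones}))?|(?P<unit>{units}))\b".format(
        tens="|".join(_TENS),
        ones="|".join(_UNITS[1:10]),
        units="|".join(sorted(_UNITS, key=len, reverse=True)),
    ),
    re.IGNORECASE,
)


def words_to_digits(text):
    """Replace spelled-out numbers (0-99) in `text` with their digit form."""

    def _replace(match):
        if match.group("tens"):
            value = _TENS_VALUES[match.group("tens").lower()]
            if match.group("ones"):
                value += _UNIT_VALUES[match.group("ones").lower()]
        else:
            value = _UNIT_VALUES[match.group("unit").lower()]
        return str(value)

    return _NUMBER_RE.sub(_replace, text)


print(words_to_digits("in twenty minutes"))
```

The normalized string can then be passed on, for example `dateparser.parse(words_to_digits("in twenty minutes"))`, which reduces to the "in 20 minutes" case shown in the transcript above. Note that this sketch rewrites every matching number word, so it may also change words that were not meant as quantities.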