scrapinghub / dateparser

python parser for human readable dates
BSD 3-Clause "New" or "Revised" License

search.search_dates - having a hard time with "am" and "next" (with examples) #1037

Open kamaca opened 2 years ago

kamaca commented 2 years ago

Hi all, I've noticed that when I use "am" with a bare hour (for example "1am") in search_dates, the time is not parsed: the result appears to carry the current time of day instead. When the time is written with minutes (for example "2:00am") it works fine. The installation and usage output is below, with some comments. "pm" seems to work OK. I also have an issue when searching for, say, "next monday": it returns the past Monday. Maybe it's just me; any advice is appreciated.

# installing dateparser
(venv) ~/Py/myproject pip install dateparser
Collecting dateparser
  Using cached dateparser-1.1.0-py2.py3-none-any.whl (288 kB)
Collecting pytz
  Using cached pytz-2021.3-py2.py3-none-any.whl (503 kB)
Collecting python-dateutil
  Using cached python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
Collecting regex!=2019.02.19,!=2021.8.27
  Using cached regex-2022.1.18-cp39-cp39-macosx_10_9_x86_64.whl (288 kB)
Collecting tzlocal
  Using cached tzlocal-4.1-py3-none-any.whl (19 kB)
Collecting six>=1.5
  Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Collecting pytz-deprecation-shim
  Using cached pytz_deprecation_shim-0.1.0.post0-py2.py3-none-any.whl (15 kB)
Collecting tzdata
  Using cached tzdata-2021.5-py2.py3-none-any.whl (339 kB)
Installing collected packages: tzdata, six, pytz-deprecation-shim, tzlocal, regex, pytz, python-dateutil, dateparser
Successfully installed dateparser-1.1.0 python-dateutil-2.8.2 pytz-2021.3 pytz-deprecation-shim-0.1.0.post0 regex-2022.1.18 six-1.16.0 tzdata-2021.5 tzlocal-4.1
(venv) ~/Py/myproject python3
Python 3.9.8 (main, Nov 10 2021, 09:21:22)
[Clang 13.0.0 (clang-1300.0.29.3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from dateparser import search

# - first check am vs pm
>>> search.search_dates("in 3 days at 1pm")
[('in 3 days at 1pm', datetime.datetime(2022, 1, 23, 13, 0))]
>>> search.search_dates("in 3 days at  1am")
[('in 3 days at 1am', datetime.datetime(2022, 1, 23, 16, 44, 27, 62654))]

# - second check 2am vs 2pm vs 2:00am
>>> search.search_dates("tomorrow 2pm")
[('tomorrow 2pm', datetime.datetime(2022, 1, 21, 14, 0))]
>>> search.search_dates("tomorrow 2am")
[('tomorrow 2am', datetime.datetime(2022, 1, 21, 16, 45, 9, 662876))]
>>> search.search_dates("tomorrow 2:00am")
[('tomorrow 2:00am', datetime.datetime(2022, 1, 21, 2, 0))]

# - issue with next (using today for reference)
>>> search.search_dates("today")
[('today', datetime.datetime(2022, 1, 20, 16, 56, 22, 87891))]
>>> search.search_dates("next monday")
[('monday', datetime.datetime(2022, 1, 17, 0, 0))]

Thank you for this amazing module 🙏 It's such a timesaver, and sorry if I missed a similar issue from earlier.

atharmohammad commented 2 years ago

@kamaca for "next monday", maybe you can use settings={'PREFER_DATES_FROM': 'future'}; this would give you the future Monday.

>>> from dateparser.search import search_dates
>>> search_dates("today")
[('today', datetime.datetime(2022, 1, 22, 17, 33, 59, 613985))]
>>> search_dates("next monday",settings={'PREFER_DATES_FROM': 'future'})
[('monday', datetime.datetime(2022, 1, 24, 0, 0))]
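
To make the difference between the two results concrete, here is a minimal standard-library sketch of what preferring "past" versus "future" means for a bare weekday name. It is only an illustration, not dateparser's implementation; the helper name `nearest_weekday` and its `prefer` argument are hypothetical, and it assumes the reference day itself never counts as a match.

```python
from datetime import date, timedelta


def nearest_weekday(base: date, weekday: int, prefer: str = "future") -> date:
    """Return the closest date with the given weekday (Monday=0, Sunday=6).

    Hypothetical helper for illustration only. The match is strictly
    after `base` when prefer == "future", otherwise strictly before it.
    """
    if prefer == "future":
        # Days to step forward, always in the range 1..7.
        delta = (weekday - base.weekday() - 1) % 7 + 1
        return base + timedelta(days=delta)
    # Days to step backward, always in the range 1..7.
    delta = (base.weekday() - weekday - 1) % 7 + 1
    return base - timedelta(days=delta)


MONDAY = 0
# Reference dates taken from the "today" outputs in this thread.
print(nearest_weekday(date(2022, 1, 20), MONDAY, prefer="past"))
print(nearest_weekday(date(2022, 1, 22), MONDAY, prefer="future"))
```

With the reference dates from this thread, the "past" call yields 2022-01-17 and the "future" call yields 2022-01-24, which matches the two search_dates outputs shown above.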

kamaca commented 2 years ago

@atharmohammad Thank you for the suggestion 🙏. It works great, as I will only need future dates 🙌


One additional thing I noticed: when "next" + a day of the week is combined with a bare-hour AM time such as "2am", the result is a date in the next month at midnight (2022-02-24 00:00 in the output below). Again, the same AM time written with minutes (":00") works fine. This is not a dealbreaker, as I will write AM times with minutes, but my output is below for reference:

>>> search.search_dates("next monday 2pm",settings={'PREFER_DATES_FROM': 'future'})
[('monday 2pm', datetime.datetime(2022, 1, 31, 14, 0))]
>>> search.search_dates("next monday 2am",settings={'PREFER_DATES_FROM': 'future'})
[('monday 2am', datetime.datetime(2022, 2, 24, 0, 0))]
>>> search.search_dates("next tuesday 2pm",settings={'PREFER_DATES_FROM': 'future'})
[('tuesday 2pm', datetime.datetime(2022, 1, 25, 14, 0))]
>>> search.search_dates("next tuesday 2am",settings={'PREFER_DATES_FROM': 'future'})
[('tuesday 2am', datetime.datetime(2022, 2, 24, 0, 0))]
>>> search.search_dates("next monday 2:00am",settings={'PREFER_DATES_FROM': 'future'})
[('monday 2:00am', datetime.datetime(2022, 1, 31, 2, 0))]
>>> search.search_dates("next tuesday 2:00am",settings={'PREFER_DATES_FROM': 'future'})
[('tuesday 2:00am', datetime.datetime(2022, 1, 25, 2, 0))]
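
Since times written with minutes parse correctly in the runs above, one possible workaround is to rewrite bare-hour times such as "2am" to "2:00am" before passing the text to search_dates. The sketch below uses only the standard library; the helper name `add_minutes` and the regular expression are my own assumptions, not part of dateparser, and it only handles an hour written directly against am/pm with no space in between.

```python
import re

# Matches a bare hour directly followed by am/pm, e.g. "2am" or "11PM".
# The lookbehind skips times that already have minutes ("2:00am", "2.30am")
# and avoids matching in the middle of a longer number.
_BARE_HOUR = re.compile(r"(?<![:.\d])(\d{1,2})(am|pm)\b", re.IGNORECASE)


def add_minutes(text: str) -> str:
    """Rewrite bare-hour times so they carry explicit ":00" minutes.

    Hypothetical pre-processing helper; it does not call dateparser.
    """
    return _BARE_HOUR.sub(r"\1:00\2", text)


print(add_minutes("next monday 2am"))
print(add_minutes("tomorrow 2:00am"))
```

The normalized string can then be passed on, for example as search.search_dates(add_minutes("next monday 2am"), settings={'PREFER_DATES_FROM': 'future'}). Whether that fully avoids the behavior described here depends on the dateparser version in use.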

kamaca commented 8 months ago

Duplicate of https://github.com/scrapinghub/dateparser/issues/1221