akoumjian / datefinder

Find dates inside text using Python and get back datetime objects
http://datefinder.readthedocs.org/en/latest/
MIT License

Wrong date returned for some string continuations. #107

Open gbrova opened 5 years ago

gbrova commented 5 years ago

Example:

import datefinder

str1 = "Deadline is August 10, 2018."
list(datefinder.find_dates(str1))  # [datetime.datetime(2018, 8, 10, 0, 0)]

behaves as expected, but

str2 = "Deadline is August 10, 2018. Reviewing will take about 15 to 20 days."
list(datefinder.find_dates(str2))  # [datetime.datetime(2019, 6, 15, 0, 0)]

seems to get confused by the 15 that occurs ~25 characters later: the only result is dated the 15th, and August 10, 2018 is not returned at all.

I haven't looked too carefully at the implementation, but I wonder if there's an easy hack to fix this by constraining the lookahead window (e.g. no date string can span more than X characters)?
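
In the meantime, a rough workaround outside the library is to cap the window at sentence boundaries, so no candidate date string can run from one sentence into the next. This is only a sketch: `split_into_sentences` and `find_dates_per_sentence` are hypothetical helpers I made up for illustration, not part of datefinder, and the sentence split is a crude regex that will break on abbreviations such as "Aug. 10".

```python
import re
from typing import Callable, Iterable, Iterator, List


def split_into_sentences(text: str) -> List[str]:
    """Split after '.', '!' or '?' when followed by whitespace (a crude heuristic)."""
    return [part for part in re.split(r"(?<=[.!?])\s+", text) if part]


def find_dates_per_sentence(text: str, finder: Callable[[str], Iterable]) -> Iterator:
    """Run `finder` on each sentence separately so a match cannot span sentences."""
    for sentence in split_into_sentences(text):
        yield from finder(sentence)


# Intended usage (assumes datefinder is installed):
#   import datefinder
#   list(find_dates_per_sentence(str2, datefinder.find_dates))
```

With this, "August 10, 2018." is parsed on its own, exactly as in the `str1` case. I have not checked what datefinder returns for the second sentence by itself, so the bare "15" and "20" may still produce spurious matches that need filtering.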