akoumjian / datefinder

Find dates inside text using Python and get back datetime objects
http://datefinder.readthedocs.org/en/latest/
MIT License
635 stars, 167 forks

IllegalMonthError: bad month number #121

Open rqvolkov opened 4 years ago

rqvolkov commented 4 years ago

Python 3.7. Steps to reproduce:

>>> import datefinder
>>> next(datefinder.find_dates('042 - 18755'))
Traceback (most recent call last):
  File "/home/user/Desktop/projects/.venv/lib/python3.7/site-packages/dateutil/parser/_parser.py", line 655, in parse
    ret = self._build_naive(res, default)
  File "/home/user/Desktop/projects/.venv/lib/python3.7/site-packages/dateutil/parser/_parser.py", line 1238, in _build_naive
    if cday > monthrange(cyear, cmonth)[1]:
  File "/usr/lib/python3.7/calendar.py", line 124, in monthrange
    raise IllegalMonthError(month)
calendar.IllegalMonthError: bad month number 18755; must be 1-12

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/Desktop/projects/.venv/lib/python3.7/site-packages/datefinder/__init__.py", line 31, in find_dates
    as_dt = self.parse_date_string(date_string, captures)
  File "/home/user/Desktop/projects/.venv/lib/python3.7/site-packages/datefinder/__init__.py", line 101, in parse_date_string
    as_dt = parser.parse(date_string, default=self.base_date)
  File "/home/user/Desktop/projects/.venv/lib/python3.7/site-packages/dateutil/parser/_parser.py", line 1374, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
  File "/home/user/Desktop/projects/.venv/lib/python3.7/site-packages/dateutil/parser/_parser.py", line 657, in parse
    six.raise_from(ParserError(e.args[0] + ": %s", timestr), e)
TypeError: unsupported operand type(s) for +: 'int' and 'str'

shelbyt commented 4 years ago

Same error here. Easily reproducible: try with foo = "139/29 BP " and it fails.

VDK commented 4 years ago

Same error here. Easily reproducible. Try with "Sept. 13, 1947". Would adding TypeError to the exceptions caught at line 103 be a quick fix?
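For context, the final TypeError in the traceback above appears to come from the line `ParserError(e.args[0] + ": %s", timestr)`: when the underlying exception is `calendar.IllegalMonthError`, `e.args[0]` is the integer month (18755 in the report), so adding a string to it fails. Below is a minimal standard-library sketch of that mechanism. The function names are hypothetical and this is not dateutil's actual code path; the `str(e)` variant is only one assumed way to build the message safely.

```python
import calendar


def build_message_like_traceback(year, month, timestr):
    """Hypothetical stand-in that mimics the failing expression in the traceback."""
    try:
        calendar.monthrange(year, month)
    except ValueError as e:  # IllegalMonthError is a ValueError subclass
        # e.args[0] is the int month here, so int + str raises TypeError
        return e.args[0] + ": %s"


def build_message_with_str(year, month, timestr):
    """Assumed alternative: convert the exception to text before concatenating."""
    try:
        calendar.monthrange(year, month)
    except ValueError as e:
        return str(e) + ": " + timestr
```

Calling `build_message_like_traceback(2020, 18755, "042 - 18755")` raises the same kind of TypeError as in the report, while `build_message_with_str` returns a readable message instead.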

ashishanand7 commented 3 years ago

Same here. Not fixed yet.

MichelRobitaille commented 3 years ago

Hi

I started using datefinder with great pleasure. In many cases it works nicely, but sometimes it fails: it failed in 131 cases out of 5,018. An example is in the attached .txt file (please rename it to .ipynb and run it in Jupyter).

It fails on the 1st, 3rd and 4th results below. Only the 2nd one is OK, and it should be the only one parsed and returned:

(datetime.datetime(500, 2, 25, 0, 0), '2,500')
(datetime.datetime(2011, 5, 2, 0, 0), '05/02/2011')
(datetime.datetime(1975, 2, 25, 0, 0), 'of 75')
(datetime.datetime(2021, 5, 25, 0, 0), 'may')

I am not sure, but maybe a variable needs to be reset, or something else is going on.

In the end I get a pretty long error message.

Datefinder processing error.txt

edumotya commented 3 years ago

As a workaround, you can monkeypatch the DateFinder class.

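import datefinder

class DateFinder(datefinder.DateFinder):
    def parse_date_string(self, date_string, captures):
        try:
            return super().parse_date_string(date_string, captures)
        except TypeError:
            # Treat the unparseable string as "no date found"
            return None

# Replace the class in the module so datefinder.find_dates() uses the subclass
datefinder.DateFinder = DateFinder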
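One reason to handle the error per string, as the subclass above does, rather than wrapping the whole `find_dates` generator in a try/except: once an exception propagates out of a Python generator, the generator is finished, so any later matches are lost. The sketch below illustrates this with hypothetical stand-in functions (`parse`, `find_all`, `find_all_skipping`); none of them are datefinder code, and it assumes datefinder skips strings for which `parse_date_string` returns nothing.

```python
def parse(chunk):
    """Hypothetical parser: fails with TypeError on non-numeric input."""
    if not chunk.isdigit():
        raise TypeError("simulated parser failure on %r" % chunk)
    return int(chunk)


def find_all(chunks):
    """Generator with no error handling: the first failure ends it."""
    for chunk in chunks:
        yield parse(chunk)


def find_all_skipping(chunks):
    """Generator that handles the failure per item and keeps going."""
    for chunk in chunks:
        try:
            value = parse(chunk)
        except TypeError:
            continue  # skip the bad item, keep scanning
        yield value


def collect_until_error(gen):
    """Catch the error from outside the generator; later items are never seen."""
    found = []
    try:
        for item in gen:
            found.append(item)
    except TypeError:
        pass
    return found
```

With the input `["1", "x", "3"]`, catching the error outside the generator keeps only the first item, while the per-item version also returns the item after the failure.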