arrow-py / arrow

🏹 Better dates & times for Python
https://arrow.readthedocs.io
Apache License 2.0

Weird exception when using arrow.get on malformed string #854

Closed Google-Autofuzz closed 3 years ago

Google-Autofuzz commented 4 years ago

I'm getting an exception from the re module when running the following script on the attached test case:

import arrow
import sys

with open(sys.argv[1]) as f:
    data = f.read()

fmtstr, payload = str(data[:24]), str(data[24:])
arrow.get(payload, fmtstr).humanize()

clusterfuzz-testcase-minimized-fuzz_get-5739607865688064.txt

$ cat clusterfuzz-testcase-minimized-fuzz_get-5739607865688064 
struct n[X+,N-M)MMXdMM]<(ven) 
$ python test.py clusterfuzz-testcase-minimized-fuzz_get-5739607865688064 
Traceback (most recent call last):
  File "test.py", line 8, in <module>
    arrow.get(payload, fmtstr).humanize()
  File "/home/user/ven/lib/python3.8/site-packages/arrow/api.py", line 21, in get
    return _factory.get(*args, **kwargs)
  File "/home/user/ven/lib/python3.8/site-packages/arrow/factory.py", line 246, in get
    dt = parser.DateTimeParser(locale).parse(
  File "/home/user/ven/lib/python3.8/site-packages/arrow/parser.py", line 227, in parse
    fmt_tokens, fmt_pattern_re = self._generate_pattern_re(fmt)
  File "/home/user/ven/lib/python3.8/site-packages/arrow/parser.py", line 324, in _generate_pattern_re
    return tokens, re.compile(bounded_fmt_pattern, flags=re.IGNORECASE)
  File "/usr/lib/python3.8/re.py", line 252, in compile
    return _compile(pattern, flags)
  File "/usr/lib/python3.8/re.py", line 304, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/usr/lib/python3.8/sre_compile.py", line 764, in compile
    p = sre_parse.parse(p, flags)
  File "/usr/lib/python3.8/sre_parse.py", line 962, in parse
    raise source.error("unbalanced parenthesis")
re.error: unbalanced parenthesis at position 85
$

It would be nice to catch it and raise an arrow-related exception instead.
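For illustration, here is a minimal sketch of the kind of wrapping being requested. It uses only the standard library and is not arrow's actual code: `FormatPatternError` and `compile_format_pattern` are hypothetical names, and the sketch assumes the regex is compiled in one place, as the `re.compile(..., flags=re.IGNORECASE)` call in the traceback suggests.

```python
import re


class FormatPatternError(ValueError):
    """Hypothetical library-level error; not arrow's actual exception class."""


def compile_format_pattern(pattern: str) -> "re.Pattern[str]":
    """Compile a regex that was built from a user-supplied format string.

    Any re.error is re-raised as FormatPatternError, so callers only have to
    handle one library-specific exception type.
    """
    try:
        return re.compile(pattern, flags=re.IGNORECASE)
    except re.error as exc:
        # Chain the original error so the underlying cause stays visible.
        raise FormatPatternError(
            f"Could not build a pattern from the format string: {exc}"
        ) from exc


if __name__ == "__main__":
    try:
        # An unbalanced ")" makes re.compile fail, as in the report above.
        compile_format_pattern("(ven))")
    except FormatPatternError as err:
        print(f"caught: {err}")
```

With something like this, a caller could catch a single library exception rather than having to know that `re.error` can leak out of a parsing call.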

System Info

jadchaar commented 4 years ago

Thanks for reporting this. As part of a fix, we should probably do the following: