akoumjian / datefinder

Find dates inside text using Python and get back datetime objects
http://datefinder.readthedocs.org/en/latest/
MIT License
634 stars 166 forks

Fixed unreachable branches #200

Open ieviev opened 3 months ago

ieviev commented 3 months ago

I'm very surprised no one else has noticed this in a decade, but certain parts of the pattern are fundamentally broken, so I reordered the alternations by descending length.

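The problem is that `|` in PCRE-style backtracking engines (nearly all popular engines, including Python's `re`) does not really mean "or" but rather "try the next alternative if this one fails". When a shorter alternative is a prefix of a longer one and is listed first, the shorter one matches first and the longer branch is never reached.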
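A minimal sketch of that behavior using Python's `re` module. The two-token patterns here are hypothetical and only for illustration; they are not datefinder's actual patterns.

```python
import re

# Hypothetical token lists for illustration, not datefinder's real patterns.
# Same two alternatives, listed in different order.
short_first = re.compile(r"mon|monday", re.IGNORECASE)
long_first = re.compile(r"monday|mon", re.IGNORECASE)

text = "Monday"

# Alternatives are tried left to right; the first one that succeeds wins.
short_match = short_first.match(text).group(0)
long_match = long_first.match(text).group(0)

print(short_match)  # Mon    (the "monday" branch is never reached)
print(long_match)   # Monday
```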


This behavior applies to the day, month, timezone and extra-token alternations in the pattern. Reordering the alternatives within each one fixes it.
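One way to apply this kind of reordering generically is to sort the tokens by descending length before joining them. The sketch below assumes a hypothetical helper `build_alternation` and a made-up month list; it is not necessarily how this PR implements the change.

```python
import re

def build_alternation(tokens):
    """Join tokens into one alternation group, longest first.

    Hypothetical helper for illustration; not part of datefinder's API.
    """
    # Sort by descending length (ties broken alphabetically for determinism)
    # so no token is shadowed by a shorter token that is its prefix.
    ordered = sorted(set(tokens), key=lambda t: (-len(t), t))
    return "(?:" + "|".join(re.escape(t) for t in ordered) + ")"

# Made-up sample tokens, deliberately listed shortest first.
months = ["jan", "january", "jun", "june", "sep", "sept", "september"]
pattern = re.compile(build_alternation(months), re.IGNORECASE)

found = pattern.findall("January, June and Sept")
print(found)  # ['January', 'June', 'Sept']
```

Sorting at pattern-build time keeps the token lists free to be written in any order while guaranteeing that the longest candidate is always tried first.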
