Dear Alec,
Thank you for sharing your code. It does what it was designed to do. However, while using DateFinder I sometimes ran into catastrophic regex backtracking, especially when parsing plain-text "tables" like this one: codepile sample.
The root of the problem was RANGE_REGEX. I replaced that logic by simply splitting the source text on the "to" / "through" keywords. I also simplified the main regex (DATE_REGEX) a bit.
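To illustrate the idea, here is a minimal sketch of keyword-based splitting. It is not the exact code from my change: the names `RANGE_SPLIT` and `split_range` are hypothetical, and I am assuming that matching "to" / "through" as whole words, case-insensitively, is enough for the common cases.

```python
import re

# Hypothetical delimiter pattern (an assumption, not DateFinder's own):
# "to" or "through" as whole words. It has no nested quantifiers,
# so it cannot backtrack catastrophically.
RANGE_SPLIT = re.compile(r"\b(?:to|through)\b", re.IGNORECASE)


def split_range(text):
    """Split text on range keywords and return the non-empty fragments."""
    parts = RANGE_SPLIT.split(text)
    return [part.strip() for part in parts if part.strip()]


print(split_range("January 4, 2017 through March 1, 2017"))
# ['January 4, 2017', 'March 1, 2017']
```

Each fragment can then be passed to the date-matching regex on its own, so the range logic never has to be expressed inside one large pattern.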
The plain text I referenced above took about 48 s to parse. After this code update, the same text takes only 0.007 s.
Could you consider applying my changes or, at least, changing the logic for splitting date ranges (RANGE_REGEX)?
Thank you in advance!