sanskrit-lexicon / ApteES

Apte English-Sanskrit Dictionary

Flaw in Python re.split #10

Open funderburkjim opened 4 months ago

funderburkjim commented 4 months ago

During the work under #9, a flaw was discovered in Python's re.split() function. A solution was found through a conversation with Copilot on Windows 11. See splitproblem for some details. The function 'split1' in test1.py provides what looks like a fairly general replacement for re.split(regex,text,re.DOTALL), built on re.finditer().
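For readers without test1.py at hand, here is a minimal sketch of how a finditer-based splitter can work. It is not the actual 'split1'; the name `split_with_finditer` is hypothetical, and the sketch ignores capture groups, which re.split() would include in its result.

```python
import re

def split_with_finditer(pattern, text, flags=0):
    """Hypothetical stand-in for split1: split text at every match of pattern."""
    parts = []
    last_end = 0
    # re.finditer takes flags as its third argument and has no match limit.
    for m in re.finditer(pattern, text, flags):
        parts.append(text[last_end:m.start()])  # text before this separator
        last_end = m.end()
    parts.append(text[last_end:])  # remainder after the last separator
    return parts

# 20 fields joined by 19 newline separators
sample = "\n".join(str(i) for i in range(20))
fields = split_with_finditer(r"\n", sample, re.DOTALL)
```

Because the loop visits every match, the number of pieces is not capped at any particular count.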

To see the problem, run:

python test1.py 5 # no problem for any number from 1 through 16.
python test1.py 17 # shows the problem.  Problem occurs for inputs > 16.

The code was run with Python version 3.9.1.

I hope Python will find some way of identifying (perhaps by an exception?) when re.split(regex,text,re.DOTALL) gives THE WRONG ANSWER.
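One reading that is consistent with the threshold of 16: the documented signature is re.split(pattern, string, maxsplit=0, flags=0), so a third positional argument is taken as maxsplit, and re.DOTALL has the integer value 16. The sketch below assumes test1.py makes the call in that positional form (test1.py itself is not reproduced here), and the sample text is made up for illustration.

```python
import re
import warnings

# Illustrative input: 20 fields joined by 19 separators
text = ",".join(str(i) for i in range(20))

# Third positional argument is maxsplit, so this asks for at most 16 splits.
# Newer Python versions warn about positional maxsplit; silence that here.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    positional = re.split(r",", text, re.DOTALL)

# Passing the flag by keyword leaves maxsplit at 0 (no limit).
keyword = re.split(r",", text, flags=re.DOTALL)

dotall_value = int(re.DOTALL)

print(dotall_value)        # 16
print(len(positional))     # 17
print(positional[-1])      # 16,17,18,19
print(len(keyword))        # 20
```

If this is what happens in test1.py, writing flags=re.DOTALL by keyword would avoid the truncated result without needing a replacement function.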

AFAIK: The problem (wrong answer) occurs when