nim-lang / Nim

Nim is a statically typed compiled systems programming language. It combines successful concepts from mature languages like Python, Ada and Modula. Its design focuses on efficiency, expressiveness, and elegance (in that order of priority).
https://nim-lang.org

Nim hangs forever in infinite loop in nre library #5444

Closed genotrance closed 7 years ago

genotrance commented 7 years ago

Short code example that hangs Nim:

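import httpclient
import nre

# Download the page whose content triggers the hang.
var http = newHttpClient()
var data = http.getContent("http://downloads.tomsguide.com/MusicBee,0301-27475.html")

# The pattern never matches this page, and findAll appears to hang.
echo data.findAll(re"url.*? = '(.*?)'")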

On pressing Ctrl-C, the following stack trace is printed:

Traceback (most recent call last)
http.nim(7)              http
nre.nim(521)             findAll
nre.nim(452)             matchImpl
SIGINT: Interrupted by Ctrl-C.
SIGINT: Interrupted by Ctrl-C.

Replacing the URL with google.com works fine, but with cnn.com it hangs again. I'm not sure what in the content is confusing Nim or PCRE.

napalu commented 7 years ago

The problem is in nre.findIter: when the pattern has never matched, it keeps searching until it reaches the end of the input. After each failed search it increments the offset by one and searches again, repeating until the offset reaches the length of the input. With large input this can take a long time and gives the impression that the function hangs. The fix is simple: bail out early when nothing has ever matched, instead of searching for the next match boundary.
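
The behaviour described above can be modelled with a small sketch. This is not the nre source: `searchFrom` and `findAllModel` are hypothetical names, and a plain substring search from `std/strutils` stands in for the PCRE call. It only illustrates why retrying at every offset after a failed search is wasteful, and how a "never matched" flag avoids it.

```nim
import std/strutils

# Hypothetical stand-in for one unanchored search starting at `offset`.
# Returns the index of the next occurrence of `needle`, or -1 if none.
# `calls` counts how many searches were performed.
proc searchFrom(hay, needle: string; offset: int; calls: var int): int =
  inc calls
  if offset >= hay.len:
    return -1
  result = hay.find(needle, offset)

# Simplified model of the iteration loop (an assumption, not nre's code).
# With bailEarly = false, a failed search is retried at every later offset.
# With bailEarly = true, the loop stops as soon as the first search fails.
proc findAllModel(hay, needle: string; bailEarly: bool;
                  calls: var int): seq[int] =
  var offset = 0
  var neverMatched = true
  while true:
    let pos = searchFrom(hay, needle, offset, calls)
    if pos < 0:
      # An unanchored search that fails at `offset` cannot succeed at a
      # later offset, so retrying is pointless when nothing matched yet.
      if offset >= hay.len or (bailEarly and neverMatched):
        break
      inc offset
    else:
      neverMatched = false
      result.add pos
      offset = pos + max(needle.len, 1)

when isMainModule:
  let hay = repeat('a', 1000)
  var slow = 0
  var fast = 0
  discard findAllModel(hay, "zz", false, slow)
  discard findAllModel(hay, "zz", true, fast)
  echo "searches without early exit: ", slow
  echo "searches with early exit: ", fast
```

In this model, a 1000-character input with no match costs 1001 searches without the early exit and a single search with it. Since each search can itself scan the rest of the input, the cost without the early exit grows roughly quadratically with input size, which matches the apparent hang on large pages.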

genotrance commented 7 years ago

I just tried out your change and it works great. In fact, it probably speeds up searches considerably, since searching large files and strings previously went through every offset.

I hope it gets accepted into the main branch soon.