zeusdeux / re2

Automatically exported from code.google.com/p/re2
BSD 3-Clause "New" or "Revised" License

Matching on some regexps in multiple threads does not gain performance #93

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
See the demonstration program. It requires https://github.com/axiak/pyre2, but 
could be rewritten in C.

Stdout:
Testing asyncmatch with 50000 items
Compiling compelte
Singlematch Timing in progress...
Singlematch Real 13.4749009609 Cpu 13.456841 Effectiveness 0.998659733309
Multimatch Timing in progress...
Multimatch Real 18.728771925 Cpu 35.642228 Effectiveness 1.90307341788 <======= means that TWO threads are active at a time
Testing asyncmatch with 10000 items
Compiling compelte
Singlematch Timing in progress...
Singlematch Real 1.32520699501 Cpu 1.324082 Effectiveness 0.999151079783
Multimatch Timing in progress...
Multimatch Real 2.1003010273 Cpu 2.100132 Effectiveness 0.999919522347 <======= means that only ONE thread is active at a time
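
For readers without the demonstration program: the "Effectiveness" column in the output above is consistent with CPU time divided by wall-clock time (for example 35.642228 / 18.728771925 ≈ 1.903). A minimal sketch of such a measurement using only the Python standard library follows. The names `effectiveness` and `measure` are hypothetical and not taken from the original program, and plain Python work will generally stay near 1.0 because of the interpreter lock.

```python
import threading
import time


def effectiveness(real, cpu):
    """CPU seconds consumed per wall-clock second (about N when N threads run in parallel)."""
    return cpu / real


def measure(work, n_threads):
    """Run `work` in n_threads threads and return (real, cpu, effectiveness)."""
    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    real_start = time.perf_counter()
    cpu_start = time.process_time()  # user + system CPU time of the whole process
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    real = time.perf_counter() - real_start
    cpu = time.process_time() - cpu_start
    return real, cpu, effectiveness(real, cpu)
```

A value near 2 with two threads means both were really running at once; a value near 1 means they were effectively taking turns.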

And stderr (another issue):
re2/dfa.cc:1154: DFA memory cache could be too small: only room for 275 states.

Debugging shows that the threads block each other in RunStateOnByteUnlocked(). 
Why? Everything is read-only (!). Maybe replace MutexLock with ReaderMutexLock?

Original issue reported on code.google.com by socketp...@gmail.com on 21 Oct 2013 at 8:46

Attachments:

GoogleCodeExporter commented 9 years ago
You are using an enormous regexp, and the DFA cache is too small for it. The 
threads are contending over the cache. The cache is not read-only. If you give 
the DFA more memory it should run better. The message on standard error is 
telling you this. See the set_max_mem option.
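
To illustrate why a lazily built cache is not read-only, here is a toy model in Python. It is only a sketch under stated assumptions, not RE2's actual code: the class `BoundedStateCache`, the transition formula, and the `run` helper are all made up, and RE2's real locking is more fine-grained. The budget of 275 simply echoes the number in the stderr message.

```python
import threading


class BoundedStateCache:
    """Toy stand-in for a lazily built, size-bounded state cache (hypothetical)."""

    def __init__(self, max_states):
        self.max_states = max_states
        self.states = {}
        self.lock = threading.Lock()
        self.resets = 0

    def next_state(self, state, symbol):
        key = (state, symbol)
        with self.lock:  # a lookup may insert, so every "reader" is also a writer
            if key not in self.states:
                if len(self.states) >= self.max_states:
                    self.states.clear()  # budget exhausted: discard and rebuild
                    self.resets += 1
                # Made-up deterministic transition, for illustration only.
                self.states[key] = (state * 31 + symbol + 1) % 1000
            return self.states[key]


def run(cache, symbols, n_threads=2):
    """Walk the same input from n_threads threads; return the number of cache resets."""
    def worker():
        state = 0
        for s in symbols:
            state = cache.next_state(state, s)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cache.resets
```

With 1000 distinct input symbols, `run(BoundedStateCache(275), list(range(1000)))` reports at least one reset, while a budget of 10000 reports none. In this model every reset forces all threads to rebuild entries behind the same lock, which is the kind of contention a larger memory budget is meant to avoid.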

Original comment by rsc@golang.org on 10 Jan 2014 at 3:04

GoogleCodeExporter commented 9 years ago
Okay, so why are the results better on a much bigger input?

Original comment by socketp...@gmail.com on 10 Jan 2014 at 3:46

GoogleCodeExporter commented 9 years ago
set_max_mem is set to 400 MB in my example.

Why is 400 MB not enough for that job?

It is not such a big regexp, imho.

Original comment by socketp...@gmail.com on 10 Jan 2014 at 5:59