Dushistov / sdcv

https://dushistov.github.io/sdcv/
GNU General Public License v2.0
294 stars 42 forks source link

Severe performance degradation #31

Closed Frenzie closed 3 years ago

Frenzie commented 6 years ago

Commit 3105823e8bd66b3276bca41c2975e8756da478c5 seems to be responsible for severe performance degradation. See https://github.com/koreader/koreader/pull/3378#issuecomment-338391694 for some quick & dirty benchmarks.

Also pinging @ecraven.

Edit: bisection log attached. sdcv-benchmark.log

Dushistov commented 6 years ago

@Frenzie I can not see any remarkable difference:

master.log prev.log

prev.log is for edf73656aaaad46f675b5dd8ddb27ce89d774b3e parent of 3105823 .

I compare on linux 4.13.7/amd64/gcc 7.2.0 and of course I use Release mode when build sdcv:

épuisant         épuisant               
real             0m0,212s         real  0m0,209s
user             0m0,196s         user  0m0,202s
sys              0m0,017s         sys   0m0,007s
épuisante        épuisante              
real             0m0,209s         real  0m0,203s
user             0m0,206s         user  0m0,192s
sys              0m0,003s         sys   0m0,010s
épuisantes       épuisantes             
real             0m0,212s         real  0m0,209s
user             0m0,198s         user  0m0,182s
sys              0m0,013s         sys   0m0,027s
pleurésie        pleurésie              
real             0m0,212s         real  0m0,207s
user             0m0,195s         user  0m0,177s
sys              0m0,017s         sys   0m0,030s
caparaçonnée     caparaçonnée           
real             0m0,190s         real  0m0,190s
user             0m0,173s         user  0m0,180s
sys              0m0,017s         sys   0m0,010s
pre-consonantal  pre-consonantal        
real             0m0,159s         real  0m0,158s
user             0m0,142s         user  0m0,144s
sys              0m0,016s         sys   0m0,014s
peche            peche                  
real             0m0,021s         real  0m0,021s
user             0m0,014s         user  0m0,014s
sys              0m0,007s         sys   0m0,007s
resse            resse                  
real             0m0,134s         real  0m0,135s
user             0m0,117s         user  0m0,128s
sys              0m0,017s         sys   0m0,007s
pécheresse       pécheresse             
real             0m0,203s         real  0m0,208s
user             0m0,196s         user  0m0,191s
sys              0m0,007s         sys   0m0,017s
Frenzie commented 6 years ago

I compare on linux 4.13.7/amd64/gcc 7.2.0 and of course I use Release mode when build sdcv:

Both look a bit slow to me, but I'll see if I can isolate a dictionary that shows strong differences.

of course I use Release mode when build sdcv

Could you clarify what you mean by that? I would normally assume it's debug that requires special flags. It's possible that the x86_64 build I uploaded on GitHub is debug-enabled, but to exclude anything related to the specific way I built it I explicitly tested a regular mkdir build && cmake /src/location && make before reporting anything.

Frenzie commented 6 years ago

I see this with every dictionary I try. (Now on gcc 7.2.0 on Xubuntu 17.10 instead of Debian Stretch.) It's stronger on larger dictionaries like xmllittre but it shows just as clearly on a smaller dictionary like this one.

Are you sure you rebuilt it properly?

edf73656aaaad46f675b5dd8ddb27ce89d774b3e.log master.log

ecraven commented 6 years ago

What do you compare exactly? Does the upstream version include synonym support? Depending on the number of synonyms in your dictionary, this might be noticable..

Frenzie commented 6 years ago

Yes, it's specifically post-synonym that this happens. That's how bisecting works. You test if your suspicion (synonyms) is correct and when it's not you move on. :-P

PS This performance degradation wasn't present in the patch that I submitted in May of 2016. I haven't examined the differences between that and what got merged.

PPS What do you mean "upstream version"? This is upstream. ;-)

ecraven commented 6 years ago

IIRC I rewrote all code, so the details might well differ. Synonyms of course slow things down a bit, if there are many, so I'm not sure what the exact problem is? From looking at the exact commit you posted above, it just moves code around a bit (and adds new things for json), but there should be no actual difference in control flow.

Frenzie commented 6 years ago

Forget about the synonyms. That was just an initial hypothesis which didn't pan out. Bisection clearly shows this commit as the culprit.

Frenzie commented 6 years ago

I mean that "forget out" jovially. Just looking at it written it seems a bit harsh. :-)

Anyway, what I found was that the performance impact of synonyms is basically as would be expected. It's there. It's nothing bad.

Frenzie commented 6 years ago

I must apologize. Apparently my bisection method was wrong. When you git checkout some-hash you can be on side branches. I'll have to look into how to do that the way I expected it to work.

It may be that it's all synonyms then. But the difference is huge. This is on my i7 4790, no slouch.

$ for w in `echo "épuisant épuisante épuisantes pleurésie caparaçonnée pre-consonantal pecheresse pécheresse"`; do echo $w ; time ./sdcv --utf8-input --data-dir data/dict -n $w 1>/dev/null; done
épuisant

real    0m0.081s    real    0m0.404s
user    0m0.080s    user    0m0.396s
sys 0m0.000s    sys 0m0.004s
épuisante

real    0m0.082s    real    0m0.470s
user    0m0.076s    user    0m0.456s
sys 0m0.000s    sys 0m0.012s
Dushistov commented 6 years ago

@Frenzie

Could you clarify what you mean by that? I would normally assume it's debug that requires special flags. It's possible that the x86_64 build I uploaded on GitHub is debug-enabled, but to exclude anything related to the specific way I built it I explicitly tested a regular mkdir build && cmake /src/location && make before reporting anything.

cmake without arguments use None build type, so compiler not get any additional arguments, and by default gcc and clang without argument build code without any optimiztion. You can run make VERBOSE=1 to see with which arguments cc and c++ are invoked. So to build with release mode you need: cmake -DCMAKE_BUILD_TYPE=Release path or RelWithDebInfo.

Frenzie commented 6 years ago

@Dushistov Thanks, I saw it in the CMakeCache.txt file. It looks like sdcv performs ever so slightly better with -O3 (what -DCMAKE_BUILD_TYPE=Release apparently does) than with -O2 (KOReader release flags), both of which are ~2x faster than -O0.

I can investigate if there's a sensible way to selectively apply -O3 for its minor performance gains if the same applies on ARMv7 (of course arch-dependent flags are always an option), perhaps by simply using this CMake flag dynamically, though that is entirely my "problem" and separate from this issue. :-)

Since @chrox thought the performance trade-off for fuzzy search was so bad that it warranted the introduction of #26, the simple solution could be to do the same for synonyms. The performance impact seems to be a lot larger, though like I said I'll have to bisect more properly first. Or perhaps I'll just selectively revert a couple of commits. Mea culpa.

I haven't found the time yet to investigate the code itself, but since synonyms seem to be basically just a list of some other entry points to the same definition it's unclear to me why it should be orders of magnitude slower rather than just a few times. Could fuzzy search be involved in a potentially illogical way?

rtega commented 6 years ago

Has any progress been made on this issue yet?