WPS Presentation can be run on RISC-V Lichee Pi 4A with the power of box64. And I did profiling for it, in order to find the performance bottleneck. The profiling result is shown below:
So I investigated why getSymbolInSymbolMaps (kh_get_symbolmap & kh_get_symbol2map) costs so much. I found that getSymbolInSymbolMaps calls kh_get 6 times at the worst case, and the most expensive calculation in kh_get is "hashing". Symbols are strings and current string hashing function needs to traversal the whole string.
My idea is to eliminate the duplicate "hashing", since the pass-in symbol name does not change during getSymbolInSymbolMaps.
Implementation
The optimization is straightforward. That is, just calculate the hash once and reuse the result. To achieve this, I added two APIs called kh_hash & kh_get_with_hash, and applied them for getSymbolInSymbolMaps.
Evaluation
I evaluated the start-up time of WPS Presentation. Results are shown here:
where the "logo" column means the time when the logo displays, and the "full" column means the time when the WPS Presentation fully starts up.
So this simple optimization does improve the performance a lot in my use case. And I am sending this PR and sharing the results with you.
Introduction
WPS Presentation can be run on RISC-V Lichee Pi 4A with the power of box64. And I did profiling for it, in order to find the performance bottleneck. The profiling result is shown below:
So I investigated why
getSymbolInSymbolMaps
(kh_get_symbolmap
&kh_get_symbol2map
) costs so much. I found thatgetSymbolInSymbolMaps
callskh_get
6 times at the worst case, and the most expensive calculation inkh_get
is "hashing". Symbols are strings and current string hashing function needs to traversal the whole string.My idea is to eliminate the duplicate "hashing", since the pass-in symbol name does not change during
getSymbolInSymbolMaps
.Implementation
The optimization is straightforward. That is, just calculate the hash once and reuse the result. To achieve this, I added two APIs called
kh_hash
&kh_get_with_hash
, and applied them forgetSymbolInSymbolMaps
.Evaluation
I evaluated the start-up time of WPS Presentation. Results are shown here:
where the "logo" column means the time when the logo displays, and the "full" column means the time when the WPS Presentation fully starts up.
So this simple optimization does improve the performance a lot in my use case. And I am sending this PR and sharing the results with you.