The original code reallocates the string's memory multiple times. This change reuses Qt's case folding and does all the work in QString, converting to std::u32string only for the return value.
The new version is apply2 (listed as applyFolding2 in the benchmark output below).
In a debug build, the speedup is roughly 10x:
/Users/slbtty/src/goldendict-ng/cmake-build-debug/bcf
Unable to determine clock rate from sysctl: hw.cpufrequency: No such file or directory
This does not affect benchmark measurements, only the metadata output.
***WARNING*** Failed to set thread affinity. Estimated CPU frequency may be incorrect.
2024-11-12T17:56:24-05:00
Running /Users/slbtty/src/goldendict-ng/cmake-build-debug/bcf
Run on (8 X 24 MHz CPU s)
CPU Caches:
L1 Data 64 KiB
L1 Instruction 128 KiB
L2 Unified 4096 KiB (x8)
Load Average: 3.55, 3.72, 3.48
--------------------------------------------------------
Benchmark              Time             CPU   Iterations
--------------------------------------------------------
applyFolding       45889 ns        45767 ns        15304
applyFolding2       3609 ns         3602 ns       194045
Related: https://github.com/xiaoyifang/goldendict-ng/issues/1943
With -O2 or a similar optimization level, the speedup factor is less than 2: https://github.com/xiaoyifang/goldendict-ng/actions/runs/11807278152/job/32893638497#step:7:15
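
For illustration only, here is a stdlib-only sketch of the allocation pattern described above. It is not the goldendict-ng code: foldAscii, foldWithTemporaries and foldReserved are hypothetical names, and the ASCII-only folding is a stand-in for Qt's case folding (the real change presumably goes through something like QString::toCaseFolded() and QString::toStdU32String()). The first function builds a fresh string on every step, the second allocates the result once.

```cpp
#include <string>

// Hypothetical stand-in for real Unicode case folding: handles ASCII only.
static char32_t foldAscii(char32_t c) {
    return (c >= U'A' && c <= U'Z')
        ? static_cast<char32_t>(c + (U'a' - U'A'))
        : c;
}

// Slow pattern (assumed, for illustration): a temporary string per character
// and a brand-new result string on every iteration.
std::u32string foldWithTemporaries(const std::u32string& in) {
    std::u32string out;
    for (char32_t c : in) {
        std::u32string piece(1, foldAscii(c)); // one temporary per character
        out = out + piece;                     // builds a new string each time
    }
    return out;
}

// Faster pattern: reserve the final size once, then append in place.
std::u32string foldReserved(const std::u32string& in) {
    std::u32string out;
    out.reserve(in.size()); // single up-front allocation
    for (char32_t c : in) {
        out.push_back(foldAscii(c));
    }
    return out;
}
```

Both functions return the same result. The gap between them should be largest in unoptimized builds, which would be in line with the debug vs -O2 numbers above, though this sketch was not benchmarked.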