gabime / spdlog

Fast C++ logging library.

Is there any way to accelerate the compilation? #3122

Open zhuzhzh opened 2 months ago

zhuzhzh commented 2 months ago
% time make
g++ -g example.cpp -I/home/public/spdlog/include -L/home/public/spdlog/lib64 -lspdlog -DSPDLOG_COMPILED_LIB -I/home/public/spdlog/include -lpthread -o logger
make  3.43s user 0.26s system 97% cpu 3.782 total

example.cpp is the example bundled with spdlog.

I precompiled spdlog into a shared library.

Here is the profile.

% g++ -g example.cpp -I/home/public/spdlog/include -L/home/public/spdlog/lib64 -lspdlog -DSPDLOG_COMPILED_LIB -I/home/public/spdlog/include -lpthread -o logger -ftime-report

Time variable                                   usr           sys          wall           GGC
 phase setup                        :   0.00 (  0%)   0.01 (  0%)   0.00 (  0%)  1562k (  0%)
 phase parsing                      :   1.13 ( 26%)   1.06 ( 52%)   2.20 ( 34%)   160M ( 40%)
 phase lang. deferred               :   0.80 ( 18%)   0.35 ( 17%)   1.15 ( 18%)    85M ( 21%)
 phase opt and generate             :   2.28 ( 53%)   0.62 ( 30%)   2.90 ( 45%)   150M ( 38%)
 phase last asm                     :   0.12 (  3%)   0.00 (  0%)   0.13 (  2%)  1725k (  0%)
 phase finalize                     :   0.01 (  0%)   0.01 (  0%)   0.00 (  0%)     0  (  0%)
 |name lookup                       :   0.35 (  8%)   0.26 ( 13%)   0.65 ( 10%)  6882k (  2%)
 |overload resolution               :   0.52 ( 12%)   0.24 ( 12%)   0.88 ( 14%)    59M ( 15%)
 garbage collection                 :   0.31 (  7%)   0.00 (  0%)   0.30 (  5%)     0  (  0%)
 dump files                         :   0.15 (  3%)   0.07 (  3%)   0.22 (  3%)     0  (  0%)
 callgraph construction             :   0.19 (  4%)   0.02 (  1%)   0.22 (  3%)    23M (  6%)
 callgraph optimization             :   0.04 (  1%)   0.05 (  2%)   0.05 (  1%)  4728  (  0%)
 callgraph ipa passes               :   0.16 (  4%)   0.20 ( 10%)   0.35 (  5%)    12M (  3%)
 ipa function summary               :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)   242k (  0%)
 ipa inheritance graph              :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)    14k (  0%)
 ipa inlining heuristics            :   0.02 (  0%)   0.00 (  0%)   0.05 (  1%)   384  (  0%)
 ipa pure const                     :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)     0  (  0%)
 cfg construction                   :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)   598k (  0%)
 cfg cleanup                        :   0.03 (  1%)   0.00 (  0%)   0.02 (  0%)    20k (  0%)
 trivially dead code                :   0.01 (  0%)   0.01 (  0%)   0.02 (  0%)     0  (  0%)
 df scan insns                      :   0.09 (  2%)   0.01 (  0%)   0.15 (  2%)   114k (  0%)
 df live regs                       :   0.04 (  1%)   0.01 (  0%)   0.05 (  1%)     0  (  0%)
 df reg dead/unused notes           :   0.03 (  1%)   0.02 (  1%)   0.03 (  0%)  1178k (  0%)
 register information               :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)     0  (  0%)
 alias analysis                     :   0.03 (  1%)   0.00 (  0%)   0.00 (  0%)   517k (  0%)
 register scan                      :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)  1280  (  0%)
 rebuild jump labels                :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)   240  (  0%)
 preprocessing                      :   0.09 (  2%)   0.20 ( 10%)   0.39 (  6%)  3079k (  1%)
 parser (global)                    :   0.19 (  4%)   0.30 ( 15%)   0.37 (  6%)    31M (  8%)
 parser struct body                 :   0.07 (  2%)   0.08 (  4%)   0.20 (  3%)    20M (  5%)
 parser enumerator list             :   0.00 (  0%)   0.00 (  0%)   0.02 (  0%)   268k (  0%)
 parser function body               :   0.08 (  2%)   0.09 (  4%)   0.20 (  3%)  5444k (  1%)
 parser inl. func. body             :   0.05 (  1%)   0.04 (  2%)   0.06 (  1%)  6614k (  2%)
 parser inl. meth. body             :   0.20 (  5%)   0.10 (  5%)   0.18 (  3%)    13M (  3%)
 template instantiation             :   1.15 ( 26%)   0.54 ( 26%)   1.69 ( 26%)   132M ( 33%)
 constant expression evaluation     :   0.02 (  0%)   0.03 (  1%)   0.07 (  1%)   984k (  0%)
 inline parameters                  :   0.03 (  1%)   0.05 (  2%)   0.00 (  0%)   997k (  0%)
 integration                        :   0.00 (  0%)   0.02 (  1%)   0.04 (  1%)  1699k (  0%)
 tree gimplify                      :   0.04 (  1%)   0.03 (  1%)   0.06 (  1%)  9578k (  2%)
 tree eh                            :   0.02 (  0%)   0.01 (  0%)   0.02 (  0%)  2360k (  1%)
 tree CFG construction              :   0.00 (  0%)   0.00 (  0%)   0.00 (  0%)  4430k (  1%)
 tree CFG cleanup                   :   0.04 (  1%)   0.01 (  0%)   0.04 (  1%) 10072  (  0%)
 tree PHI insertion                 :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)   490k (  0%)
 tree SSA rewrite                   :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)  2066k (  1%)
 tree SSA other                     :   0.01 (  0%)   0.02 (  1%)   0.06 (  1%)   379k (  0%)
 tree SSA incremental               :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)    32k (  0%)
 tree operand scan                  :   0.00 (  0%)   0.04 (  2%)   0.03 (  0%)  4396k (  1%)
 tree FRE                           :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)   170k (  0%)
 tree forward propagate             :   0.00 (  0%)   0.01 (  0%)   0.00 (  0%)  5864  (  0%)
 PHI merge                          :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)  2016  (  0%)
 dominance frontiers                :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)     0  (  0%)
 dominance computation              :   0.02 (  0%)   0.03 (  1%)   0.02 (  0%)     0  (  0%)
 out of ssa                         :   0.04 (  1%)   0.00 (  0%)   0.00 (  0%)   231k (  0%)
 expand vars                        :   0.01 (  0%)   0.01 (  0%)   0.01 (  0%)   968k (  0%)
 expand                             :   0.12 (  3%)   0.02 (  1%)   0.11 (  2%)    13M (  3%)
 post expand cleanups               :   0.01 (  0%)   0.01 (  0%)   0.03 (  0%)  1490k (  0%)
 varconst                           :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)  7032  (  0%)
 jump                               :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)     0  (  0%)
 forward prop                       :   0.00 (  0%)   0.00 (  0%)   0.02 (  0%)  2632  (  0%)
 CSE                                :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)  7744  (  0%)
 loop init                          :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)  2188k (  1%)
 loop fini                          :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)     0  (  0%)
 branch prediction                  :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)    62k (  0%)
 combiner                           :   0.01 (  0%)   0.01 (  0%)   0.01 (  0%)   109k (  0%)
 integrated RA                      :   0.24 (  6%)   0.05 (  2%)   0.40 (  6%)    59M ( 15%)
 LRA non-specific                   :   0.10 (  2%)   0.04 (  2%)   0.15 (  2%)   519k (  0%)
 LRA virtuals elimination           :   0.01 (  0%)   0.00 (  0%)   0.03 (  0%)  1179k (  0%)
 LRA reload inheritance             :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)  4368  (  0%)
 LRA create live ranges             :   0.03 (  1%)   0.00 (  0%)   0.04 (  1%)    15k (  0%)
 reload                             :   0.02 (  0%)   0.00 (  0%)   0.00 (  0%)    57k (  0%)
 thread pro- & epilogue             :   0.06 (  1%)   0.00 (  0%)   0.04 (  1%)  4127k (  1%)
 shorten branches                   :   0.06 (  1%)   0.00 (  0%)   0.03 (  0%)   240  (  0%)
 reg stack                          :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)  3504  (  0%)
 final                              :   0.08 (  2%)   0.03 (  1%)   0.14 (  2%)  5623k (  1%)
 symout                             :   0.22 (  5%)   0.04 (  2%)   0.32 (  5%)    36M (  9%)
 initialize rtl                     :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)    12k (  0%)
 early local passes                 :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)     0  (  0%)
 rest of compilation                :   0.21 (  5%)   0.03 (  1%)   0.26 (  4%)  6528k (  2%)
 unaccounted late compilation       :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)     0  (  0%)
 repair loop structures             :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)     0  (  0%)
 TOTAL                              :   4.34          2.05          6.38          399M
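
As a rough way to read the report above, the top-level `phase` rows can be turned into shares of the total wall time. The sketch below hard-codes the wall-clock values from the table. The names `Phase`, `reported_phases`, and `wall_share` are made up for illustration and are not part of GCC or spdlog.

```cpp
#include <string>
#include <vector>

// Wall-clock seconds copied from the "phase" rows of the -ftime-report output.
struct Phase {
    std::string name;
    double wall_seconds;
};

// Hypothetical helper: the phase rows as reported, in table order.
inline std::vector<Phase> reported_phases() {
    return {
        {"setup", 0.00},
        {"parsing", 2.20},
        {"lang. deferred", 1.15},
        {"opt and generate", 2.90},
        {"last asm", 0.13},
        {"finalize", 0.00},
    };
}

// Fraction of the reported TOTAL wall time (6.38 s) spent in the named phases.
inline double wall_share(const std::vector<Phase>& phases,
                         const std::vector<std::string>& names,
                         double total_wall = 6.38) {
    double sum = 0.0;
    for (const auto& p : phases)
        for (const auto& n : names)
            if (p.name == n) sum += p.wall_seconds;
    return sum / total_wall;
}
```

With these numbers, `wall_share(reported_phases(), {"parsing", "lang. deferred"})` comes to roughly 0.52 (2.20 + 1.15 = 3.35 s of 6.38 s), and `{"opt and generate"}` alone to roughly 0.45, which matches the 34% + 18% and 45% printed in the wall column.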

tt4g commented 2 months ago

Using the latest version of the fmt library instead of the copy bundled with spdlog may reduce compile time.

zhuzhzh commented 2 months ago

I switched to the latest external fmt library, but I didn't see any improvement.

The first line in example.cpp is now:

#define SPDLOG_FMT_EXTERNAL

I commented out user_defined_example(), which otherwise causes a compilation error.

Then I recompiled the example.

g++ -g example.cpp -I/home/public/fmt/include -L/home/public/fmt/lib64 -lfmt -I/home/public/spdlog/include -L/home/public/spdlog/lib64 -lspdlog -DSPDLOG_COMPILED_LIB    -I/home/public/spdlog/include -lpthread -o logger
make  3.16s user 0.18s system 97% cpu 3.451 total
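
To put the two `time` results side by side, the relative change in wall time can be computed directly. This is a minimal sketch; `relative_reduction` is a hypothetical helper name, and the inputs are the "total" values from the two runs quoted in this thread.

```cpp
// Relative reduction in wall time between two measurements:
// (before - after) / before. A positive result means the second run was faster.
inline double relative_reduction(double before_seconds, double after_seconds) {
    return (before_seconds - after_seconds) / before_seconds;
}
```

`relative_reduction(3.782, 3.451)` is about 0.088, i.e. roughly 0.33 s or 9% less wall time. The second build also had user_defined_example() commented out, so the two runs are not strictly like-for-like.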

tt4g commented 2 months ago

In that case, the compile time probably cannot be reduced any further.