hrydgard / ppsspp

A PSP emulator for Android, Windows, Mac and Linux, written in C++. Want to contribute? Join us on Discord at https://discord.gg/5NJB6dD or just send pull requests / issues. For discussion use the forums at forums.ppsspp.org.
https://www.ppsspp.org

Compile time reduction #18086

Open fp64 opened 11 months ago

fp64 commented 11 months ago

Platform

None

Compiler and build tool versions

gcc (Debian 10.2.1-6) 10.2.1 20210110

Operating system version

Linux

Build commands used

./b.sh

What happens

PPSSPP probably builds reasonably fast for a project of its size, but I decided to investigate a bit where the time is spent, and here are the results, in case anyone is interested.

For clang there is https://github.com/aras-p/ClangBuildAnalyzer (which I haven't used), but since I'm on GCC, I've cobbled together some data from the -H and -ftime-report flags. I don't know what comparable tools exist for MSVC (which seems to take the longest to compile on CI).

Note: the setup I used required a single-threaded build (-j1), so the timings reflect that.

The time (wall clock, in seconds) spent on a full build is roughly (some smaller ones excluded):

Time variable                                        wall
 phase setup                          :             20.51
 phase parsing                        :           2267.75
 phase lang. deferred                 :            291.61
 phase opt and generate               :            931.60
 phase finalize                       :             25.38
 |name lookup                         :            302.33
 |overload resolution                 :            287.19
 callgraph construction               :             32.88
 callgraph optimization               :             26.07
 callgraph functions expansion        :            724.35
 callgraph ipa passes                 :            229.71
 alias stmt walking                   :             36.66
 preprocessing                        :           1067.02
 parser (global)                      :            358.35
 parser struct body                   :            179.14
 parser function body                 :            106.72
 parser inl. func. body               :             70.41
 parser inl. meth. body               :            150.36
 template instantiation               :            646.69
 constant expression evaluation       :             38.48
 inline parameters                    :             11.11
 tree gimplify                        :             12.71
 tree CFG cleanup                     :             20.02
 tree VRP                             :             27.77
 tree Early VRP                       :             14.37
 dominator optimization               :             32.59
 tree CCP                             :             17.56
 tree FRE                             :             33.25
 tree forward propagate               :             10.35
 expand                               :             18.92
 CSE                                  :             27.55
 CSE 2                                :             16.34
 dead store elim1                     :             15.99
 dead store elim2                     :             10.43
 combiner                             :             27.79
 integrated RA                        :             61.49
 scheduling 2                         :             56.70
 final                                :             17.95
 rest of compilation                  :             30.40
 TOTAL                                :           3029.43

If I'm reading this right (the numbers do not add up, so some things are probably counted under several categories), roughly 2/3 of the time is spent in "phase parsing" and 1/3 in "phase opt and generate". "preprocessing" accounts for about half of "phase parsing". "template instantiation" accounts for about 20% of the total time (and is probably also included in some of the categories above).
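The ratios above can be re-derived from the table with a few lines of Python. This is only a sketch: the numbers are copied from the report above, and the `share` helper and the choice of denominators are illustrative assumptions, not something GCC prescribes.

```python
# Wall-clock seconds copied from the -ftime-report summary above.
wall = {
    "phase setup": 20.51,
    "phase parsing": 2267.75,
    "phase lang. deferred": 291.61,
    "phase opt and generate": 931.60,
    "phase finalize": 25.38,
    "preprocessing": 1067.02,
    "template instantiation": 646.69,
    "TOTAL": 3029.43,
}

def share(part, whole):
    """Return wall[part] / wall[whole] as a percentage (hypothetical helper)."""
    return round(100.0 * wall[part] / wall[whole], 1)

# Sum of the five top-level "phase ..." rows, to compare against TOTAL.
phase_sum = round(sum(v for k, v in wall.items() if k.startswith("phase ")), 2)

print("sum of phases vs TOTAL:", phase_sum, wall["TOTAL"])
print("phase parsing / TOTAL (%):", share("phase parsing", "TOTAL"))
print("phase opt and generate / TOTAL (%):", share("phase opt and generate", "TOTAL"))
print("preprocessing / phase parsing (%):", share("preprocessing", "phase parsing"))
print("template instantiation / TOTAL (%):", share("template instantiation", "TOTAL"))
```

With these figures the five phases sum to about 3537 s against a reported TOTAL of 3029 s. So "phase parsing" is about 75% of TOTAL, or about 64% of the phase sum. Which base is the right one depends on how GCC accounts for the overlap, and that is not verified here.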

The most-included (possibly indirectly) files in PPSSPP are, unsurprisingly, system headers:

   8584 /usr/lib/gcc/i686-linux-gnu/10/include/stddef.h
   5836 /usr/include/i386-linux-gnu/bits/wordsize.h
   3992 /usr/include/i386-linux-gnu/bits/libc-header-start.h
   3341 /usr/include/i386-linux-gnu/bits/mathcalls-narrow.h
   2062 /usr/include/i386-linux-gnu/bits/mathcalls.h
   1780 /usr/include/c++/10/cstdlib
   1747 /usr/include/c++/10/cstring
   1544 /usr/include/c++/10/cwchar
   1452 /usr/lib/gcc/i686-linux-gnu/10/include/stdarg.h
   1431 /usr/include/i386-linux-gnu/bits/long-double.h
   1206 /usr/include/c++/10/cstdio
   1115 /usr/include/c++/10/cerrno
   1036 /usr/include/i386-linux-gnu/bits/mathcalls-helper-functions.h
    976 /usr/include/c++/10/new

The most-included of PPSSPP's own headers are (the first of these ranks 169th on the overall list):

    496 ppsspp/ppsspp_config.h
    450 ppsspp/Common/CommonTypes.h
    445 ppsspp/Common/CommonFuncs.h
    417 ppsspp/Common/Log.h
    336 ppsspp/Common/Common.h
    311 ppsspp/Common/File/Path.h
    283 ppsspp/Common/Swap.h
    255 ppsspp/Core/Opcode.h
    223 ppsspp/Core/MemMap.h
    190 ppsspp/Common/StringUtils.h
    186 ppsspp/Core/MIPS/MIPS.h
    186 ppsspp/Common/Data/Random/Rng.h
    171 ppsspp/Core/ConfigValues.h
    160 ppsspp/Core/CoreParameter.h
    160 ppsspp/Core/Compatibility.h
    158 ppsspp/Core/System.h
    157 ppsspp/Core/Config.h
    143 ppsspp/Common/LogReporting.h
    123 ppsspp/Common/GPU/Shader.h
    123 ppsspp/Common/Data/Collections/FastVec.h
    122 ppsspp/Common/GPU/DataFormat.h
    120 ppsspp/Common/GPU/thin3d.h
    120 ppsspp/Common/GPU/MiscTypes.h
    120 ppsspp/Common/Data/Collections/Slice.h
    116 ppsspp/ext/libzip/zipconf.h
    116 ppsspp/ext/libzip/zip.h
    115 ppsspp/Common/MemoryUtil.h
    112 ppsspp/ext/libzip/zipint.h
    112 ppsspp/ext/libzip/config.h
    112 ppsspp/ext/libzip/compat.h
    107 ppsspp/Common/Serialize/Serializer.h
     97 ppsspp/GPU/ge_constants.h
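Counts like the two lists above can be produced by tallying the lines that GCC's -H flag prints. A minimal sketch follows. It assumes each -H line is a run of dots (one per nesting level), a space, then the path, and that all the output was captured into one log. The function names, the `ppsspp/` prefix filter and the sample log are made up for illustration.

```python
import re
from collections import Counter

# Assumed shape of a `gcc -H` line: dots for nesting depth, a space, the path.
HEADER_LINE = re.compile(r"^(\.+) (\S.*)$")

def count_includes(lines):
    """Count how many times each header path appears in -H output."""
    counts = Counter()
    for line in lines:
        m = HEADER_LINE.match(line.rstrip("\n"))
        if m:  # skip anything that is not an include-tree line
            counts[m.group(2)] += 1
    return counts

def ranked(counts, prefix=""):
    """Return (overall_rank, count, path) for paths that start with prefix."""
    ordered = counts.most_common()
    return [(rank, n, path)
            for rank, (path, n) in enumerate(ordered, start=1)
            if path.startswith(prefix)]

# Hypothetical log fragment, not real PPSSPP output.
sample_log = """\
. ppsspp/Common/Common.h
.. ppsspp/Common/CommonTypes.h
... /usr/include/stddef.h
.. /usr/include/stddef.h
. /usr/include/stddef.h
. ppsspp/Common/Log.h
.. ppsspp/Common/CommonTypes.h
""".splitlines()

counts = count_includes(sample_log)
for rank, n, path in ranked(counts, prefix="ppsspp/"):
    print(f"{n:7d} {path}  (overall rank {rank})")
```

Filtering by prefix after ranking keeps the overall rank, which is how a statement like "the first of the project's own headers ranks Nth overall" can be checked.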

GCC also suggested some files which could benefit from include guards. Most of them are GCC's own system headers, and most of the remainder are in the /ext/ folder. What is left is listed below (the count is how many times the file was suggested, not how many times it is included):

     36 ppsspp/Common/Input/GestureDetector.h
     25 ppsspp/Common/Data/Format/JSONReader.h
     21 ppsspp/Common/Data/Format/JSONWriter.h
     20 ppsspp/Core/MIPS/IR/IRNativeCommon.h
     12 ppsspp/Core/SaveState.h
      5 ppsspp/ffmpeg/linux/x86/include/libavutil/common.h
      4 ppsspp/GPU/GeDisasm.h
      4 ppsspp/Core/WebServer.h
      4 ppsspp/Core/Util/DisArm64.h
      3 ppsspp/UI/CwCheatScreen.h
      2 ppsspp/GPU/Debugger/GECommandTable.h
      2 ppsspp/Core/HW/AsyncIOManager.h
      1 ppsspp/Core/HLE/KernelThreadDebugInterface.h
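The suggestion counts can be tallied the same way. The sketch below assumes that each compiler invocation's -H output ends with a "Multiple include guards may be useful for:" banner followed by one path per line, and that the log holds nothing but -H output. The function name and the sample log are hypothetical.

```python
import re
from collections import Counter

GUARD_BANNER = "Multiple include guards may be useful for:"
TREE_LINE = re.compile(r"^\.+ ")  # an include-tree line starts a new listing

def guard_suggestions(lines):
    """Yield every path listed after a guard banner in concatenated -H logs."""
    in_section = False
    for raw in lines:
        line = raw.strip()
        if line == GUARD_BANNER:
            in_section = True
        elif in_section:
            if not line or TREE_LINE.match(line):
                in_section = False  # the next translation unit's tree begins
            else:
                yield line

# Hypothetical log covering two translation units.
sample_log = """\
. ppsspp/Core/SaveState.h
.. ppsspp/Common/Common.h
Multiple include guards may be useful for:
ppsspp/Core/SaveState.h
ppsspp/GPU/GeDisasm.h
. ppsspp/Core/SaveState.h
Multiple include guards may be useful for:
ppsspp/Core/SaveState.h
""".splitlines()

suggested = Counter(guard_suggestions(sample_log))
for path, n in suggested.most_common():
    print(f"{n:7d} {path}")
```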

On a different note, I also tried GCC's -pipe option, and the results were unimpressive (it might have been about 10% faster or so).

I'm not sure how useful any of this data is, or what specific changes it suggests, but maybe it helps with something.

PPSSPP version affected

v1.12.3-7271-g42741430b

Last working version

No response

hrydgard commented 11 months ago

That's a very curious tag you pasted, heh: "v1.12.3-7271-g42741430b". Looks like you're missing a lot of tags :)

But yeah, it might be time for another round of include cleanup. I don't really have a problem with the build time on my current machines, though, so it doesn't feel super urgent.

fp64 commented 11 months ago

A little more info on what actually pulls in the includes. "Unique" is how many unique files the header includes (directly or indirectly); "total" is how many #includes it causes during the entire build. Top 65 entries:

    indirect   |      direct    |
  total| unique|   total| unique| file
-------|-------|--------|-------|------------------------------------------------------------------------------------
  13330|    313|     269|     18| ppsspp/ext/armips/ext/filesystem/include/ghc/filesystem.hpp
  13072|    254|      52|      1| ppsspp/ext/armips/Util/FileSystem.h
  13020|    253|      52|      1| ppsspp/ext/armips/ext/filesystem/include/ghc/fs_fwd.hpp
   6991|     67|     423|      4| ppsspp/ext/libzip/zipint.h
   6408|     39|     883|      4| ppsspp/Common/Common.h
   6369|    175|     544|      8| ppsspp/Common/Serialize/Serializer.h
   6347|    258|      74|      2| ppsspp/ext/armips/Core/Types.h
   5907|    255|      24|      1| ppsspp/ext/armips/Util/ByteArray.h
   5678|    216|     843|     11| ppsspp/Common/GPU/thin3d.h
   5128|     27|     821|      4| ppsspp/Common/CommonFuncs.h
   5026|    164|     474|      5| ppsspp/Common/StringUtils.h
   4037|    134|     288|      5| ppsspp/Common/Data/Convert/SmallDataConvert.h
   3926|     47|     297|      4| ppsspp/ext/libzip/zip.h
   3781|    267|      83|      4| ppsspp/ext/armips/Core/Expression.h
   3710|    198|     346|      7| ppsspp/Core/Config.h
   3578|    161|      56|      3| ppsspp/Common/CPUDetect.h
   3547|    248|     175|      7| ppsspp/Common/Render/DrawBuffer.h
   3167|     28|     247|      1| ppsspp/Common/Log.h
   2938|    265|      57|      3| ppsspp/ext/armips/Archs/Architecture.h
   2860|    223|     155|      6| ppsspp/Common/Data/Text/I18n.h
   2776|     95|     131|      2| ppsspp/Common/Math/lin/vec3.h
   2582|     96|      82|      1| ppsspp/Common/Math/lin/matrix4x4.h
   2360|    275|      34|      3| ppsspp/ext/armips/Archs/ARM/Arm.h
   2295|    173|     153|      3| ppsspp/Common/File/FileUtil.h
   2294|    224|      14|      2| ppsspp/ext/SPIRV-Cross/spirv_common.hpp
   2280|    222|      86|      9| ppsspp/ext/SPIRV-Cross/spirv_cross_containers.hpp
   2267|     93|     319|      6| ppsspp/Core/HLE/HLE.h
   2256|    299|      80|      2| ppsspp/ext/armips/Core/Common.h
   2167|     97|      42|      1| ppsspp/Common/System/Display.h
   2098|    286|      98|      4| ppsspp/ext/armips/Core/SymbolTable.h
   2089|    261|      32|      3| ppsspp/ext/armips/Core/ELF/ElfFile.h
   1985|    145|     156|      7| ppsspp/Core/MIPS/IR/IRJit.h
   1963|    267|      87|      6| ppsspp/Core/Debugger/WebSocket/WebSocketUtils.h
   1788|    226|      16|      2| ppsspp/ext/SPIRV-Cross/spirv_cfg.hpp
   1730|    291|     133|     14| ppsspp/Common/GPU/OpenGL/GLRenderManager.h
   1652|     61|     420|      4| ppsspp/Core/MIPS/MIPS.h
   1644|     17|     112|      1| ppsspp/ext/libzip/config.h
   1643|     60|      60|      1| ppsspp/Common/Math/geom2d.h
   1620|    119|      55|      4| ppsspp/Core/MIPS/MIPSVFPUUtils.h
   1613|    156|      28|      3| ppsspp/ext/SPIRV-Cross/spirv_cross_error_handling.hpp
   1584|    194|      50|      4| ppsspp/Common/System/System.h
   1559|    264|      41|      6| ppsspp/Core/MIPS/RiscV/RiscVJit.h
   1558|    266|      20|      2| ppsspp/ext/armips/Core/ELF/ElfRelocator.h
   1553|    228|      21|      3| ppsspp/ext/SPIRV-Cross/spirv_cross.hpp
   1545|    284|     104|      5| ppsspp/ext/armips/Util/FileClasses.h
   1538|     16|     113|      1| ppsspp/ext/libzip/zipconf.h
   1520|     51|     287|      2| ppsspp/Common/Swap.h
   1469|    177|     113|      3| ppsspp/Common/Serialize/SerializeFuncs.h
   1460|    257|     242|     12| ppsspp/Common/UI/View.h
   1436|    115|     199|      6| ppsspp/GPU/Math3D.h
   1394|     69|      28|      2| ppsspp/Core/MIPS/MIPSCodeUtils.h
   1360|    253|      40|      6| ppsspp/Core/MIPS/x86/X64IRJit.h
   1327|    172|      63|      2| ppsspp/Core/Core.h
   1303|     96|      70|      4| ppsspp/Core/MIPS/IR/IRFrontend.h
   1302|     62|     481|      7| ppsspp/Core/MemMap.h
   1285|    100|     192|      3| ppsspp/Common/Math/math_util.h
   1245|    216|      78|      5| ppsspp/Common/Data/Format/IniFile.h
   1107|     23|      86|      2| ppsspp/Core/ConfigValues.h
   1099|    171|     193|      3| ppsspp/Core/System.h
   1097|     49|      84|      2| ppsspp/Common/Data/Random/Rng.h
   1088|    252|      28|      4| ppsspp/Common/UI/UI.h
   1080|    152|      61|      6| ppsspp/Common/UI/Context.h
   1042|    230|      10|      2| ppsspp/ext/SPIRV-Cross/spirv_glsl.hpp
   1012|     74|     121|      7| ppsspp/GPU/Common/DrawEngineCommon.h
    996|    180|      94|      5| ppsspp/Common/Input/InputState.h
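One way to derive the four columns above from -H output is to rebuild the include tree from the dot depth and credit each header to all of its ancestors. This is a sketch under assumptions: the line format is dots, a space, then the path; "indirect" is taken to include the direct children; and `include_stats` and the sample log are made-up names and data, not the script that produced the table.

```python
import re
from collections import defaultdict

HEADER_LINE = re.compile(r"^(\.+) (\S.*)$")

def include_stats(lines):
    """Map each including header to its direct/indirect totals and unique sets."""
    stats = defaultdict(lambda: {"direct_total": 0, "direct": set(),
                                 "indirect_total": 0, "indirect": set()})
    stack = []  # chain of open headers; stack[d - 1] is the header at depth d
    for line in lines:
        m = HEADER_LINE.match(line.rstrip("\n"))
        if not m:
            stack.clear()  # any other line ends the current include tree
            continue
        depth, path = len(m.group(1)), m.group(2)
        del stack[depth - 1:]  # close headers at this depth or deeper
        for i, ancestor in enumerate(stack):
            s = stats[ancestor]
            s["indirect_total"] += 1
            s["indirect"].add(path)
            if i == len(stack) - 1:  # the immediate parent
                s["direct_total"] += 1
                s["direct"].add(path)
        stack.append(path)
    return stats

# Hypothetical include tree.
sample_log = """\
. a.h
.. b.h
... c.h
.. c.h
. b.h
.. c.h
""".splitlines()

stats = include_stats(sample_log)
for path, s in sorted(stats.items(), key=lambda kv: -kv[1]["indirect_total"]):
    print(f"{s['indirect_total']:7d}|{len(s['indirect']):7d}|"
          f"{s['direct_total']:8d}|{len(s['direct']):7d}| {path}")
```

Headers that include nothing never show up as an ancestor, so they are absent from the result, which matches a table that only lists files causing includes.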

hrydgard commented 11 months ago

If you feel up for it, maybe you could hop over to the armips project and see if there's some low-hanging fruit there. It looks like there might be some...

unknownbrackets commented 11 months ago

> (which seems to take the longest to compile on CI).

FWIW, this is simply because we don't use ccache or clcache or anything on MSVC, but the others do.

Data/Random/Rng.h is at least easy to get off that list.

-[Unknown]

fp64 commented 11 months ago

> maybe it's possible to hop over to the armips project and see if there's some low hanging fruit there

Quoting https://github.com/Kingcom/armips/pull/177:

> This uses an implementation of the std::filesystem interface instead of the handwritten functions in various places. I've also added a test to verify the behavior of includes.
>
> std::filesystem is not widely supported on all platforms yet, hence the third party library. However, there's a CMake option to switch to an official std::filesystem implementation. Either implementation is pretty rough on compile times, so an option for precompiled headers is also added.

The option ARMIPS_PRECOMPILE_HEADERS seems to be off by default.

I only showed PPSSPP's own headers in the table above, but the file that actually induces the most #includes seems to be <string> (maybe it's mostly included by .cpp files rather than by headers):

    indirect   |      direct    |
  total| unique|   total| unique| file
-------|-------|--------|-------|------------------------------------------------------------------------------------
  46128|    154|    3542|     10| /usr/include/c++/10/string
  25507|     72|    2178|      5| /usr/include/c++/10/bits/basic_string.h
  21097|    121|     832|      4| /usr/include/c++/10/algorithm
  19855|     43|    5800|     10| /usr/include/stdlib.h
  15938|     32|    5512|     11| /usr/include/i386-linux-gnu/sys/types.h
  15507|     49|    1053|      3| /usr/include/c++/10/cstdlib
  15325|     91|     705|      6| /usr/include/c++/10/cmath
  13529|     38|    2052|      4| /usr/include/c++/10/ext/string_conversions.h
  13330|    313|     269|     18| ppsspp/ext/armips/ext/filesystem/include/ghc/filesystem.hpp
  13072|    254|      52|      1| ppsspp/ext/armips/Util/FileSystem.h
  13020|    253|      52|      1| ppsspp/ext/armips/ext/filesystem/include/ghc/fs_fwd.hpp
  11020|    177|     534|      3| /usr/include/c++/10/istream
  10908|    173|     791|      7| /usr/include/c++/10/ios
  10880|     56|    1725|      5| /usr/include/c++/10/bits/stl_algo.h
  10308|     33|    1011|      2| /usr/include/c++/10/ext/atomicity.h
  10210|    171|    1018|      6| /usr/include/c++/10/mutex
   9701|     33|     514|      1| /usr/include/i386-linux-gnu/c++/10/bits/gthr.h
   9562|    118|    1941|     10| /usr/include/c++/10/functional
   9553|     44|    1252|      4| /usr/include/c++/10/bits/char_traits.h
   9187|     32|     514|      1| /usr/include/i386-linux-gnu/c++/10/bits/gthr-default.h
   9023|    170|     582|      5| /usr/include/c++/10/fstream
   8948|     16|     878|      2| /usr/lib/gcc/i686-linux-gnu/10/include/limits.h
   8891|     36|    2179|      6| /usr/include/pthread.h
   8809|     19|    8330|     12| /usr/include/math.h
   8218|     23|    7344|     13| /usr/include/stdio.h
   7799|     26|    4884|     11| /usr/include/c++/10/bits/stl_algobase.h
   7461|    120|     822|      3| /usr/include/c++/10/system_error
   6991|     67|     423|      4| ppsspp/ext/libzip/zipint.h
   6459|     28|     558|      2| /usr/include/c++/10/cstdio
   6429|     35|     444|      2| /usr/include/c++/10/pstl/glue_algorithm_defs.h
   6408|     39|     883|      4| ppsspp/Common/Common.h
   6369|    175|     544|      8| ppsspp/Common/Serialize/Serializer.h