c3lang / c3c

Compiler for the C3 language
https://c3-lang.org
GNU Lesser General Public License v3.0

Use LLVM Polly for better optimizations #384

Open data-man opened 2 years ago

data-man commented 2 years ago

LLVM opt has additional Polly options:

Polly is a high-level loop and data-locality optimizer and optimization infrastructure for LLVM. It uses an abstract mathematical representation based on integer polyhedra to analyze and optimize the memory access pattern of a program. We currently perform classical loop transformations, especially tiling and loop fusion, to improve data-locality. Polly can also exploit OpenMP-level parallelism and expose SIMDization opportunities. Work has also been done in the area of automatic GPU code generation.
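To make "tiling" concrete, here is a minimal hand-written sketch in C of the kind of loop transformation described above. It is not Polly output; the function names, `N`, and `TILE` are assumptions chosen for illustration, and the tile size is assumed to divide the matrix dimension evenly.

```c
#include <stddef.h>

#define N 64     /* matrix dimension (illustrative assumption) */
#define TILE 16  /* tile size (illustrative; assumed to divide N evenly) */

/* Plain triple loop computing c += a * b on row-major N x N matrices. */
void matmul_naive(const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            for (size_t k = 0; k < N; k++)
                c[i * N + j] += a[i * N + k] * b[k * N + j];
}

/* Same computation, tiled by hand: the three outer loops walk over
 * TILE x TILE blocks, the three inner loops work inside one block, so
 * the data touched by the inner loops is more likely to stay in cache. */
void matmul_tiled(const double *a, const double *b, double *c)
{
    for (size_t ii = 0; ii < N; ii += TILE)
        for (size_t kk = 0; kk < N; kk += TILE)
            for (size_t jj = 0; jj < N; jj += TILE)
                for (size_t i = ii; i < ii + TILE; i++)
                    for (size_t k = kk; k < kk + TILE; k++)
                        for (size_t j = jj; j < jj + TILE; j++)
                            c[i * N + j] += a[i * N + k] * b[k * N + j];
}
```

Both functions accumulate the products for each output element in the same order of `k`, so they produce identical results; only the memory access pattern differs. Whether the tiled form is actually faster depends on the cache sizes and the chosen tile size.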

Polly Options:
Configure the polly loop optimizer

  --polly                                                               - Enable the polly optimizer (with -O1, -O2 or -O3)
  --polly-2nd-level-tiling                                              - Enable a 2nd level loop of loop tiling
  --polly-ast-print-accesses                                            - Print memory access functions
  --polly-context=<isl parameter set>                                   - Provide additional constraints on the context parameters
  --polly-dce-precise-steps=<int>                                       - The number of precise steps between two approximating iterations. (A value of -1 schedules another approximation stage before the actual dead code elimination.
  --polly-delicm-max-ops=<int>                                          - Maximum number of isl operations to invest for lifetime analysis; 0=no limit
  --polly-detect-full-functions                                         - Allow the detection of full functions
  --polly-dump-after                                                    - Dump module after Polly transformations into a file suffixed with "-after"
  --polly-dump-after-file=<string>                                      - Dump module after Polly transformations to the given file
  --polly-dump-before                                                   - Dump module before Polly transformations into a file suffixed with "-before"
  --polly-dump-before-file=<string>                                     - Dump module before Polly transformations to the given file
  --polly-enable-simplify                                               - Simplify SCoP after optimizations
  --polly-ignore-func=<string>                                          - Ignore functions that match a regex. Multiple regexes can be comma separated. Scop detection will ignore all functions that match ANY of the regexes provided.
  --polly-isl-arg=<argument>                                            - Option passed to ISL
  --polly-on-isl-error-abort                                            - Abort if an isl error is encountered
  --polly-only-func=<string>                                            - Only run on functions that match a regex. Multiple regexes can be comma separated. Scop detection will run on all functions that match ANY of the regexes provided.
  --polly-only-region=<identifier>                                      - Only run on certain regions (The provided identifier must appear in the name of the region's entry block
  --polly-only-scop-detection                                           - Only run scop detection, but no other optimizations
  --polly-optimized-scops                                               - Polly - Dump polyhedral description of Scops optimized with the isl scheduling optimizer and the set of post-scheduling transformations is applied on the schedule tree
  --polly-parallel                                                      - Generate thread parallel code (isl codegen only)
  --polly-parallel-force                                                - Force generation of thread parallel code ignoring any cost model
  --polly-pattern-matching-based-opts                                   - Perform optimizations based on pattern matching
  --polly-postopts                                                      - Apply post-rescheduling optimizations such as tiling (requires -polly-reschedule)
  --polly-pragma-based-opts                                             - Apply user-directed transformation from metadata
  --polly-pragma-ignore-depcheck                                        - Skip the dependency check for pragma-based transformations
  --polly-process-unprofitable                                          - Process scops that are unlikely to benefit from Polly optimizations.
  --polly-register-tiling                                               - Enable register tiling
  --polly-report                                                        - Print information about the activities of Polly
  --polly-reschedule                                                    - Optimize SCoPs using ISL
  --polly-show                                                          - Highlight the code regions that will be optimized in a (CFG BBs and LLVM-IR instructions)
  --polly-show-only                                                     - Highlight the code regions that will be optimized in a (CFG only BBs)
  --polly-stmt-granularity=<value>                                      - Algorithm to use for splitting basic blocks into multiple statements
    =bb                                                                 -   One statement per basic block
    =scalar-indep                                                       -   Scalar independence heuristic
    =store                                                              -   Store-level granularity
  --polly-target=<value>                                                - The hardware to target
    =cpu                                                                -   generate CPU code
  --polly-tiling                                                        - Enable loop tiling
  --polly-vectorizer=<value>                                            - Select the vectorization strategy
    =none                                                               -   No Vectorization
    =polly                                                              -   Polly internal vectorizer
    =stripmine                                                          -   Strip-mine outer loops for the loop-vectorizer to trigger

It would be great if c3c would also use Polly.

lerno commented 2 years ago

Polly is pretty new (comparatively speaking), and the LTO/ThinLTO part isn't integrated yet, nor is lld currently working correctly on all CI targets. I would prefer not to include Polly until it's actually used in the compiler.

data-man commented 2 years ago

> Polly is pretty new

Hmm?

commit 758053788bde4747953f5f276ded345cd01323b1
Author: Tobias Grosser <grosser@fim.uni-passau.de>
Date: Fri Apr 29 06:27:02 2011 +0000

Add initial version of Polly

This version is equivalent to commit ba26ebece8f5be84e9bd6315611d412af797147e in the old git repository.

llvm-svn: 130476

lerno commented 2 years ago

Yeah, it's still not part of the main LLVM libraries I believe, and not all benchmarks necessarily show improvements, although some do.

data-man commented 2 years ago

Clang has an -mllvm <value> option. It would be nice to have it in c3c. :)

lerno commented 2 years ago

Unfortunately, -mllvm is implemented by Clang itself, so c3c would have to implement all of that functionality by hand if it were added.
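As a rough idea of what "by hand" could involve, here is a minimal sketch in C. It assumes the compiler collects the forwarded options itself and hands them to LLVM's command-line parser through the LLVM-C function `LLVMParseCommandLineOptions` from `llvm-c/Support.h`. The helpers `build_llvm_argv` and `forward_llvm_options`, the `HAVE_LLVM` macro, and the overview string are hypothetical and are not part of c3c. Polly's options would also only be recognized if Polly is linked into the LLVM build in use.

```c
#include <stdlib.h>
#include <string.h>

#ifdef HAVE_LLVM            /* hypothetical build flag, not a real c3c macro */
#include <llvm-c/Support.h> /* declares LLVMParseCommandLineOptions */
#endif

/* Hypothetical helper: build a NULL-terminated argv consisting of a
 * program name followed by the options the user asked to forward.
 * The caller owns the returned array and must free() it. */
const char **build_llvm_argv(const char *prog, const char *const *opts,
                             size_t n_opts, int *argc_out)
{
    const char **argv = malloc((n_opts + 2) * sizeof *argv);
    if (!argv) return NULL;
    argv[0] = prog;
    for (size_t i = 0; i < n_opts; i++) argv[i + 1] = opts[i];
    argv[n_opts + 1] = NULL;
    *argc_out = (int)(n_opts + 1);
    return argv;
}

/* Hypothetical helper: forward the collected options to LLVM.
 * Returns 0 on success, -1 if the argv array could not be allocated. */
int forward_llvm_options(const char *const *opts, size_t n_opts)
{
    int argc = 0;
    const char **argv = build_llvm_argv("c3c", opts, n_opts, &argc);
    if (!argv) return -1;
#ifdef HAVE_LLVM
    /* LLVM's parser reports unknown options itself and may exit. */
    LLVMParseCommandLineOptions(argc, argv, "forwarded LLVM options");
#endif
    free(argv);
    return 0;
}
```

A driver could then gather every value given to a hypothetical forwarding flag into an array such as `{"--polly", "--polly-tiling"}` and call `forward_llvm_options` once before creating any LLVM passes. Without `HAVE_LLVM` defined, the sketch only builds and frees the argument array.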

lerno commented 2 months ago

Are there any benchmarks on Polly?