Open zmodem opened 7 months ago
The person who filed this on our side said the time trace suggests the time is spent in ARM Instruction Selection.
@llvm/issue-subscribers-backend-arm
Author: Hans (zmodem)
x86 also has a similar issue. Twiddling counters from many cores in a hot loop will take us (and the processor) to hell.
For coverage, I suggest using a bitmap instead of counters. https://discourse.llvm.org/t/rfc-region-branch-coverage-by-bitmap/79629
Re: single-byte counters, I guess they wouldn't work with -fcoverage-mcdc. As you know, it has a performance issue as well. https://discourse.llvm.org/t/rfc-single-byte-counters-for-source-based-code-coverage/75685
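To illustrate the difference between the two schemes, here is a minimal sketch in C. It is not how Clang or the linked RFC implement instrumentation; `NUM_REGIONS`, `count_region`, `mark_region`, and `region_covered` are hypothetical names, and the atomics are only there to make the multi-core cost explicit.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_REGIONS 128 /* hypothetical number of instrumented regions */

/* Counter scheme: one 64-bit counter per region, written on every visit.
 * In a hot loop, every core keeps writing to the same cache line. */
static _Atomic uint64_t region_counters[NUM_REGIONS];

static void count_region(unsigned region) {
    atomic_fetch_add_explicit(&region_counters[region], 1,
                              memory_order_relaxed);
}

/* Bitmap scheme: one bit per region, recording only "visited or not". */
static _Atomic uint8_t region_bitmap[(NUM_REGIONS + 7) / 8];

static void mark_region(unsigned region) {
    _Atomic uint8_t *byte = &region_bitmap[region / 8];
    uint8_t mask = (uint8_t)(1u << (region % 8));
    /* Load first: once the bit is set, later visits only read,
     * so the cache line can stay shared between cores. */
    if ((atomic_load_explicit(byte, memory_order_relaxed) & mask) == 0)
        atomic_fetch_or_explicit(byte, mask, memory_order_relaxed);
}

static bool region_covered(unsigned region) {
    uint8_t byte = atomic_load_explicit(&region_bitmap[region / 8],
                                        memory_order_relaxed);
    return ((byte >> (region % 8)) & 1u) != 0;
}
```

The trade-off in this sketch is that the bitmap answers only whether a region ran, not how often, which is enough for coverage but not for profile-guided optimization.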
@llvm/issue-subscribers-backend-x86
Sorry, I mistakenly thought this was about an issue in coverage-instrumented binaries. I've removed X86.
I haven't reproduced this since I cannot set up a Thumb environment.
Attached is a reproducer from Chromium: formatutilsgl.ii.gz
Without runtime counter relocation it compiles in 14 s:
With runtime counter relocation it takes 8 minutes:
(This uses Clang built at a0b3dbaf4b3c01dc7f0a83fce059a26360b58eb2)