freedomlayer / offset

Offset payment engine
https://www.offsetcredit.org

kcov tweaking #213

Open pzmarzly opened 5 years ago

pzmarzly commented 5 years ago

After #207, I kept tweaking the kcov settings to understand how they work. I ran kcov on my local machine and measured it with perf stat ./kcov-script.sh.

The --{include,exclude}-{path,pattern}= options filter source files. Per the kcov man page, each takes a comma-separated list of patterns. When none of these options is present, all files are used (including dependencies downloaded into $HOME/.cargo). Include and exclude filter the list of files that kcov finds in the DWARF debug information. They work roughly like this (Rust pseudocode):

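// Keep only the files whose path contains `text` (plain substring match).
fn include(files: Vec<String>, text: &str) -> Vec<String> {
    files.into_iter().filter(|x| x.contains(text)).collect()
}
// Drop every file whose path contains `text`.
fn exclude(files: Vec<String>, text: &str) -> Vec<String> {
    files.into_iter().filter(|x| !x.contains(text)).collect()
}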

The {include,exclude}-path options are much slower than {include,exclude}-pattern. --include-path=components adds about 60 seconds to the kcov run on my machine, and it would probably be more on Travis.

Since we use --include-pattern=components, only files whose path contains components are picked. If any dependency of offst ever happens to have a components directory, its test coverage would influence ours. That's why --exclude-path=/usr,~/.cargo may be a good idea. --exclude-pattern=/usr/,/.cargo/ may be an even better one (patterns are faster, see the paragraph above), as long as we don't create a folder called "usr" in the offst codebase.
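To make the collision concrete, here is a minimal sketch that reuses the substring filters from the pseudocode above. The file paths are hypothetical and only stand in for what kcov might find in the debug information; real kcov matching may differ in detail.

```rust
/// Keep only the paths that contain `text` (plain substring match).
fn include(files: Vec<String>, text: &str) -> Vec<String> {
    files.into_iter().filter(|f| f.contains(text)).collect()
}

/// Drop every path that contains `text`.
fn exclude(files: Vec<String>, text: &str) -> Vec<String> {
    files.into_iter().filter(|f| !f.contains(text)).collect()
}

/// Hypothetical file list; none of these paths come from the real project.
fn sample_files() -> Vec<String> {
    vec![
        "/home/user/offst/components/funder/src/lib.rs".to_string(),
        "/home/user/.cargo/registry/src/some-dep/src/components/mod.rs".to_string(),
        "/usr/lib/rustlib/src/libcore/option.rs".to_string(),
    ]
}

fn main() {
    // Including "components" alone also keeps the dependency's `components` directory.
    let only_include = include(sample_files(), "components");
    println!("include only: {:?}", only_include);

    // Excluding "/.cargo/" and "/usr/" first leaves only the project's own file.
    let filtered = include(
        exclude(exclude(sample_files(), "/.cargo/"), "/usr/"),
        "/components/",
    );
    println!("exclude + include: {:?}", filtered);
}
```

With the include filter alone, the dependency file under .cargo still matches; adding the two exclude patterns removes it.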

The --verify flag makes kcov 3-4 times slower, but it was needed to prevent kcov from segfaulting when working with Rust binaries. I tried removing it and nothing broke. I'm using the latest version of kcov (CI is not), so this may be worth investigating.

kcov runs single-threaded. I tried using GNU parallel like this:

find \
    target/${TARGET}/debug \
    -maxdepth 1 -executable -type f \
    | parallel \
    -j0 \
    'kcov ${FLAGS} target/kcov {}'

-j0 tells parallel to run as many jobs at once as possible (without -j it would run one job per CPU core). This was not a good idea: kcov crashes a lot when multiple processes try to access target/kcov at once.
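One possible way around the shared-directory problem would be to give each binary its own output directory and combine the reports afterwards with kcov --merge. This has not been tested against the crashes described here, so the following is only a sketch under that assumption: it builds the argument lists and does not run kcov. The directory name target/kcov-parts and the binary names are made up for illustration.

```rust
use std::path::{Path, PathBuf};

/// Output directory for one test binary: `<base>/<file name of exe>`.
fn output_dir_for(base: &Path, exe: &Path) -> PathBuf {
    let name = exe.file_name().expect("executable path must end in a file name");
    base.join(name)
}

/// Arguments for one kcov run that writes into its own directory.
fn kcov_args(flags: &[&str], base: &Path, exe: &Path) -> Vec<String> {
    let mut args: Vec<String> = flags.iter().map(|f| f.to_string()).collect();
    args.push(output_dir_for(base, exe).display().to_string());
    args.push(exe.display().to_string());
    args
}

/// Arguments for the final `kcov --merge <merged> <dir>...` step.
fn merge_args(merged: &Path, dirs: &[PathBuf]) -> Vec<String> {
    let mut args = vec!["--merge".to_string(), merged.display().to_string()];
    args.extend(dirs.iter().map(|d| d.display().to_string()));
    args
}

fn main() {
    // Hypothetical names, chosen only for this example.
    let base = Path::new("target/kcov-parts");
    let exes = [
        Path::new("target/debug/foo-1a2b"),
        Path::new("target/debug/bar-3c4d"),
    ];
    let flags = ["--include-pattern=/components/"];

    // One independent command line per binary; these could run concurrently.
    for exe in &exes {
        println!("kcov {}", kcov_args(&flags, base, exe).join(" "));
    }

    // A single merge step afterwards writes the combined report.
    let dirs: Vec<PathBuf> = exes.iter().map(|e| output_dir_for(base, e)).collect();
    println!("kcov {}", merge_args(Path::new("target/kcov"), &dirs).join(" "));
}
```

Whether the extra merge step costs less than it saves would have to be measured.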

So, given these pieces of information, I would suggest trying:

# Run each test binary under kcov, one at a time, into the shared target/kcov directory.
for exe in ${exes}; do
    "${HOME}/install/kcov-${TARGET}/bin/kcov" \
        --exclude-pattern=/usr/,/.cargo/ \
        --include-pattern=/components/ \
        target/kcov \
        "${exe}"
done

Or maybe just --exclude-pattern=/.cargo/, in case we ever create a usr directory. Or maybe --exclude-path=/usr --exclude-pattern=/.cargo/: it is costly, but it should still be better than what we have now, especially without --verify.

For now I'm just writing down what I know and asking for opinions.

pzmarzly commented 5 years ago
Here is some raw perf data, though I stopped writing these down after a while:

exclude-path=/usr/lib, include-pattern=components:

502 722 540 131 cycles:u # 2,628 GHz
938 866 219 880 instructions:u # 1,87 insn per cycle
196 347 602 771 branches:u # 1026,580 M/sec
944 884 723 branch-misses:u # 0,48% of all branches
192,142756433 seconds time elapsed
171,348535000 seconds user
19,665885000 seconds sys

508 614 330 007 cycles:u # 2,602 GHz
956 887 911 221 instructions:u # 1,88 insn per cycle
199 661 484 191 branches:u # 1021,454 M/sec
974 111 279 branch-misses:u # 0,49% of all branches
205,933535450 seconds time elapsed
174,700801000 seconds user
20,496185000 seconds sys

exclude-pattern=.cargo,/usr/lib,/usr/include, include-pattern=components:

527 363 364 339 cycles:u # 2,665 GHz
1 020 114 023 227 instructions:u # 1,93 insn per cycle
219 241 979 840 branches:u # 1108,083 M/sec
959 441 722 branch-misses:u # 0,44% of all branches
198,525452254 seconds time elapsed
178,927531000 seconds user
18,662000000 seconds sys

521 149 352 588 cycles:u # 2,681 GHz
1 020 128 746 640 instructions:u # 1,96 insn per cycle
219 169 578 086 branches:u # 1127,297 M/sec
954 710 720 branch-misses:u # 0,44% of all branches
195,037822089 seconds time elapsed
175,606380000 seconds user
18,568160000 seconds sys

534 233 029 385 cycles:u # 2,623 GHz
1 039 055 998 492 instructions:u # 1,94 insn per cycle
222 766 246 980 branches:u # 1093,891 M/sec
1 005 201 666 branch-misses:u # 0,45% of all branches
214,076562770 seconds time elapsed
182,133361000 seconds user
21,183972000 seconds sys

no excluding, include-pattern=components:

484 364 587 437 cycles:u # 2,631 GHz
896 546 598 743 instructions:u # 1,85 insn per cycle
184 306 478 649 branches:u # 1001,245 M/sec
935 019 075 branch-misses:u # 0,51% of all branches
184,719943556 seconds time elapsed
164,056328000 seconds user
19,781724000 seconds sys

484 506 001 646 cycles:u # 2,608 GHz
896 392 646 798 instructions:u # 1,85 insn per cycle
184 241 117 317 branches:u # 991,917 M/sec
935 686 658 branch-misses:u # 0,51% of all branches
186,310238671 seconds time elapsed
164,075560000 seconds user
21,448010000 seconds sys

no excluding, include-pattern=components, no --verify:

125 316 008 058 cycles:u # 2,118 GHz
257 735 773 328 instructions:u # 2,06 insn per cycle
53 752 244 390 branches:u # 908,676 M/sec
243 768 284 branch-misses:u # 0,45% of all branches
60,037292691 seconds time elapsed
43,022770000 seconds user
16,022707000 seconds sys

137 964 867 712 cycles:u # 2,051 GHz
277 232 735 130 instructions:u # 2,01 insn per cycle
57 384 474 140 branches:u # 853,080 M/sec
282 029 913 branch-misses:u # 0,49% of all branches
77,665052384 seconds time elapsed
50,387391000 seconds user
16,689920000 seconds sys

125 757 150 324 cycles:u # 2,035 GHz
257 753 640 643 instructions:u # 2,05 insn per cycle
53 747 234 096 branches:u # 869,874 M/sec
246 906 156 branch-misses:u # 0,46% of all branches
62,577967780 seconds time elapsed
43,139634000 seconds user
18,549244000 seconds sys

no excluding, include-path=components, no --verify:

169 819 872 622 cycles:u # 1,568 GHz
329 052 244 303 instructions:u # 1,94 insn per cycle
66 823 697 408 branches:u # 617,162 M/sec
349 166 910 branch-misses:u # 0,52% of all branches
128,141396496 seconds time elapsed
66,956083000 seconds user
41,113681000 seconds sys

exclude-path=/usr/lib, include-pattern=components, no --verify:

181 960 949 127 cycles:u # 1,694 GHz
388 474 214 750 instructions:u # 2,13 insn per cycle
85 292 009 590 branches:u # 794,146 M/sec
319 028 745 branch-misses:u # 0,37% of all branches
107,968285312 seconds time elapsed
67,742567000 seconds user
39,503044000 seconds sys

no excluding, no including - tested dependencies in .cargo:

844 235 831 447 cycles:u # 2,305 GHz
1 395 705 577 507 instructions:u # 1,65 insn per cycle
271 353 559 049 branches:u # 740,828 M/sec
1 663 067 006 branch-misses:u # 0,61% of all branches
367,885771958 seconds time elapsed
302,566904000 seconds user
63,210475000 seconds sys

no excluding, include-path=components:

518 250 495 990 cycles:u # 2,291 GHz
946 301 339 101 instructions:u # 1,83 insn per cycle
193 243 362 302 branches:u # 854,231 M/sec
1 003 924 977 branch-misses:u # 0,52% of all branches
226,966495513 seconds time elapsed
181,909074000 seconds user
44,039785000 seconds sys

513 380 789 124 cycles:u # 2,274 GHz
942 930 097 539 instructions:u # 1,84 insn per cycle
192 717 280 378 branches:u # 853,730 M/sec
980 727 974 branch-misses:u # 0,51% of all branches
226,464357247 seconds time elapsed
179,707696000 seconds user
45,766969000 seconds sys
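As a quick cross-check of the "--verify makes kcov 3-4 times slower" estimate, the sketch below averages the timings from the two "no excluding, include-pattern=components" groups above (with and without --verify) and prints the ratios. The numbers are copied from the perf output; the constant and function names are only for this example, and with two or three runs per group the result is a rough indication at best.

```rust
/// Timings in seconds for "no excluding, include-pattern=components" (with --verify).
const VERIFY_ELAPSED: [f64; 2] = [184.719943556, 186.310238671];
const VERIFY_USER: [f64; 2] = [164.056328, 164.07556];

/// Same filters, but without --verify.
const PLAIN_ELAPSED: [f64; 3] = [60.037292691, 77.665052384, 62.57796778];
const PLAIN_USER: [f64; 3] = [43.02277, 50.387391, 43.139634];

/// Arithmetic mean of a slice of timings.
fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

fn main() {
    // Ratio of average wall-clock time with and without --verify.
    let elapsed_ratio = mean(&VERIFY_ELAPSED) / mean(&PLAIN_ELAPSED);
    // Ratio of average user CPU time with and without --verify.
    let user_ratio = mean(&VERIFY_USER) / mean(&PLAIN_USER);

    println!("elapsed ratio: {:.2}", elapsed_ratio);
    println!("user ratio:    {:.2}", user_ratio);
}
```

On these few runs the slowdown comes out at roughly 2.8x by wall-clock time and roughly 3.6x by user time, so the 3-4x figure matches the user-time numbers more closely.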