Open jwaldmann opened 2 years ago
The top functions (nearly 50 percent of all time!) are
640 bit_cut_integer :: Integer -> (Integer, Bool);
641 bit_cut_integer k =
642 (if k == (0 :: Integer) then ((0 :: Integer), False)
643 else (case divMod (abs k) (abs (2 :: Integer)) of {
644 (r, s) ->
645 ((if (0 :: Integer) < k then r else negate r - s),
646 s == (1 :: Integer));
647 }));
649 data Char = Char Bool Bool Bool Bool Bool Bool Bool Bool;
650
651 char_of_integer :: Integer -> Char;
652 char_of_integer k =
653 (case bit_cut_integer k of {
654 (q0, b0) ->
655 (case bit_cut_integer q0 of {
656 (q1, b1) ->
657 (case bit_cut_integer q1 of {
658 (q2, b2) ->
659 (case bit_cut_integer q2 of {
660 (q3, b3) ->
661 (case bit_cut_integer q3 of {
662 (q4, b4) ->
663 (case bit_cut_integer q4 of {
664 (q5, b5) ->
665 (case bit_cut_integer q5 of {
666 (q6, b6) ->
667 let {
668 a = bit_cut_integer q6;
669 } in (case a of {
670 (_, aa) -> Char b0 b1 b2 b3 b4 b5 b6 aa;
671 });
672 });
673 });
674 });
675 });
676 });
677 });
678 });
679
certainly we can do better here?
similar numbers for a proof with semantic labeling (data/53216.*)
bit_cut_integer Ceta CeTA-2.41/generated/Haskell/Ceta.hs:(641,1)-(647,12) 28.2 15.9 10758 5307373312
char_of_integer Ceta CeTA-2.41/generated/Haskell/Ceta.hs:(652,1)-(678,4) 14.5 44.2 5544 14712247808
integer_of_char Ceta CeTA-2.41/generated/Haskell/Ceta.hs:(699,1)-(711,14) 12.7 7.2 4829 2382942800
equal_char Ceta CeTA-2.41/generated/Haskell/Ceta.hs:(5846,1)-(5849,74) 10.6 0.0 4049 0
of_bool Ceta CeTA-2.41/generated/Haskell/Ceta.hs:(681,1)-(682,21) 4.8 0.0 1825 32
explode Ceta CeTA-2.41/generated/Haskell/Ceta.hs:(10699,1)-(10702,8) 4.7 11.7 1803 3902383424
remove_comments_aux Ceta CeTA-2.41/generated/Haskell/Ceta.hs:(11012,1)-(11027,30) 1.8 2.5 682 833716032
less_eq_nat Ceta CeTA-2.41/generated/Haskell/Ceta.hs:1102:1-54 1.2 0.0 450 96
take Ceta CeTA-2.41/generated/Haskell/Ceta.hs:(9844,1)-(9846,74) 0.8 1.2 293 389883824
many Ceta CeTA-2.41/generated/Haskell/Ceta.hs:(11097,1)-(11100,24) 0.7 1.8 284 600852640
char_of_integer.a Ceta CeTA-2.41/generated/Haskell/Ceta.hs:668:33-54 0.7 1.8 284 585357312
main.\ Main src/Main.hs:(19,30)-(46,102) 0.6 2.1 247 694727640
bindb Ceta CeTA-2.41/generated/Haskell/Ceta.hs:(11038,1)-(11041,14) 0.6 1.3 224 419442280
explode.ord Ceta CeTA-2.41/generated/Haskell/Ceta.hs:10701:15-53 0.1 1.2 43 390238208
Hm. That code is coming straight from Isabelle: https://isabelle.in.tum.de/dist/library/HOL/HOL/String.html#String.char_of_integer_def|fact
CeTA-2.42 (just released) improved the encoding of characters, and this greatly reduces run-times.
E.g., down from 13.7 seconds, to 3.6 seconds:
ceta-postproc data/53216.xml.cpf data/53216.xml
starexec-result="YES"
certification-result="CERTIFIED"
certification-time="13.7"
output-size="17211756"
~/.cabal/bin/ceta-postproc data/53216.xml.cpf data/53216.xml
starexec-result="YES"
certification-result="CERTIFIED"
certification-time="3.6"
output-size="17211756"
I will add profiling data.
profiling data confirms it's much better now, see https://github.com/jwaldmann/ceta-postproc/tree/master/data/ceta-2.42
Tue Jun 14 17:22 2022 Time and Allocation Profiling Report (Final)
ceta-postproc +RTS -P -H -RTS data/53216.xml.cpf data/53216.xml
total time = 7.92 secs (7919 ticks @ 1000 us, 1 processor)
total alloc = 9,385,493,864 bytes (excludes profiling overheads)
COST CENTRE MODULE SRC %time %alloc ticks bytes
equal_char Ceta CeTA-2.42/generated/Haskell/Ceta.hs:5871:1-55 12.8 0.0 1015 240
integer_of_ci Ceta CeTA-2.42/generated/Haskell/Ceta.hs:691:1-36 7.6 0.0 598 496
integer_of_char Ceta CeTA-2.42/generated/Haskell/Ceta.hs:694:1-48 6.5 0.0 511 0
remove_comments_aux Ceta CeTA-2.42/generated/Haskell/Ceta.hs:(11151,1)-(11163,30) 4.3 8.9 339 833716048
explode Ceta CeTA-2.42/generated/Haskell/Ceta.hs:(10836,1)-(10839,8) 3.8 22.9 303 2146311488
bindc Ceta CeTA-2.42/generated/Haskell/Ceta.hs:11183:1-47 2.7 0.0 215 48
rep_isom Ceta CeTA-2.42/generated/Haskell/Ceta.hs:667:1-57 2.6 10.4 202 975595520
char_integer_of_integer_aux Ceta CeTA-2.42/generated/Haskell/Ceta.hs:(671,1)-(674,19) 2.2 4.2 177 390238208
bindb Ceta CeTA-2.42/generated/Haskell/Ceta.hs:(11174,1)-(11177,14) 2.2 4.5 174 419442280
exactly_aux Ceta CeTA-2.42/generated/Haskell/Ceta.hs:(11285,1)-(11290,69) 2.1 2.5 170 233918528
Hm ... equal_char
at the top. Seems natural, it's a parser, so it wants to check the next character. But - efficient parsers (lexers) would use a jump table instead (indexed by the next character)?
see also #25
for verifying a long-ish SRS loop:
input and full output: see data/ directory (53216. and babababaa-aaaabababababa.)
executable built with