Samsung / walrus

WebAssembly Lightweight RUntime
Apache License 2.0

Fix benchmarker csv format, time and memory results output at same time + Emscripten O2 #248

Closed by vorosl 2 weeks ago

vorosl commented 1 month ago

I've added the -O3 flag to Emscripten.

zherczeg commented 1 month ago

x86 64-bit results on my machine:

        Test case        |     interpreter  |    baseline_jit  |    regalloc_jit  |
-------------------------+------------------+------------------+------------------+
                 change  |           5.42s  |   2.14s (2.54x)  |   1.98s (2.74x)  |
              factorial  |           6.48s  |   1.28s (5.07x)  |  0.38s (17.21x)  |
               fannkuch  |           5.93s  |   1.38s (4.29x)  |   0.75s (7.89x)  |
              fibonacci  |           5.44s  |   3.37s (1.61x)  |   3.25s (1.67x)  |
                gregory  |           5.25s  |   3.19s (1.65x)  |   1.80s (2.91x)  |
                  hanoi  |           8.07s  |   2.20s (3.67x)  |   1.49s (5.43x)  |
               heapsort  |           5.56s  |   1.61s (3.46x)  |   0.73s (7.65x)  |
                huffman  |           4.41s  |   1.24s (3.55x)  |   0.50s (8.77x)  |
            kNucleotide  |           4.89s  |   1.62s (3.02x)  |  0.47s (10.31x)  |
        mandelbrotFloat  |         100.53s  |  10.62s (9.46x)  |  3.11s (32.30x)  |
       mandelbrotDouble  |           7.49s  |   4.76s (1.57x)  |   2.35s (3.19x)  |
         matrixMultiply  |          10.18s  |   2.13s (4.78x)  |  0.67s (15.26x)  |
             miniWalrus  |           2.63s  |   0.62s (4.26x)  |   0.28s (9.50x)  |
                  nbody  |           5.42s  |   1.16s (4.68x)  |   0.66s (8.27x)  |
                nqueens  |           6.34s  |   1.29s (4.90x)  |   0.78s (8.17x)  |
                  prime  |           5.11s  |   1.62s (3.15x)  |  0.25s (20.45x)  |
              quickSort  |           5.82s  |   1.53s (3.81x)  |   0.62s (9.33x)  |
               redBlack  |           0.01s  |   0.01s (0.69x)  |   0.01s (0.62x)  |
                    rsa  |           6.30s  |   3.47s (1.81x)  |   2.69s (2.34x)  |
               salesman  |           6.13s  |   1.24s (4.96x)  |  0.59s (10.44x)  |
    simdMandelbrotFloat  |           6.11s  |   1.50s (4.09x)  |  0.36s (16.89x)  |
   simdMandelbrotDouble  |           5.78s  |   1.28s (4.51x)  |  0.43s (13.54x)  |
              simdNbody  |           4.76s  |   1.10s (4.33x)  |  0.37s (13.01x)  |
     simdMatrixMultiply  |           5.33s  |   1.75s (3.04x)  |   0.69s (7.70x)  |
              ticTacToe  |           5.87s  |   1.73s (3.40x)  |   1.25s (4.70x)  |
-------------------------+------------------+------------------+------------------+
     Average speedup     |                  |           3.69x  |           9.61x  |

Most tests run for 4-6s with the interpreter, which is pretty good. A few run faster, but some are much slower, e.g. mandelbrotFloat takes 100s. It seems redBlack does not work at all, even with the interpreter. Probably #242 has hit again.

I will check 32-bit next.
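A minimal sketch of how the observation above could be checked automatically. It sorts interpreter run times into buckets around the 4-6s window mentioned in the comment. The function name, the bucket names and the 0.1s "suspect" cutoff are assumptions made for illustration; they are not part of the walrus benchmark script. The times are a subset copied from the 64-bit table.

```python
# Interpreter times in seconds, copied from a few rows of the 64-bit table.
INTERPRETER_SECONDS = {
    "change": 5.42,
    "hanoi": 8.07,
    "mandelbrotFloat": 100.53,
    "matrixMultiply": 10.18,
    "miniWalrus": 2.63,
    "redBlack": 0.01,
}


def classify_runtimes(times, low=4.0, high=6.0, suspect_below=0.1):
    """Group test names by how their run time compares to the target window.

    low/high come from the 4-6s range in the comment; suspect_below is an
    assumed cutoff for runs so short that the test probably did not execute.
    """
    report = {"suspect": [], "fast": [], "ok": [], "slow": []}
    for name, seconds in times.items():
        if seconds < suspect_below:
            report["suspect"].append(name)
        elif seconds < low:
            report["fast"].append(name)
        elif seconds <= high:
            report["ok"].append(name)
        else:
            report["slow"].append(name)
    return report


if __name__ == "__main__":
    for bucket, names in classify_runtimes(INTERPRETER_SECONDS).items():
        print(f"{bucket:>8}: {', '.join(names) or '-'}")
```

With these inputs redBlack lands in the "suspect" bucket and mandelbrotFloat in "slow", which matches the two problems called out in the comment.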

zherczeg commented 1 month ago

32-bit results (needs #252):

        Test case        |     interpreter  |    baseline_jit  |    regalloc_jit  |
-------------------------+------------------+------------------+------------------+
                 change  |           7.46s  |   2.95s (2.53x)  |   2.45s (3.05x)  |
              factorial  |           8.40s  |   1.50s (5.61x)  |  0.64s (13.21x)  |
               fannkuch  |           7.11s  |   1.68s (4.23x)  |   0.75s (9.46x)  |
              fibonacci  |           8.69s  |   5.18s (1.68x)  |   5.12s (1.70x)  |
                gregory  |           9.43s  |   3.61s (2.61x)  |   2.61s (3.61x)  |
                  hanoi  |          10.10s  |   3.05s (3.31x)  |   2.42s (4.17x)  |
               heapsort  |           6.57s  |   1.60s (4.11x)  |   1.00s (6.60x)  |
                huffman  |           5.10s  |   1.09s (4.68x)  |   0.58s (8.74x)  |
            kNucleotide  |           6.05s  |   1.61s (3.76x)  |  0.38s (16.10x)  |
        mandelbrotFloat  |          97.47s  |  10.95s (8.90x)  |  3.74s (26.09x)  |
       mandelbrotDouble  |          14.93s  |   7.23s (2.06x)  |   5.10s (2.92x)  |
         matrixMultiply  |          11.98s  |   1.91s (6.26x)  |  0.87s (13.84x)  |
             miniWalrus  |           3.17s  |   0.46s (6.86x)  |   0.39s (8.07x)  |
                  nbody  |           6.73s  |   1.44s (4.67x)  |   0.75s (8.98x)  |
                nqueens  |           7.94s  |   1.33s (5.99x)  |   0.87s (9.09x)  |
                  prime  |           6.33s  |   1.76s (3.59x)  |   0.97s (6.54x)  |
              quickSort  |           6.96s  |   1.30s (5.36x)  |  0.67s (10.41x)  |
               redBlack  |           0.01s  |   0.01s (0.67x)  |   0.02s (0.64x)  |
                    rsa  |           7.62s  |   4.72s (1.62x)  |   4.40s (1.73x)  |
               salesman  |           7.55s  |   1.31s (5.77x)  |   0.89s (8.51x)  |
    simdMandelbrotFloat  |           7.65s  |   1.26s (6.06x)  |  0.30s (25.41x)  |
   simdMandelbrotDouble  |          11.30s  |  1.05s (10.81x)  |  0.55s (20.58x)  |
              simdNbody  |           5.83s  |   1.23s (4.75x)  |  0.45s (12.80x)  |
     simdMatrixMultiply  |           7.75s  |   1.58s (4.90x)  |  0.51s (15.31x)  |
              ticTacToe  |           7.59s  |   1.71s (4.43x)  |   1.67s (4.54x)  |
-------------------------+------------------+------------------+------------------+
     Average speedup     |                  |           4.61x  |           9.68x  |
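For readers who want to cross-check the summary row: the "Average speedup" values in the 32-bit table are consistent with a plain arithmetic mean of the per-test speedup ratios, redBlack included. This is an observation from the printed numbers, not a statement about how the benchmark script computes the row. The sketch below uses hypothetical helper names. The per-test ratios in the table appear to be derived from unrounded times, so recomputing them from the displayed seconds can differ in the last digit.

```python
from statistics import mean

# Speedup ratios copied from the 32-bit table, in row order.
BASELINE_JIT = [
    2.53, 5.61, 4.23, 1.68, 2.61, 3.31, 4.11, 4.68, 3.76, 8.90,
    2.06, 6.26, 6.86, 4.67, 5.99, 3.59, 5.36, 0.67, 1.62, 5.77,
    6.06, 10.81, 4.75, 4.90, 4.43,
]
REGALLOC_JIT = [
    3.05, 13.21, 9.46, 1.70, 3.61, 4.17, 6.60, 8.74, 16.10, 26.09,
    2.92, 13.84, 8.07, 8.98, 9.09, 6.54, 10.41, 0.64, 1.73, 8.51,
    25.41, 20.58, 12.80, 15.31, 4.54,
]


def speedup(interpreter_seconds, jit_seconds):
    """Ratio of interpreter time to JIT time; values above 1 mean the JIT is faster."""
    return interpreter_seconds / jit_seconds


def average_speedup(ratios):
    """Arithmetic mean of per-test speedups, rounded to two decimals."""
    return round(mean(ratios), 2)


if __name__ == "__main__":
    print(f"baseline_jit: {average_speedup(BASELINE_JIT):.2f}x")
    print(f"regalloc_jit: {average_speedup(REGALLOC_JIT):.2f}x")
    # mandelbrotFloat row, recomputed from the displayed (rounded) times.
    print(f"mandelbrotFloat baseline: {speedup(97.47, 10.95):.2f}x")
```

An arithmetic mean is sensitive to outliers such as mandelbrotFloat; a geometric mean is a common alternative for averaging ratios, but it would not reproduce the figures printed in the table.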

vorosl commented 1 month ago

It depends on https://github.com/Samsung/walrus/pull/255

clover2123 commented 3 weeks ago

BTW, who is the author, @vorosl or @vlacko0930?

vorosl commented 3 weeks ago

> BTW, who is the author, @vorosl or @vlacko0930?

@vlacko0930 is my private account, but unfortunately my git client mixes the two up. :( I think I have fixed it now.

vorosl commented 2 weeks ago

Commit https://github.com/Samsung/walrus/commit/31ec7d68ea845029cf54ba86af95d812a25e5ed5 causes the problem in simdNbody.

vorosl commented 2 weeks ago

> Commit 31ec7d6 causes the problem in simdNbody.

It is fixed in https://github.com/Samsung/walrus/pull/266/