bytecodealliance / wasm-micro-runtime

WebAssembly Micro Runtime (WAMR)
Apache License 2.0
4.84k stars 618 forks

[Fast Interpreter] Unstable execution time when running stable code #2171

Open hungryzzz opened 1 year ago

hungryzzz commented 1 year ago

Description

Hi, I ran the following attached case in WAMR in fast-interpreter mode and found that the execution time is unstable. The execution time of this case should be stable because the code has no randomness: the main thing it does is a matrix operation, C := alpha*A*B + beta*C, and the array initializations are all fixed.

#include <stdio.h>
#include <stdlib.h> /* for malloc */
#include <polybench.h>
#define M 1024

int alpha = 32412;
int beta = 2123;

// init array
void init_array(double A[][M], double B[][M], double C[][M]) {
  for (int i = 0; i < M; ++i)
    for (int j = 0; j < M; ++j) {
      A[i][j] = ((double)i * j) / M;
      B[i][j] = ((double)i * (j+1)) / M;
      C[i][j] = ((double)i * (j+2)) / M;
    }
}

/* C := alpha*A*B + beta*C */
void compute_array(double A[][M], double B[][M], double C[][M]) {
  for (int i = 0; i < M; ++i)
    for (int j = 0; j < M; ++j) {
      C[i][j] *= beta;
      for (int k = 0; k < M; ++k)
        C[i][j] += A[i][k] * B[k][j];
    }
}

// prevent DCE
void print_array(double C[][M]) {
  double t = 0;
  for (int i = 0; i < M; ++i) {
    for (int j = 0; j < M; ++j)
      t += C[i][j];
    t /= M;
  }
  printf("%lf\n", t);
}

int main() {
  double (*A)[M] = (double(*) [M])malloc(sizeof(double) * M * M);
  double (*B)[M] = (double(*) [M])malloc(sizeof(double) * M * M);
  double (*C)[M] = (double(*) [M])malloc(sizeof(double) * M * M);
  init_array(A, B, C);
  polybench_start_instruments;
  compute_array(A, B, C);
  polybench_stop_instruments;
  polybench_print_instruments;
  print_array(C);
  return 0;
}

I ran the case in WAMR (fast-interpreter mode) and in Wasm3 30 times each. The execution time of WAMR (fast-interp) is unstable, ranging from 22.52s to 38.89s, while the execution time in Wasm3 is almost stable, around 15s-17s. The details are as follows:

| Run | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| WAMR (fast-interp) | 31.44 | 34.71 | 38.56 | 27.65 | 35.38 | 28.21 | 23.03 | 23.39 | 38.38 | 22.54 | 22.52 | 29.20 | 24.15 | 36.25 | 37.53 | 22.72 | 36.57 | 28.15 | 23.26 | 33.24 | 31.50 | 36.97 | 31.81 | 35.97 | 28.54 | 38.89 | 28.85 | 31.41 | 23.12 | 28.99 |
| Wasm3 | 16.44 | 16.14 | 15.97 | 15.72 | 15.64 | 15.22 | 16.20 | 16.34 | 16.51 | 15.95 | 15.78 | 16.03 | 15.87 | 16.15 | 16.18 | 15.87 | 15.38 | 16.51 | 16.24 | 16.04 | 16.30 | 17.16 | 16.20 | 17.15 | 15.80 | 16.77 | 16.51 | 16.40 | 16.28 | 17.30 |
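For reference, the spread of the two series can be quantified directly from the table (a quick sketch; the numbers are copied verbatim from above):

```python
# Quantify the spread of the 30 timings (seconds) reported in the table above.
from statistics import mean, pstdev

wamr = [31.44, 34.71, 38.56, 27.65, 35.38, 28.21, 23.03, 23.39, 38.38, 22.54,
        22.52, 29.20, 24.15, 36.25, 37.53, 22.72, 36.57, 28.15, 23.26, 33.24,
        31.50, 36.97, 31.81, 35.97, 28.54, 38.89, 28.85, 31.41, 23.12, 28.99]
wasm3 = [16.44, 16.14, 15.97, 15.72, 15.64, 15.22, 16.20, 16.34, 16.51, 15.95,
         15.78, 16.03, 15.87, 16.15, 16.18, 15.87, 15.38, 16.51, 16.24, 16.04,
         16.30, 17.16, 16.20, 17.15, 15.80, 16.77, 16.51, 16.40, 16.28, 17.30]

for name, xs in [("WAMR fast-interp", wamr), ("Wasm3", wasm3)]:
    print(f"{name}: min={min(xs):.2f}s max={max(xs):.2f}s "
          f"mean={mean(xs):.2f}s stdev={pstdev(xs):.2f}s")
```

The WAMR series spans roughly 16 seconds end to end, while the Wasm3 series spans about 2 seconds, which is what makes the instability visible.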

Hardware & OS

Emscripten

Wasm runtime version

Reproduce

  1. Compile the above C case using Emscripten:

     emcc -sENVIRONMENT=shell -O2 -s WASM=1 -s TOTAL_MEMORY=512MB -I ./case -DPOLYBENCH_TIME test.c -o test.wasm

  2. Execute the wasm file in different wasm runtimes and collect the execution time, all the compilation and execution options are default.
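Step 2 can be scripted. The sketch below times an arbitrary command N times; the engine invocations (e.g. `iwasm test.wasm` or `wasm3 test.wasm`) are assumptions, so they are parameterized as `$CMD` and the loop defaults to a no-op so it runs anywhere:

```shell
# Run $CMD $N times and print per-run wall-clock milliseconds.
# CMD defaults to 'true' (a no-op) so the loop can be tried without any engine;
# in practice set e.g. CMD='iwasm test.wasm' or CMD='wasm3 test.wasm'.
CMD=${CMD:-true}
N=${N:-3}
i=1
while [ "$i" -le "$N" ]; do
  start=$(date +%s%N)          # nanoseconds since epoch (GNU date)
  $CMD
  end=$(date +%s%N)
  echo "run $i: $(( (end - start) / 1000000 )) ms"
  i=$((i + 1))
done
```

Collecting 30 such runs per engine gives the kind of table shown above.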

case.zip

TianlongLiang commented 1 year ago

Hi, I tried your case, and it's stable on my end. Maybe you downloaded the iwasm binary from the release? The released iwasm is compiled with the multi-tier JIT running mode; in that case, the execution-time variation comes from the Fast JIT to LLVM JIT tier-up.

wenyongh commented 1 year ago

@hungryzzz The released iwasm binary was built with the classic interpreter, since fast-jit/llvm-jit use similar data structures and a similar module instance layout to the classic interpreter; so `iwasm --interp` runs in classic-interpreter mode, not fast-interpreter mode. To test the fast interpreter, please build iwasm manually:

cd product-mini/platforms/linux
mkdir build && cd build && cmake ..
make -j 4
# iwasm is generated under the current directory with fast-interp/AOT mode enabled
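For completeness, a variant of the build above that pins the interpreter mode explicitly rather than relying on defaults (assuming the same WAMR checkout; `WAMR_BUILD_FAST_INTERP` is WAMR's cmake switch for this):

```shell
# Build iwasm with the fast interpreter explicitly enabled
# (WAMR_BUILD_FAST_INTERP=1 is already the default on Linux; shown for clarity).
cd product-mini/platforms/linux
mkdir build && cd build
cmake .. -DWAMR_BUILD_FAST_INTERP=1
make -j 4
```

Setting the flag to 0 instead would produce a classic-interpreter build, which is a quick way to compare the two modes from the same source tree.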
hungryzzz commented 1 year ago

@wenyongh @TianlongLiang Hi, I built WAMR from source (commit 16ecd17a5b484dcc6e6c4f44c8dc13c54d216350), and the cmake options are the defaults (fast-interp mode).

TianlongLiang commented 1 year ago

That's strange. Do you have any background processes running? Also, you can upload your iwasm and I can give it a try.

hungryzzz commented 1 year ago

I've rebooted twice, and it remains the same. The attachment is my iwasm. Thank you!

build-fast-interp.zip

TianlongLiang commented 1 year ago

It's still stable on my end; I can't reproduce this behavior. I tried it in a loop:

Running iteration 1

real    0m17.310s
user    0m17.302s
sys 0m0.008s
Running iteration 2

real    0m17.036s
user    0m17.035s
sys 0m0.000s
Running iteration 3

real    0m17.399s
user    0m17.394s
sys 0m0.004s
Running iteration 4

real    0m18.278s
user    0m18.274s
sys 0m0.004s
Running iteration 5

real    0m17.058s
user    0m17.049s
sys 0m0.008s
Running iteration 6

real    0m18.479s
user    0m18.474s
sys 0m0.004s
Running iteration 7

real    0m17.487s
user    0m17.474s
sys 0m0.012s
Running iteration 8

real    0m17.522s
user    0m17.514s
sys 0m0.008s
Running iteration 9

real    0m17.718s
user    0m17.706s
sys 0m0.012s
Running iteration 10

real    0m18.631s
user    0m18.626s
sys 0m0.004s
hungryzzz commented 1 year ago

Hi @TianlongLiang, I've retried it on another server and the execution time is stable there. Maybe there is something wrong with the previous server; it's really strange, and I don't know why this happens. Thank you very much!

TianlongLiang commented 1 year ago

You're welcome. I'm glad it worked out in the end.