Open 07bcbce5-af49-4a2a-83e4-483b07325df6 opened 3 years ago
Given that you are trapping in polybench_alloc_data, are you sure that you are allowing memory to be allocated in the right way? I tried this out using emscripten and node.js.
I'm not too familiar with wasi's clang and wasmtime, but in emscripten you need to ensure you allocate a large enough wasm memory size (in emscripten the default is 16M IIRC, which is not enough for this example). If you set -s INITIAL_MEMORY=256MB or -s ALLOW_MEMORY_GROWTH=1 it does work, both with and without polly.
With bare clang, you can set the initial and max memory with the linker's --initial-memory and --max-memory flags (e.g. -Wl,--initial-memory=16777216 -Wl,--max-memory=2147483648 ) and you can set the stack size with the -z stack-size flag. Probably adding -v to your clang command line will show you what it uses by default.
I already tried to get wasm to compile and run, but it requires too many other things (backend, libc replacement, javascript, wasi, ...), and even then I would have no idea what else I could do beyond what you have already done.
The minimal code looks like it would trigger the gemm optimization, in which I just fixed a bug: llvm/llvm-bugzilla-archive#50557. Also try switching off that optimization (-mllvm -polly-pattern-matching-based-opts=0).
Does wasm limit the stack size? If yes, that might be the cause, because the above optimization allocates temporary arrays on the stack.
More information could also be helpful. What is the error with the reduced code (since there is no polybench_alloc_data)? What is the output with WASMTIME_BACKTRACE_DETAILS=1? -mllvm -debug-only=polly-ast? -mllvm -polly-codegen-add-debug-printing? -mllvm -polly-codegen-trace-scalars -mllvm -polly-codegen-trace-stmts? Can you reduce the matrix size as well? Can you reproduce it with the legacy pass manager? -polly-position=early? ....
Hi Michael,
I investigated this issue further and came up with the following "minimal" example that reproduces the problem:
#include <stdlib.h>
#define ni 800
#define nj 900
#define nk 1100
int main() {
  double (*A)[nk] = malloc(sizeof(double[ni][nk]));
  double (*B)[nj] = malloc(sizeof(double[nk][nj]));
  double (*D)[nj] = malloc(sizeof(double[ni][nj]));
  int i, j, k;
  for (i = 0; i < ni; i++) {
    for (j = 0; j < nj; j++) {
      for (k = 0; k < nk; ++k)
        D[i][j] += A[i][k] * B[k][j];
    }
  }
  return 0;
}
I am on LLVM 1fbb484ea45f, and compiling this example (file mm.c) with the following command produces a wasm file that I can execute:
clang-13 -O2 mm.c --sysroot wasi-sysroot --target=wasm32-wasi -o mm-wasm
However, when I add Polly, the resulting wasm file fails with a runtime error:
clang-13 -O2 -mllvm -polly mm.c --sysroot wasi-sysroot --target=wasm32-wasi -o mm-polly-wasm
Also, I have not been able to get the IR file for either of these two versions.
Would you mind trying to reproduce the issue?
Thanks,
-- Manu
Support for -polly-dump-before with the NPM has recently been added: https://github.com/llvm/llvm-project/commit/29bef8e4e3593ab37c4d3b6289dcdec961c3fb52
Unfortunately, because of how extension points work in the NPM, it is only possible with -polly-position=early. An alternative is the NPM's -print-before option (https://reviews.llvm.org/D87216). However, it only prints to dbgs() and only the function (not the entire module), both of which make getting a reproducer difficult.
I tried to compile and run 2mm as WebAssembly, but gave up because it was taking too much time.
Judging from the backtrace, this doesn't seem to be an issue in Polly. The crash occurs in polybench_alloc_data, which is implemented in polybench.c. It doesn't even contain a loop, so it is not subject to optimization by Polly.
Additional info that could help:
- Output files of -polly-dump-before/-polly-dump-after
- Output of -mllvm -debug-only=polly-detect,polly-scops,polly-opt-isl,polly-ast
- Selectively optimizing specific functions, e.g. -polly-only-func=init_array, -polly-only-func=kernel_2mm or -polly-only-func=polybench_alloc_data
Hi Michael,
I am investigating this issue further, but I am not able to run some Polly passes because of errors like the following:
error in backend: Option -polly-dump-before not supported with NPM
I am running Polly directly from the clang driver, using -mllvm to specify Polly passes. I tried switching to the old pass manager, but this changes the behavior of the programs output by Polly. Can you confirm that I should stay with the new pass manager? If so, how can I make passes such as polly-dump-before work with the new pass manager?
Thank you again for your help.
You can change the Product to "libraries" and the Component to "Backend: Webassembly". It doesn't hurt to add some developers and the backend's code owner to the CC list.
Alternatively, create a new bug entry for "Backend: Webassembly". Once the faulty component is identified, one entry can be marked as a duplicate of the other.
Try to reduce the reproducer, e.g. put everything into one file, remove functions/statements, and generate LLVM-IR before or after optimizations.
Hi Michael,
Thank you for your answer. As you suggest, I don't think the bug is on the Polly side, but rather in the Wasm backend.
I'll provide the additional information you suggested tomorrow.
Also, how should I modify this bug entry so that the Wasm backend people are aware of it?
-- Manu
Extended Description
I am compiling the polybench benchmarks to WebAssembly using clang. Activating Polly in this compilation path leads to a runtime error when executing the generated wasm code for the 2mm, 3mm and gemm benchmarks.
Here is the error; basically, the code tries to access memory outside of what has been allocated:
Note that the error occurs with all the WebAssembly runtimes that we tried.
Maybe the issue is not on the Polly side, but in the WebAssembly backend. I didn't find a way to also attach this report to another product so that the backend people would be notified too. Please let me know how to do that.
Finally, here is some information to reproduce the bug. Please let me know if I can provide anything else that may be useful for identifying what is going on.
llvm commit : 1fbb484ea45f85740b7450b175096e5fcff6ecd9
compilation command (include and link options omitted): clang -O3 -mllvm -polly 2mm.c -DLARGE_DATASET -DPOLYBENCH_TIME --target=wasm32-wasi -o 2mm-wasm-polly
Thank you for the support,
-- Manu