wangyongj1a opened 5 months ago
@llvm/issue-subscribers-mlir
Author: None (wangyongj1a)
I made a small change: I cast the result of the function call to f32 and made the function return the cast value, to satisfy mlir-cpu-runner. test.mlir:
module {
  func.func @func0(%arg0: tensor<19xi32>) -> i32 {
    %0 = arith.constant 2 : index
    %1 = tensor.extract %arg0[%0] : tensor<19xi32>
    vector.print %1 : i32
    return %1 : i32
  }
  func.func private @func1() -> f32 {
    %0 = arith.constant 10 : i32
    %1 = tensor.from_elements %0 : tensor<1xi32>
    %2 = tensor.from_elements %0, %0, %0, %0, %0, %0, %0, %0, %0, %0, %0, %0, %0, %0, %0, %0, %0, %0, %0 : tensor<19xi32>
    %3 = call @func0(%2) : (tensor<19xi32>) -> i32
    %4 = arith.sitofp %3 : i32 to f32
    return %4 : f32
  }
}
When I ran /data/tmp/v1029/llvm-project/build/bin/mlir-opt --one-shot-bufferize=dialect-filter=tensor,bufferization --convert-scf-to-cf --convert-cf-to-llvm --func-bufferize --convert-func-to-llvm --convert-index-to-llvm --convert-vector-to-llvm --one-shot-bufferize=dialect-filter=tensor --finalize-memref-to-llvm --convert-arith-to-llvm --reconcile-unrealized-casts test.mlir | /data/tmp/v1029/llvm-project/build/bin/mlir-cpu-runner -e func1 --shared-libs=/data/tmp/v1029/llvm-project/build/lib/libmlir_runner_utils.so,/data/tmp/v1029/llvm-project/build/lib/libmlir_c_runner_utils.so
on the program, I got the following output:
10
1.000000e+01
However, when I ran /data/tmp/v1029/llvm-project/build/bin/mlir-opt --one-shot-bufferize=dialect-filter=tensor,bufferization --buffer-deallocation --convert-scf-to-cf --convert-cf-to-llvm --func-bufferize --convert-func-to-llvm --convert-index-to-llvm --convert-vector-to-llvm --one-shot-bufferize=dialect-filter=tensor --finalize-memref-to-llvm --convert-arith-to-llvm --reconcile-unrealized-casts test.mlir | /data/tmp/v1029/llvm-project/build/bin/mlir-cpu-runner -e func1 --shared-libs=/data/tmp/v1029/llvm-project/build/lib/libmlir_runner_utils.so,/data/tmp/v1029/llvm-project/build/lib/libmlir_c_runner_utils.so
on the program, I got inconsistent results over multiple runs.
This problem seems to still exist. I'm not sure whether there is a bug in my program or whether incorrect usage of the above passes caused these results. My git version is e19a5fc6d306a81d181a9597a8b25c444c08d722.
I think the problem originates from this series of passes: --one-shot-bufferize=dialect-filter=tensor,bufferization --buffer-deallocation --func-bufferize. This results in (full IR here):
%0 = bufferization.to_tensor %alloc_0 : memref<19xi32>
%1 = bufferization.to_memref %0 : memref<19xi32>
memref.dealloc %alloc_0 : memref<19xi32>
%2 = call @func0(%1) : (memref<19xi32>) -> i32
which gets folded to (full IR here):
memref.dealloc %alloc : memref<19xi32>
%0 = call @func0(%alloc) : (memref<19xi32>) -> i32
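For comparison, here is a hand-written sketch (not the output of any pass) of the ordering I would expect to be memory-safe, with the deallocation only happening after the last use of %alloc:
// sketch: the buffer must outlive the call that reads it
%0 = call @func0(%alloc) : (memref<19xi32>) -> i32
memref.dealloc %alloc : memref<19xi32>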
I think it might be problematic to run deallocation before bufferizing all ops. But it could also be that to_tensor -> to_memref can't be folded here and should instead be replaced with an allocation and a copy, as sketched below. I'm not sure; hopefully this helps provide some context so others can help.
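To illustrate that second alternative, here is a hypothetical rewrite (sketch only, not what any existing fold or pass produces) where the to_tensor/to_memref pair is materialized as a fresh buffer plus a copy, so deallocating %alloc_0 no longer invalidates the value passed to @func0:
// hypothetical: copy the data into a fresh buffer before the original is freed
%copy = memref.alloc() : memref<19xi32>
memref.copy %alloc_0, %copy : memref<19xi32> to memref<19xi32>
memref.dealloc %alloc_0 : memref<19xi32>
%2 = call @func0(%copy) : (memref<19xi32>) -> i32
// the copy itself would then need to be freed after the call
memref.dealloc %copy : memref<19xi32>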
I have the following MLIR program (test.mlir):
When I tried to lower the program with
mlir-opt --tensor-bufferize --buffer-deallocation --convert-scf-to-cf --convert-cf-to-llvm --func-bufferize --convert-func-to-llvm --convert-index-to-llvm --convert-vector-to-llvm --finalize-memref-to-llvm --convert-arith-to-llvm --reconcile-unrealized-casts test.mlir
and executed the resulting executable, I got inconsistent results over multiple runs. I noticed that after the passes --tensor-bufferize --buffer-deallocation, the program was lowered to IR in which the memory of %alloc_0 is deallocated before the related tensor %0 is used as an argument in the function call. I also tried running --buffer-deallocation-simplification after --buffer-deallocation, but it did not seem to help in this case. I'm not sure whether there is a bug in my program or whether incorrect usage of --buffer-deallocation and --buffer-deallocation-simplification caused this error. My git version is 4c79d38f82e1f6fe8575d88d8c74f2f1806b19ce.