Open ngdymx opened 2 months ago
Hey @ngdymx, thank you for filing this issue! Your testing so far is a great help in isolating the problem.
I suspect that the lock pattern you tried only works for AIE1 architecture devices and not AIE2 architecture ones. Instead of using a single lock for your shared buffer, could you try using two locks: a producer lock initialized with 1 and a consumer lock initialized with 0? The use_lock calls will need to be updated as well.
Here is how that would look:
%lock_a04_prod = aie.lock(%tile_0_4, 0) {init = 1 : i32, sym_name = "lock_a04_prod"}
%lock_a04_cons = aie.lock(%tile_0_4, 1) {init = 0 : i32, sym_name = "lock_a04_cons"}
// on %tile_1_4:
aie.use_lock(%lock_a04_prod, AcquireGreaterEqual, 1)
// function call
aie.use_lock(%lock_a04_cons, Release, 1)
// on %tile_0_4:
aie.use_lock(%lock_a04_cons, AcquireGreaterEqual, 1)
// function call
aie.use_lock(%lock_a04_prod, Release, 1)
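To make the handshake easier to reason about, here is a small Python sketch that models the two locks as counting semaphores. It is only a toy model, not mlir-aie code: `SemaphoreLock`, `acquire_ge`, `release`, and `run_handshake` are hypothetical names, and the semantics assumed here are that AcquireGreaterEqual blocks until the lock value is at least the requested amount and then subtracts it, while Release adds to the value. Two threads stand in for the two tiles, and a one-element list stands in for the shared buffer.

```python
import threading


class SemaphoreLock:
    """Toy model of an AIE2-style counting lock (assumed semantics, not the hardware API)."""

    def __init__(self, init):
        self._value = init
        self._cond = threading.Condition()

    def acquire_ge(self, n):
        # Models AcquireGreaterEqual: wait until value >= n, then subtract n.
        with self._cond:
            self._cond.wait_for(lambda: self._value >= n)
            self._value -= n

    def release(self, n):
        # Models Release: add n and wake any waiting thread.
        with self._cond:
            self._value += n
            self._cond.notify_all()


def run_handshake(iterations=4):
    prod = SemaphoreLock(1)  # init = 1: the buffer starts out free to write
    cons = SemaphoreLock(0)  # init = 0: nothing to read yet
    buffer = [0]             # stand-in for the shared buffer
    seen = []

    def producer():
        for i in range(1, iterations + 1):
            prod.acquire_ge(1)
            buffer[0] = i            # stand-in for the kernel that writes
            cons.release(1)

    def consumer():
        for _ in range(iterations):
            cons.acquire_ge(1)
            seen.append(buffer[0])   # stand-in for the kernel that reads
            prod.release(1)

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


if __name__ == "__main__":
    print(run_handshake())
```

In this model the consumer sees every value the producer writes, in order, because each side can only proceed after the other has released the lock it is waiting on.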
After going over the mlir-tutorials, I realize that only AIE1 locks are documented. I will take an action item to update them.
In the meantime, you can double check your lock logic by studying the output of the objectfifo lowering pass, which you can call with:
aie-opt --aie-objectFifo-stateful-transform <name of your mlir file>
You can also comb through some of the existing MLIR examples in test/objectFifo-stateful-transform/.
It can be a little tedious, but on small examples it can be insightful.
Hi team,
I am working on using aie.use_lock to control the write and read sequence for sharing a buffer between two ComputeTiles. However, I've encountered an issue that I'm unsure how to resolve. Could you please assist me with this? I would greatly appreciate your help.
I have configured the ComputeTiles (1, 4) and (0, 4) as follows: the Add one function is implemented on tile (1, 4), while the passthrough function is placed on tile (0, 4). The attached figure illustrates the dataflow diagram for this setup. I mimicked the aie.mlir under the folder /mlir-aie/mlir_tutorials/tutorial-3 to use aie.use_lock.
The input data is:
The output I am expecting to see is as follows:
The output I got:
After obtaining this result, I modified the passthrough function on tile (0, 4) to an add_i function, where i is the index parameter of the for loop and ranges from 1 to 4 with step 1. This change was made to test how many times tile (0, 4) runs.
The result I got:
I have confirmed that tile (0, 4) runs 4 times, which matches our expectations.
Then, I modified the add one function on tile (1, 4) to an add_i function to test how many times tile (1, 4) runs.
The result I got:
In this case, tile (1, 4) appears to run only once. However, since tile (0, 4) operates successfully, the aie.use_lock acquire and release operations on tile (1, 4) should also be functioning correctly. This suggests that the lock acquire/release on tile (1, 4) is actually executed four times, but the function (kernel) is only called once. Would you be able to help me look into this matter? I would greatly appreciate it.
The following is the main code:
Function code:
aie.mlir
Host: