Closed by lstrz 6 years ago
I've put together a small example that reproduces the weird behavior: hwme.zip
Output is:
a: 2 b: 3 c: 0 d: 10
a: 2 b: 2 c: 0 d: 10
a: 2 b: 3 c: 0 d: 10
a: 2 b: 3 c: 0 d: 10
a: 2 b: 3 c: 0 d: 4
a: 2 b: 3 c: 0 d: 10
a: 2 b: 3 c: 0 d: 10
In the fifth row, the 4 should be a 10 instead.
Hi @lstrz, I will gladly check your example when possible. Some parts of the accelerator, especially the datapath, may contain a few bugs: this specific accelerator is provided as an example, so it's not thoroughly tested. I am more confident about the hwpe-stream and hwpe-ctrl IPs, although the versions used here could be (very slightly) outdated.
I'll let you know.
Sounds good! I wasn't sure if I was doing something wrong or not, so while at it, I posted the example for narrowing down potential bugs in IPs you care about, as I have little experience in digital design.
Ok I found and fixed the issue, which has always been there. One of the common cases (two simultaneous valid handshakes in different streams) was not properly checked in the datapath.
It was not exposed previously (some other change in the platform must have brought it out by slightly changing timings), but it actually affected my own test as well. It is now fixed in commit 087a7f3 of the HWME module, which you should get if you update the platform IPs with update-ips.
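To illustrate the class of bug described above (two streams presenting a valid handshake in the same cycle), here is a toy Python model. It is only a sketch and not the HWME RTL: the function names, the use of `None` for "valid low", and the if/elif structure are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# One entry per clock cycle: (stream A data, stream B data).
# None stands for "valid is low" on that stream in that cycle (assumption).
Cycle = Tuple[Optional[int], Optional[int]]

def consume_buggy(cycles: List[Cycle]) -> Tuple[List[int], List[int]]:
    """Hypothetical faulty consumer: accepts at most one stream per cycle."""
    a_seen: List[int] = []
    b_seen: List[int] = []
    for a, b in cycles:
        if a is not None:
            a_seen.append(a)
        elif b is not None:  # skipped whenever A is also valid
            b_seen.append(b)
    return a_seen, b_seen

def consume_fixed(cycles: List[Cycle]) -> Tuple[List[int], List[int]]:
    """Consumer that checks both handshakes independently in every cycle."""
    a_seen: List[int] = []
    b_seen: List[int] = []
    for a, b in cycles:
        if a is not None:
            a_seen.append(a)
        if b is not None:
            b_seen.append(b)
    return a_seen, b_seen

# In the third cycle both streams are valid at once.
cycles: List[Cycle] = [(1, None), (None, 10), (2, 20), (3, None)]
print(consume_buggy(cycles))  # ([1, 2, 3], [10])      B's beat 20 is lost
print(consume_fixed(cycles))  # ([1, 2, 3], [10, 20])
```

In this model a beat is lost only when both streams happen to be valid in the same cycle, so whether a given result is wrong depends on timing. That is in line with the remark that a change elsewhere in the platform exposed the problem.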
I also took the opportunity to update hwpe-stream and hwpe-ctrl. They are not involved in this bug, but the new versions should be more robust anyway, as I have recently been doing a lot of testing on them.
Perfect. Thank you very much!
I'm trying out the HWME in scalar product mode and have started with some very simple examples. I am using the default microcode supplied in the pulp-rt-example/accelerators/hwme example.
Job dependent parameters are:
All the results should be the same because 0 for vectstride means I'm always reading the same (first) vector from my a and b inputs. The a, b, c and d arrays are more than 300 bytes apart, so there's no overlap.
This is what the first 15 elements of a, b, c and d look like before and after running the HWME.
The scalar product of the two vectors is 5590: (4068 * 1968 - 3079 * 741) / 1024 = 5590. What I don't understand is where -2229 comes from.
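The expected value can be double-checked with a few lines of Python. This is only a sketch: it assumes the division by 1024 is an integer right shift by 10 bits applied to the accumulated sum, and the names are made up for illustration.

```python
# Products taken from the expression above.
P0 = 4068 * 1968
P1 = 3079 * 741
SHIFT = 10  # assumed: dividing by 1024 is a right shift by 10 bits

def expected_result(p0: int, p1: int, shift: int = SHIFT) -> int:
    """Accumulate the two products, then normalize with a right shift."""
    acc = p0 - p1
    return acc >> shift

print(P0 - P1)                  # 5724285
print(expected_result(P0, P1))  # 5590
```

The exact quotient is about 5590.12, so 5590 is the value after the fractional bits are dropped.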
Did I misunderstand any of the configuration options?