Open jessewjx opened 4 years ago
Hi jessewjx,
The version you are using does not really work on large matrices. For a 1024x1024 GEMM, it tries to implement a systolic array with 1024x1024 PEs, which cannot fit on a single FPGA.
We have a new version that tiles the matrix into smaller pieces and performs computation one by one -- e.g. it tiles a 1024x1024 matrix into a group of 32x32 matrices, passes these sub-matrices to a 32x32 systolic array, and finally sums the results up. I am still testing it and will upload the new version within a few days. Will keep you posted.
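To make the tiling idea above concrete, here is a minimal NumPy sketch. It is not the HeteroCL implementation: the function name `tiled_matmul` and its `tile` parameter are made up for illustration, and the block-level `@` simply stands in for one pass through a tile x tile systolic array.

```python
import numpy as np

def tiled_matmul(a, b, tile=32):
    """Multiply a (M x K) by b (K x N) one tile x tile block at a time.

    Hypothetical helper for illustration only; each inner block product
    stands in for one pass through a tile x tile systolic array.
    """
    m, k = a.shape
    k2, n = b.shape
    assert k == k2, "inner dimensions must match"
    c = np.zeros((m, n), dtype=np.result_type(a, b))
    for i in range(0, m, tile):          # row block of the output
        for j in range(0, n, tile):      # column block of the output
            for p in range(0, k, tile):  # block along the shared dimension
                # Partial product of two sub-matrices, accumulated into
                # the matching output block (the "sum the results" step).
                c[i:i + tile, j:j + tile] += (
                    a[i:i + tile, p:p + tile] @ b[p:p + tile, j:j + tile]
                )
    return c
```

Because NumPy slices are clipped at the array bounds, the same sketch also accepts a right-hand side with a single column, so a matrix-vector product is just the case where `b` has shape (K, 1).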
Please make sure v++ is on the PATH by running `which v++`. You should set up the XRT and Vitis environment before using HeteroCL's vitis backend.
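The same check can be scripted from Python before invoking the backend. This is only a sketch: the helper names are hypothetical, and it assumes the XRT and Vitis setup scripts export the `XILINX_XRT` and `XILINX_VITIS` environment variables, which may differ between installations.

```python
import os
import shutil

def find_tool(name="v++"):
    """Return the absolute path of `name` if it is on PATH, else None."""
    return shutil.which(name)

def check_vitis_env():
    """Collect what this process can see of the Vitis/XRT setup.

    Hypothetical helper; the two environment variable names are an
    assumption about what the setup scripts export.
    """
    return {
        "v++": find_tool("v++"),
        "XILINX_XRT": os.environ.get("XILINX_XRT"),
        "XILINX_VITIS": os.environ.get("XILINX_VITIS"),
    }

if __name__ == "__main__":
    for key, value in check_vitis_env().items():
        print(f"{key}: {value if value else 'NOT FOUND'}")
```

Any entry reported as not found suggests the environment was not sourced in the shell that launches Python.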
Hi Hecmay,
Thank you for the reply. We are looking forward to the systolic array based tiled matrix multiplication implementation! Also, we are wondering whether it is possible to achieve a 4096x4096 matrix-vector multiplication with the implementation you mentioned? Thanks!
Yeah. You can parameterize the algorithm as matrix A (4096x4096) multiplied with B (4096x1).
Great! That is very inspiring, and we are looking forward to the release of the new systolic array based matrix multiplication implementation!
Hi @jessewjx. Please see the systolic gemm here: https://github.com/Hecmay/heterocl/blob/fix/samples/systolic_array/systolic_array_module_stream.py
It may take a long time for this PR to be merged. You can simply pull and compile the `fix` branch to give it a try: https://github.com/Hecmay/heterocl/blob/fix/
Thank you @Hecmay and we will have a try with the systolic gemm!
Hi @Hecmay, I tried to run the systolic gemm on the Vitis platform. Probably due to some setup issue, the error "vitis platform information missing" occurred. Please note that the Vitis environment is already set up on the server.
I then generated Vivado HLS C++ code by setting the target to "vivado_hls" and ran the C simulation. I tested the systolic array implementation on a matrix-vector multiplication with matrix size 32x32 and systolic size 4.
The resulting vector does not agree with the simple matrix vector product. My hunch is that this could have happened because I was not executing the program correctly or that there is a minor bug in the current implementation.
Could you advise on this issue? It would be great if you could give some rough guidance on how to execute the program as well. Thanks for writing this nice piece of software; hoping to hear from you soon.
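For reference, a mismatch like the one described above can be narrowed down by comparing the simulated output with a plain NumPy product and listing the rows that disagree. This is a minimal sketch: `compare_with_reference` and the `y_hw` argument are hypothetical names, and how the simulation output gets loaded into an array is left open.

```python
import numpy as np

def compare_with_reference(a, x, y_hw, rtol=1e-5, atol=1e-8):
    """Compare a simulated result y_hw against the NumPy product a @ x.

    Hypothetical helper: returns the reference result and the flat
    indices of the entries that fall outside the given tolerances.
    """
    y_ref = a @ x
    y_hw = np.asarray(y_hw).reshape(y_ref.shape)
    bad = np.flatnonzero(~np.isclose(y_hw, y_ref, rtol=rtol, atol=atol))
    return y_ref, bad

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.random((32, 32))
    x = rng.random(32)
    y_hw = a @ x  # placeholder; replace with the C simulation output
    _, bad = compare_with_reference(a, x, y_hw)
    print("mismatching rows:", bad.tolist())
```

If only some rows differ, the pattern of indices (for example every row past the first systolic-size block) can hint at whether the local buffer dimensions match the input matrix.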
Hi! I am deploying HeteroCL on the Vitis platform and running the systolic array based matrix multiplication sample systolic_array_vitis.py. I came across three issues and am wondering whether you could offer some suggestions.
After changing the dimension of the local matrix to be consistent with the input matrix, the inconsistency is solved.
With compile="vivado_hls", the synthesis process can be finished; otherwise "vitis platform info missing" is reported.