[x] Fixed minor bugs (data type mismatches, HCL runtime API, etc.)
[x] Updated test cases to support floating-point dtypes
[x] Added test cases for Stencil FIFO connection
[x] Automatically generate a data (de)serializer in the host code to improve throughput when generating an AutoSA module with off-chip memory access. This also appears to help avoid the deadlock, though the exact reason is not yet clear.
[x] Added SA examples for Intel and Xilinx backends
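
For reviewers unfamiliar with the (de)serializer item above, here is a minimal sketch of the general idea, not the code AutoSA actually generates. It assumes the host reorders a row-major matrix into tile-major order so that each tile is contiguous in off-chip memory; the function names `serialize_tiles` and `deserialize_tiles` and the tiling scheme are hypothetical, chosen only for illustration.

```python
def serialize_tiles(matrix, tile_rows, tile_cols):
    """Flatten a 2D list into tile-major order (hypothetical layout).

    Each tile is emitted as one contiguous run, so a consumer can
    stream a whole tile with sequential reads.
    """
    rows, cols = len(matrix), len(matrix[0])
    assert rows % tile_rows == 0 and cols % tile_cols == 0
    flat = []
    for tr in range(0, rows, tile_rows):          # tile row origin
        for tc in range(0, cols, tile_cols):      # tile column origin
            for r in range(tr, tr + tile_rows):   # rows inside the tile
                flat.extend(matrix[r][tc:tc + tile_cols])
    return flat


def deserialize_tiles(flat, rows, cols, tile_rows, tile_cols):
    """Inverse of serialize_tiles: rebuild the row-major 2D list."""
    matrix = [[None] * cols for _ in range(rows)]
    idx = 0
    for tr in range(0, rows, tile_rows):
        for tc in range(0, cols, tile_cols):
            for r in range(tr, tr + tile_rows):
                matrix[r][tc:tc + tile_cols] = flat[idx:idx + tile_cols]
                idx += tile_cols
    return matrix


# Example: a 4x4 matrix split into 2x2 tiles.
example = [[4 * r + c for c in range(4)] for r in range(4)]
packed = serialize_tiles(example, 2, 2)
restored = deserialize_tiles(packed, 4, 4, 2, 2)
```

In this sketch `packed` holds the four 2x2 tiles back to back, and `restored` equals `example`. The actual layout depends on the data access pattern of the generated systolic array.
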
Extend AutoSA-generated code to support module composition
Right now the L3 IO module in AutoSA reads from and writes to off-chip memory. We want to extend it to connect to other on-chip hardware modules.
[x] Connected SA with HCL modules through the array interface.
[x] Connected SA with HCL modules through the FIFO interface.