cornell-zhang / heterocl

HeteroCL: A Multi-Paradigm Programming Infrastructure for Software-Defined Heterogeneous Computing
https://cornell-zhang.github.io/heterocl/
Apache License 2.0

[Backend][SystemC] Stratus HLS SystemC Codegen Backend Support #375

Closed zzzDavid closed 2 years ago

zzzDavid commented 3 years ago

Stratus HLS SystemC Backend Support

This PR implements the Stratus HLS SystemC backend, with a few other enhancements.

Common Changes

Usage Example: Binary Convolution

HeteroCL code

import heterocl as hcl
A = hcl.placeholder((1, 32, 14, 14), dtype=hcl.UInt(1), name="A")
B = hcl.placeholder((64, 32, 3, 3), dtype=hcl.UInt(1), name="B")
rc = hcl.reduce_axis(0, 32)
ry = hcl.reduce_axis(0, 3)
rx = hcl.reduce_axis(0, 3)
C = hcl.compute((1, 64, 12, 12),
    lambda nn, ff, yy, xx: hcl.sum(
        A[nn, rc, yy + ry, xx + rx] * B[ff, rc, ry, rx], axis=[rc, ry, rx]),
    dtype=hcl.UInt(8), name="C")
s = hcl.create_schedule([A, B, C])
s[C].reorder(C.axis[1], C.axis[0])
s[C].split(C.axis[1], factor=5)
code = hcl.build(s, target='shls')
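
For reference, the computation described by `hcl.compute` above can be sketched in plain Python. This is only an illustrative model, not part of the PR: the helper name `conv2d_reference` is hypothetical, and the 8-bit wrap-around on the output is an assumption based on the `(sc_uint<8>)sum` cast in the generated code.

```python
def conv2d_reference(A, B, out_bits=8):
    """Plain-Python model of C[nn, ff, yy, xx] = sum(A[nn, rc, yy+ry, xx+rx] * B[ff, rc, ry, rx]).

    A is a nested list shaped (batch, in_ch, in_h, in_w),
    B is a nested list shaped (out_ch, in_ch, k_h, k_w).
    The result is masked to out_bits bits (assumed truncating cast).
    """
    batch, in_ch = len(A), len(A[0])
    in_h, in_w = len(A[0][0]), len(A[0][0][0])
    out_ch, k_h, k_w = len(B), len(B[0][0]), len(B[0][0][0])
    out_h, out_w = in_h - k_h + 1, in_w - k_w + 1
    mask = (1 << out_bits) - 1

    C = [[[[0] * out_w for _ in range(out_h)]
          for _ in range(out_ch)] for _ in range(batch)]
    for nn in range(batch):
        for ff in range(out_ch):
            for yy in range(out_h):
                for xx in range(out_w):
                    acc = 0
                    # Reduction over the three reduce axes rc, ry, rx.
                    for rc in range(in_ch):
                        for ry in range(k_h):
                            for rx in range(k_w):
                                acc += A[nn][rc][yy + ry][xx + rx] * B[ff][rc][ry][rx]
                    C[nn][ff][yy][xx] = acc & mask
    return C


# Small all-ones example: 2 input channels and a 3x3 kernel give 2 * 3 * 3 = 18.
A_small = [[[[1] * 4 for _ in range(4)] for _ in range(2)]]
B_small = [[[[1] * 3 for _ in range(3)] for _ in range(2)] for _ in range(3)]
C_small = conv2d_reference(A_small, B_small)
print(C_small[0][0][0][0])  # 18
```

With the shapes used in this PR, an all-ones input would accumulate 32 * 3 * 3 = 288 per output element, which does not fit in 8 bits, so the result depends on how the cast to `UInt(8)` behaves.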

Generated Stratus HLS SystemC code

default_function.h

#include <cynw_p2p.h>
SC_MODULE(default_function)
{
  sc_in<bool> clk;
  sc_in<bool> rst;

  // port definitions
  cynw_p2p < sc_uint<1> >::in   A;
  cynw_p2p < sc_uint<1> >::in   B;
  cynw_p2p < sc_uint<8> >::out  C;

  sc_int<32> sum;

  void thread1();

  SC_CTOR( default_function )
  : clk( "clk" )
  , rst( "rst" )
  , C( "C" )
  , A( "A" )
  , B( "B" )
  {
    SC_CTHREAD(thread1, clk.pos());
    reset_signal_is(rst, 0);
    C.clk_rst(clk, rst);
    A.clk_rst(clk, rst);
    B.clk_rst(clk, rst);
  }

};

default_function.cc

#include "default_function.h"
void default_function::thread1()
{
  {
    HLS_DEFINE_PROTOCOL("reset");
    C.reset();
    A.reset();
    B.reset();
    wait();
  }
  while( true )
  {
    {
      HLS_DEFINE_PROTOCOL( "default_function_read_protocol" );
    }

    {
      C_ff_outer: for (sc_int<32> ff_outer = 0; ff_outer < 13; ++ff_outer) {
        C_ff_inner: for (sc_int<32> ff_inner = 0; ff_inner < 5; ++ff_inner) {
          C_nn: for (sc_int<32> nn = 0; nn < 1; ++nn) {
            C_yy: for (sc_int<32> yy = 0; yy < 12; ++yy) {
              C_xx: for (sc_int<32> xx = 0; xx < 12; ++xx) {
                if (ff_inner < (64 - (ff_outer * 5))) {
                  sum_x: for (sc_int<32> x = 0; x < 1; ++x) {
                    sum = 0;
                  }
                  C_ra0: for (sc_int<32> ra0 = 0; ra0 < 32; ++ra0) {
                    C_ra1: for (sc_int<32> ra1 = 0; ra1 < 3; ++ra1) {
                      C_ra2: for (sc_int<32> ra2 = 0; ra2 < 3; ++ra2) {
                        sum = ((sc_int<32>)(((sc_int<34>)(((sc_uint<2>)A.get()) * ((sc_uint<2>)B.get()))) + ((sc_int<34>)sum)));
                      }
                    }
                  }
                  C.put(((sc_uint<8>)sum));
                }
              }
            }
          }
        }
      }
    }

    {
      HLS_DEFINE_PROTOCOL( "default_function_write_protocol" );
    }
  }
}
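
The `ff_outer`/`ff_inner` loops and the `if (ff_inner < (64 - (ff_outer * 5)))` guard come from `s[C].split(C.axis[1], factor=5)`: 64 is not divisible by 5, so the outer loop runs 13 times and the guard masks the extra iterations. A minimal Python sketch of that structure follows. The helper `split_loop` is hypothetical, and reconstructing the original index as `outer * factor + inner` is an assumption, since the generated code streams results through `C.put` rather than indexing `C`.

```python
def split_loop(extent, factor):
    """Model a loop of `extent` iterations split by `factor`, with a boundary guard."""
    outer_extent = -(-extent // factor)  # ceiling division
    visited = []
    for outer in range(outer_extent):
        for inner in range(factor):
            # Same condition as the generated guard: skip iterations past the extent.
            if inner < extent - outer * factor:
                visited.append(outer * factor + inner)
    return outer_extent, visited


outer_extent, visited = split_loop(64, 5)
print(outer_extent)                 # 13
print(visited == list(range(64)))   # True
```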
hecmay commented 2 years ago

@zzzDavid make sure to format the cpp files using clang-format before pushing. The clang-format configuration file is located under the docs folder.

The local runner got terminated smh. I will restart it now. You may consider adding the CI/CD tests to this PR, though it is also fine to put them in another PR.

zzzDavid commented 2 years ago

@hecmay @chhzh123 I've added the general Stratus HLS backend tests to the CI tests, and now I would like to request a re-review for this PR. This PR adds basic SystemC backend support, where the input and output interfaces are implemented as serialized P2P ports.

I realize that for multi-dimensional array arguments, implementing them as memory-mapped interfaces (off-chip memories) is a better approach. But since this PR also contains some general fixes, such as nested hcl.select and the incorrectly removed cast node (issue https://github.com/cornell-zhang/heterocl/issues/386), I think we should merge this sooner, and I'll open another PR for SystemC backend enhancements.
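
To illustrate what the serialized P2P interface implies for the convolution example, here is a rough Python sketch that enumerates the order in which the generated loop nest calls `A.get()` and `B.get()`. It assumes that each `get()` consumes exactly one value, so the producer has to push values in this order; the generator `input_stream_order` and the index reconstruction `ff = ff_outer * factor + ff_inner` are hypothetical and not part of the backend.

```python
from itertools import islice


def input_stream_order(out_ch=64, factor=5, batch=1, out_h=12, out_w=12,
                       in_ch=32, k_h=3, k_w=3):
    """Yield (A index, B index) pairs in the order the generated loops read them."""
    outer_extent = -(-out_ch // factor)  # ceiling division, 13 for 64 / 5
    for ff_outer in range(outer_extent):
        for ff_inner in range(factor):
            for nn in range(batch):
                for yy in range(out_h):
                    for xx in range(out_w):
                        # Boundary guard from the split, as in the generated code.
                        if ff_inner < out_ch - ff_outer * factor:
                            ff = ff_outer * factor + ff_inner
                            for ra0 in range(in_ch):
                                for ra1 in range(k_h):
                                    for ra2 in range(k_w):
                                        yield ((nn, ra0, yy + ra1, xx + ra2),
                                               (ff, ra0, ra1, ra2))


# Peek at the first few reads without walking the whole stream.
for a_idx, b_idx in islice(input_stream_order(), 4):
    print(a_idx, b_idx)
```

Under these assumptions, the full loop nest performs 64 * 12 * 12 * 32 * 3 * 3 = 2,654,208 reads per port, while `A` holds only 1 * 32 * 14 * 14 = 6,272 elements, so every input element is streamed many times. That repetition is one way to see why a memory-mapped interface may suit multi-dimensional array arguments better.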

hecmay commented 2 years ago

Thanks @zzzDavid. I will do another round of code review.

But, as this PR has some general fixes such as nested hcl.select, issue https://github.com/cornell-zhang/heterocl/issues/386 incorrectly removed cast node

Ideally we need to break down these fixes into separate PRs for the sake of better code management. Do you think it is possible to separate these fixes from this PR? It is okay if these are tightly coupled and cannot be easily taken apart. Please ensure you add regression tests for those bugs in the issues folder.

hecmay commented 2 years ago

Idk, for some reason I cannot add comments to the review. GitHub keeps complaining that the diff is outdated when I insert a comment in a file.

zzzDavid commented 2 years ago

@hecmay @chhzh123 May I request another round of review?

zzzDavid commented 2 years ago

Since this PR contains the SystemC backend and many other fixes, we decided it is better to split it into a few PRs, each with one specific purpose. Therefore, I'm closing this PR, but the branch stays for future reference.