cornell-zhang / allo

Allo: A Programming Model for Composable Accelerator Design
https://cornell-zhang.github.io/allo
Apache License 2.0

[BUG] Vitis HLS backend results not written back to argument #154

Closed zzzDavid closed 1 month ago

zzzDavid commented 6 months ago

Describe the bug

The Vitis HLS backend generates incorrect code: results are computed in a local buffer but never written back to the kernel arguments.

To Reproduce

import allo
from allo.ir.types import int32

N = 256

def compute(
    x: int32[N],
    y: int32[N]
):
    for i in range(N):
        y[i] = x[i]

s = allo.customize(compute)
s.build(target="vitis_hls", mode="csim", project="test.prj")

Buggy output


//===------------------------------------------------------------*- C++ -*-===//
//
// Automatically generated file for High-level Synthesis (HLS).
//
//===----------------------------------------------------------------------===//
#include <algorithm>
#include <ap_axi_sdata.h>
#include <ap_fixed.h>
#include <ap_int.h>
#include <hls_math.h>
#include <hls_stream.h>
#include <math.h>
#include <stdint.h>
using namespace std;

extern "C" {

void compute(
  int32_t *v0,
  int32_t *v1
) { // L2
  #pragma HLS interface m_axi port=v0 offset=slave bundle=gmem0
  #pragma HLS interface m_axi port=v1 offset=slave bundle=gmem1
  int32_t buf0[256];    //
  l_S_buf0_buf0_l_0: for (int buf0_l_0 = 0; buf0_l_0 < 256; buf0_l_0++) {   //
  #pragma HLS pipeline II=1 rewind
    int32_t v4 = v0[buf0_l_0];  //
    buf0[buf0_l_0] = v4;    //
  }
  int32_t buf1[256];    //
  l_S_buf1_buf1_l_0: for (int buf1_l_0 = 0; buf1_l_0 < 256; buf1_l_0++) {   //
  #pragma HLS pipeline II=1 rewind
    int32_t v7 = v1[buf1_l_0];  //
    buf1[buf1_l_0] = v7;    //
  }
  l_S_i_0_i: for (int i = 0; i < 256; i++) {    // L3
    int32_t v9 = buf0[i];   // L4
    buf1[i] = v9;   // L5
  }
}

} // extern "C"

buf1 is the result buffer, but it is never written back to v1, so the entire kernel is treated as dead code.

Expected behavior

Result tensors passed in as arguments should be written back to those arguments (here, buf1 to v1) before the kernel returns.
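For illustration, here is a minimal sketch of what the expected output could look like for this reproducer. It is not the code Allo emits after the fix: the final store loop is an assumption about one way to do the write-back, the HLS pragmas and Vitis headers are left out so the sketch builds with a plain C++ compiler, and `writeback_ok` is a hypothetical host-side check that does not appear in the issue.

```cpp
// Sketch only: an assumed shape for the fixed kernel, not Allo's actual output.
#include <cassert>  // used by callers of writeback_ok(), see the note below
#include <cstdint>

extern "C" {

void compute(int32_t *v0, int32_t *v1) {
  // Load the input argument into a local buffer, as in the generated code.
  int32_t buf0[256];
  for (int i = 0; i < 256; i++) {
    buf0[i] = v0[i];
  }
  // Load the output argument into its local buffer.
  int32_t buf1[256];
  for (int i = 0; i < 256; i++) {
    buf1[i] = v1[i];
  }
  // Kernel body: y[i] = x[i].
  for (int i = 0; i < 256; i++) {
    buf1[i] = buf0[i];
  }
  // The step missing from the buggy output: store the result buffer back to v1.
  for (int i = 0; i < 256; i++) {
    v1[i] = buf1[i];
  }
}

} // extern "C"

// Hypothetical host-side check: returns true if compute() copied x into y.
bool writeback_ok() {
  int32_t x[256];
  int32_t y[256];
  for (int i = 0; i < 256; i++) {
    x[i] = i * 3 - 7;  // arbitrary non-zero test pattern
    y[i] = 0;
  }
  compute(x, y);
  for (int i = 0; i < 256; i++) {
    if (y[i] != x[i]) {
      return false;
    }
  }
  return true;
}
```

Calling `assert(writeback_ok());` from a test harness passes with the store loop in place. Without that loop, `y` stays all zeros and the check fails, which matches the symptom reported above.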

Additional context

This is related to the interface requirements of Vitis. The fix needs further consideration so that it does not affect the existing systolic array examples.