[WIP][Util] Auto-scheduling and auto-quantization for HeteroCL

This PR integrates auto-scheduling and auto-tuning utilities into the HeteroCL infrastructure. Auto-scheduling is driven by an analytical performance model; the initial version supports only automatic data placement and data reuse. Auto-tuning is built on UpTune and is used for automatic data quantization.

The example below performs auto-scheduling on a two-layer NN written in HeteroCL. The analysis function extracts potential data reuse between HeteroCL stages (e.g. the CONV and POOL layers in the example) and automatically applies the reuse_at primitive to the schedule.

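import heterocl as hcl
import hlib

def build_nn(input_image, weight_conv1, weight_fc1, lenet):
    # CONV -> tanh -> POOL
    conv1 = hlib.op.nn.conv2d_nchw(input_image, weight_conv1)
    tanh1 = hlib.op.math.tanh(conv1, "tanh1")
    pool1 = hlib.op.nn.max_pool(tanh1, kernel=(2,2), stride=(2,2))

    # flatten, then the fully connected layer
    flat = hlib.op.nn.flatten(pool1)
    fc1 = hlib.op.nn.dense(flat, weight_fc1)
    # write the softmax result into the output placeholder `lenet`
    return hlib.op.nn.softmax(lenet, fc1)

s = hcl.create_schedule([...inputs], build_nn)

# perform auto-scheduling targeting a specific platform
hcl.autosch(s, build_nn, target)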
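
To give a rough intuition for what "potential data reuse" can mean for a windowed stage, here is a minimal, self-contained sketch. It is not the analysis implemented in this PR: the function names, the stage names, and the kernel/stride values are all hypothetical, and the criterion (consecutive windows overlap when the kernel is larger than the stride) is just one simple signal an analytical model might use.

```python
def window_reuse(kernel, stride):
    """Fraction of a 1-D window that overlaps the previous window.

    Consecutive windows of size `kernel`, moved by `stride`, share
    max(kernel - stride, 0) elements along that axis.
    """
    overlap = max(kernel - stride, 0)
    return overlap / kernel


def reuse_candidates(stages):
    """Return the names of stages whose consecutive windows overlap.

    `stages` is a list of (name, kernel, stride) tuples; all values here
    are illustrative assumptions, not taken from the PR.
    """
    return [name for name, kernel, stride in stages
            if window_reuse(kernel, stride) > 0]


# Hypothetical stages: a 5-wide window with stride 1, and a 2-wide window with stride 2.
stages = [("stage_a", 5, 1), ("stage_b", 2, 2)]
candidates = reuse_candidates(stages)
```

Under this toy criterion only the first stage reads the same input elements more than once, so it is the one where a reuse buffer would save repeated memory accesses. A real model would also have to account for buffer size and the target's memory hierarchy.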
