Referring to Hongzheng's PR, we need to handle two cases:
1. dataflow optimization inside function body
```python
# Apply .to to place data
s.to(kernel.conv, kernel.out, depth=1)
s.to(image, kernel.out, depth=20)
# Mark function body as dataflow region
top = s.subgraph()[0]
s[top].dataflow()
```
2. dataflow optimization inside a loop
```python
loop = hcl.compute((10,32), lambda *args: pe(args), "loop")
# Mark 1st loop's body as dataflow region
s[loop].dataflow(axis=1)
```