Closed Siberia-yuan closed 1 year ago
Hi Yuan,
I will always reply. Don't hesitate to ask. The tool might not be perfect and my knowledge might not be enough but I will try my best to help.
Yes, you are right. Data-flow graph might not always be a DAG, especially when there is inter-iteration dependency (i.e., loop-carry dependency). To answer your question, when to sort the DFG, we need always pick the entry point for each recurrence loop. If you look at the IR generated from LLVM, there is always an ordering before we construct the corresponding DFG. So the sorting should refer to the original IR ordering whenever topological sorting hangs. Hope this helps.
FYI about other optimization techniques could be potentially applied: Every for
loop has at least one recurrence loop in the DFG since there are phi
and br
, but some papers (e.g., FPGA HLS and CGRA papers) eliminate such recurrence loop by fusing those nodes into single node, which (maybe) increases the complexity of hardware but easier for mapping.
another question, edges out from phi node should noted with color blue, right? I think it should be noted as a control flow edges instead of data flow
Conventionally, a control-flow edge points to the entry of a basic block. And all the other edges within a basic block are data-flow edges. The blue edges are just used to show the boundary of basic blocks. And the out edges of phi
are data-flow edges.
In addition, the phi
node does carry computation data and the predicate bit. Predication technique supported by the phi
node transforms the control-flow to data-flow.
still, I have another question. By looking into the dot graph of dfg generated by llvm pass, I found it hard to make the dfg completely compatible with the kernel source code // target FIR kernel for (i = 0; i < NTAPS; ++i) { sum += input[i] * coefficient[i]; } op node 1, looks a bit addtional. when i try to delete the op node 1 ,the graph seems completely compatible with the kernel code, the modified dfg look like this:
Hi Yuan,
The DFG is constructed based on the LLVM IR, it is what it is, I didn't change/modify anything at this step. What I did is just visualize it. Mapper just takes the DFG as input.
To answer your question, it is LLVM's decision to generate the two phi
nodes. The node 1
is used to accumulate the sum
and the node 0
is used to accumulate the i
. To better understand how the phi
node works, you can use some simple kernel, compile it using LLVM, and look into the generated IR (.bc
->.ll
).
Appreciate your patience and answer, it helped me a lot! No more questions, for now.....(sorry)
No problem at all. Feel free to post any question. I am happy to help.
According to the OpenCGRA's papper,the mappers takes the iterative modulo algorithm for mapping. The dfg nodes should aligned in an topological order, however edges out from Phi nodes contruct a Recurrence Loop, which makes sorting topological order impossible since the whole graph doesn't belong to a DAG. I would appreciate it very much if you can reply !