tancheng / CGRA-Mapper

An LLVM pass that can generate CDFG and map the target loops onto a parameterizable CGRA.
BSD 3-Clause "New" or "Revised" License
57 stars 8 forks source link

isMAC() fixed #15

Closed MeowMJ closed 9 months ago

MeowMJ commented 9 months ago

After combineMulAdd(), the opcodeName changes. So isMAC() detects new opcodeName to verify whether it is MAC and isMAC() won't always return true.

tancheng commented 9 months ago

Which fiction fuse mul with add into new muladd?

MeowMJ commented 9 months ago

Function combineMulAdd(). But it's not used in this mapper in fact.

tancheng commented 9 months ago

Function combineMulAdd(). But it's not used in this mapper in fact.

Can you please apply that combineMulAdd() on a fir kernel and show the two DFGs (before and after) here?

MeowMJ commented 9 months ago

Function combineMulAdd(). But it's not used in this mapper in fact.

Can you please apply that combineMulAdd() on a fir kernel and show the two DFGs (before and after) here?

Fir kernel doesn't contain a mul and add that can be combined, So I show inttCore kernel (apply combine("mul","sub") before and after) instead.

I notice that combineMulAdd() is not as useful as combine("mul","add") or combine("mul","sub") or others, because it only combines Critical dfgNode.

Note that (19)mul and (21)sub are combined in InttAfterFuse.png and become (19)mulsub.

InttBeforeFuse.png InttBeforeFuse InttAfterFuse.png InttAfterFuse

tancheng commented 9 months ago
MeowMJ commented 9 months ago
tancheng commented 9 months ago
MeowMJ commented 9 months ago

I will check the combine function later to answer your question, Sir. Right now, I have a busy time struggling with my relatives QAQ.

MeowMJ commented 9 months ago
  • Why 19 and 21 are combined rather than 20 and 21? Which pair has the higher priority to be combined in your opinion?
  • FIR kernel has a fmul followed by fadd. So that pair can also be combined, right?
  • The main purpose of this Pull Request (PR) is that when users enable MAC (or FMAC) in the GUI, all (or some of) the mul and add (or fmul and fadd) need to be combined.

Below is Fir before and after combineMulAdd()(the one I remove judgment of critical nodes). Note that (6)mul and (7)add are combined and become (6)fmuladd.

firBeforeFuse.png firBeforeFuse firAfterFuse.png firAfterFuse

tancheng commented 9 months ago
MeowMJ commented 9 months ago
tancheng commented 9 months ago

Nice.

MeowMJ commented 9 months ago

So the purpose of combineForIter() is to combine any kind of critical path, no matter it's add/compare/br/phi or add/phi (those two are the most common ones but there exit other critical paths)? In that way, I need to change the logic of combineForIter().

I also want to know if this task is relevant to the mail you sent to us (Help needed ...) before or not.

tancheng commented 9 months ago

Yes, we want to fuse all the nodes within the critical cycle. However, we also want to make it parameterizable, for example, some cases, we may not want to over fuse them. So we can start from the above two cases I guess?

Yes, this is related to the mail. Fusion is a critical step towards better mapping.

MeowMJ commented 9 months ago

OK, I will try to leave a parameter interface in combineForIter() and try to provide some tests for my modifications.

MeowMJ commented 9 months ago

I made it in a recursive way, sir. But I want to confirm some questions.

tancheng commented 9 months ago

Cool, great progress. Recursive way is very popular for pattern recognition. I am fine with either recursive or iterative. It would be perfect to also fuse the pattern in the non-critical paths. But if you think that is not trivial engineering effort, we can leave it to another PR.

MeowMJ commented 9 months ago

DFG::combineIter() and DFG::combineForIter() is made to fuse the pattern in DFGs regardless the order or size of the pattern. I update the codes and you can merge the PR if there is no question. Here is an example of DFG of fir after fusion where the pattern is "add -cmp-br-phi" and "phi-add". firAfterFuse

MeowMJ commented 9 months ago

combineForIter() is to accept fusion pattern that user provides and it will call combineIter(). combineIter() is a recursive function that fuse patterns.

Here is an example of code of fusion pattern "add -cmp-br-phi" and "phi-add".

    list<string>* combinePattern1 = new list<string>[4]; 
    combinePattern1 -> push_back("add");
    combinePattern1 -> push_back("icmp");
    combinePattern1 -> push_back("br");
    combinePattern1 -> push_back("phi");
    list<DFGNode*>* cPatternNodes1 = new list<DFGNode*>[4]; 
    combineForIter(combinePattern1, cPatternNodes1);

    list<string>* combinePattern2 = new list<string>[2]; 
    combinePattern2 -> push_back("phi");
    combinePattern2 -> push_back("fadd");
    list<DFGNode*>* cPatternNodes2 = new list<DFGNode*>[2]; 
    combineForIter(combinePattern2, cPatternNodes2);
MeowMJ commented 9 months ago

I update combineForIter() and combineIter() into one function and this is an example code of how to use this function. The result DFG is the same as before.

    string* combinePattern1 = new string[4];
    combinePattern1[0] = "add";
    combinePattern1[1] = "icmp";
    combinePattern1[2] = "br";
    combinePattern1[3] = "phi";
    list<DFGNode*>* cPatternNodes1 = new list<DFGNode*>[4]; 
    combineForIter(combinePattern1, 4, cPatternNodes1);

    string* combinePattern2 = new string[2]; 
    combinePattern2[0] = ("phi");
    combinePattern2[1] = ("fadd");
    list<DFGNode*>* cPatternNodes2 = new list<DFGNode*>[2]; 
    combineForIter(combinePattern2, 2, cPatternNodes2);
MeowMJ commented 9 months ago

I update combineForIter() with one parameter and fix the name issue. The result DFG is the same as before.

    list<string>* combinePattern1 = new list<string>[4]; 
    combinePattern1 -> push_back("add");
    combinePattern1 -> push_back("icmp");
    combinePattern1 -> push_back("br");
    combinePattern1 -> push_back("phi");
    combineForIter(combinePattern1);

    list<string>* combinePattern2 = new list<string>[2]; 
    combinePattern2 -> push_back("phi");
    combinePattern2 -> push_back("fadd");
    combineForIter(combinePattern2);