Compilation from C and optimization of sequences of bnes + jmp

drosenborg commented 1 year ago

Hi,

Just started exploring MIR and I find it very impressive!

I'm currently building simple filters from a higher level language that I translate into C and have come across case where I can't get mir to generate the code I ideally want. The C code looks like this with a rather long (about 100) sequence of if statements:

if (A == 5609) goto L498;
if (A == 6200) goto L498;
if (A == 6202) goto L498;
...

c2mir will turn this into:

L73:
        bnes    L71, u0_A, 5609
L70:
        jmp     L43
        jmp     L72
L71:
L72:
L77:
        bnes    L75, u0_A, 6200
L74:
        jmp     L43
        jmp     L76
L75:
L76:
L81:
        bnes    L79, u0_A, 6202
L78:
        jmp     L43
        jmp     L80
L79:
L80:
...

The problem here is that in my view it is reversing the logic from eq to ne and that requires it to insert an extra jmp per step. If I by hand restore it to mimic my original intent by replacing bnes with beqs, the generated filter runs about twice as fast:

       beqs     L43, u0_A, 5609
       beqs     L43, u0_A, 6200
       beqs     L43, u0_A, 6202
       ...

I discovered this "anomaly" when I out of curiosity compiled the C code with gcc and got double the performance when using the filter, and when I compared the generated machine code I realised that gcc had retained the fall-through style for the false branch by using just 'je' instead of 'jne + jmpq' as mir does.

So my question is if you know of a way that I can rearrange the C-code so that the result becomes closer to what I want, or if this is something that can be improved in c2mir.

Thanks

/David

drosenborg commented 1 year ago

Actually, it looks like if I reverse the conditional expression and nest the if statements, I get closer to what I want, just hoping deep nesting won't be a problem ...

   if (A != 5609)
      if (A != 6200)
         if (A != 6202)
   ...

vnmakarov commented 1 year ago

Thank you for reporting this.

Strange, I am having a bit different code for the first test:

        bnes    L0, hr0, 5609 # indexes: _,_,_
        jmp     L3 # indexes: _
L0 # indexes:
        bnes    L1, hr0, 6200 # indexes: _,_,_
        jmp     L2 # indexes: _
L1 # indexes:
        bnes    L2, hr0, 6202 # indexes: _,_,_ # dead: hr0
        jmp     L4 # indexes: _

L2 # indexes:
        mov     hr0, 2 # indexes: _,_
        jmp     L1 # indexes: _

L3 # indexes:
        mov     hr0, 3 # indexes: _,_
        jmp     L1 # indexes: _

L4 # indexes:

Still the code is bad. I'll look what can I do.

vnmakarov / mir

Compilation from C and optimization of sequences of bnes + jmp #315