DeadCodeProductions / dead

marker functions break data flow analysis #30

Open xukl opened 2 years ago

xukl commented 2 years ago

Hello, I noticed footnote 2 on page 5 of your paper, which says:

We note that the inserted optimization markers may impact how a compiler optimizes the instrumented code vs. the original uninstrumented code, but this does not affect the effectiveness of our technique.

The idea behind your technique is of course effective, but I believe inserting marker functions has a weakness: it can block some DCE opportunities, as in the following C code.

extern void DCEMarker1();
extern void DCEMarker2();
int g;
int main()
{
    g = 5;
    if (g == 5) {
        DCEMarker1();
        if (g != 5) {
            DCEMarker2();
        }
    }
}

Because DCEMarker1() is an opaque external call, the compiler cannot rule out that it modifies g, so the second if cannot be DCE'd. In other words, the inserted marker function breaks the data flow analysis.
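To make the point concrete, here is a minimal sketch of a marker definition the compiler has to assume is possible. The marker bodies, the marker2_hits counter, and run_example() are hypothetical and exist only for illustration; in the real setup the markers are extern declarations with no visible body.

```c
/* Hypothetical marker bodies, for illustration only. In the real setup
   the markers are extern and the compiler never sees a definition. */
int g;

static int marker2_hits;

/* A perfectly legal definition: the marker writes to the global g. */
void DCEMarker1(void) { g = 6; }

/* Counts how often the "dead" inner block is actually reached. */
void DCEMarker2(void) { marker2_hits++; }

int run_example(void)
{
    marker2_hits = 0;
    g = 5;
    if (g == 5) {
        DCEMarker1();          /* may change g behind the compiler's back */
        if (g != 5) {
            DCEMarker2();      /* reachable with the definition above */
        }
    }
    return marker2_hits;
}
```

With this definition the inner block is live, so a compiler that only sees the extern declaration has to keep it.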

So I think assigning magic numbers to a volatile variable may be a better marker than calling marker functions, since the effect of a local variable is confined to its own scope. Or, if we can tolerate going slightly outside the C standard, inline assembly is probably even better, since its output is easier to filter out of the assembly code than the code generated for a volatile variable.

#if TYPE == 1
#  define MARK1 DCEMarker1()
#  define MARK2 DCEMarker2()
#elif TYPE == 2
#  define MARK1 { volatile int _ = 0xbadbeef1; }
#  define MARK2 { volatile int _ = 0xbadbeef2; }
#elif TYPE == 3
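/* '$' makes the magic number an immediate rather than a memory address;
   %q0 names the 64-bit form of the register (x86-64 only). */
#  define MARK1 asm volatile ("movabs $0xbadbeef1, %q0" : "=r"(junk))
#  define MARK2 asm volatile ("movabs $0xbadbeef2, %q0" : "=r"(junk))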
#elif TYPE == 4
#  define MARK1 asm volatile ("# mark1" : )
#  define MARK2 asm volatile ("# mark2" : )
#else
#  define MARK1 /* empty */
#  define MARK2 /* empty */
#endif
extern void DCEMarker1();
extern void DCEMarker2();
int g, junk;
int main()
{
    g = 5;
    if (g == 5) {
        MARK1;
        if (g != 5) {
            MARK2;
        }
    }
}

The TYPE 3 solution uses the movabs instruction, so it is not portable across ISAs. TYPE 4 may be better, since it emits only a comment in the assembly code (I haven't investigated the assembly syntax of other ISAs, but I think comments should work similarly).
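As a rough sketch of how the TYPE 4 idea could be carried to other ISAs, the comment character can be chosen with predefined target macros. The ASM_COMMENT and MARK macros and the count_live_path() function are hypothetical names, the code assumes GNU-style inline asm (GCC or Clang), and the comment characters are those of the GNU assembler as far as I know; only x86-64 is the target discussed above.

```c
/* Assumption: GNU-style inline asm and the GNU assembler's comment syntax. */
#if defined(__aarch64__)
#  define ASM_COMMENT "//"
#elif defined(__arm__)
#  define ASM_COMMENT "@"
#else
#  define ASM_COMMENT "#"      /* e.g. x86 and RISC-V */
#endif

/* Emits only an assembler comment such as "# DCEMarker1"; no instruction. */
#define MARK(n) __asm__ volatile (ASM_COMMENT " DCEMarker" #n : )

int g;

/* Returns the depth of the deepest block that was executed. */
int count_live_path(void)
{
    int reached = 0;
    g = 5;
    if (g == 5) {
        MARK(1);
        reached = 1;
        if (g != 5) {          /* no memory clobber, so g is still known */
            MARK(2);
            reached = 2;
        }
    }
    return reached;
}
```

Because the asm statement declares no memory clobber, the compiler is not required to assume that g changed. Compiling with -O2 -S and searching the output for DCEMarker2 would then show whether the inner block was eliminated.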

thetheodor commented 2 years ago

@xukl thanks for the suggestions!

We have recently been discussing the exact issue you describe. We'll try these alternative markers and test whether they work better.