sampsyo / quala

custom type systems for Clang
MIT License

Annotation info doesn't match with LLVM IR output #11

Closed EmmetZC closed 8 years ago

EmmetZC commented 8 years ago

Question One:

Source:

// I know this declaration should never exist, because of the meaning of NULLABLE.
// This is just for a test of function hasAnnotation()
NULLABLE int * foo = 0; 
*foo = 4;

IR:

%foo = alloca i32*, align 4
store i32* null, i32** %foo, align 4
%0 = load i32*, i32** %foo, align 4
store i32 4, i32* %0, align 4, !tyann !1
...
!1 = !{!"nullable", i8 0}

The IR shows that the instruction store i32 4, i32* %0, align 4, !tyann !1 carries tyann metadata. But when I call hasAnnotation() on this instruction, which does

MDNode *MD = I->getMetadata("tyann");

the value of MD is nullptr, which means it cannot find the metadata for that instruction. Why does the annotation info get lost?

Besides, I'm a little confused about the i8 0 in !1 = !{!"nullable", i8 0}. I thought i8 0 should indicate the top level of the annotation and i8 1 the lower level, but in practice it seems to be the reverse.

Question Two:

In the Nullable Codegen.cpp test, if we generate LLVM IR for test/codegen.c, we get this:

%0 = load i32*, i32** %foo, align 4, !tyann !2
%isnull = icmp eq i32* %0, null
call void @qualaNullCheck(i1 %isnull)
%1 = load i32, i32* %0, align 4

The NullChecks pass calls addCheck() before annotated store/load instructions. Here the pass evidently adds a check before %1 = load i32, i32* %0, align 4, which is not shown as annotated. Does this mean the annotation info is not properly shown in the LLVM IR?

Am I misunderstanding something?

sampsyo commented 8 years ago

I think you've stumped me on both counts!

On question 1: I don't see an obvious reason why getMetadata wouldn't work in this case. Without digging deeper, I'm mystified.

On question 2: I also don't exactly see why (a) that instruction would not be annotated, and (b) the pass would insert a check despite the missing annotation. Perhaps the annotation is getting lost at a later stage. Can you inspect the IR before instrumentation?

EmmetZC commented 8 years ago

Unfortunately, your guess for question 2 seems wrong. To test this, I compiled test/codegen.c manually in two steps:

  1. Compile it to LLVM bitcode and LLVM IR without the NullChecks pass. Both the LLVM IR and the disassembled bitcode show that the instruction is not annotated.
  2. Manually load the NullChecks pass on that bitcode using opt. The pass finds the correct instruction to addCheck().

This test implies that the annotation info may get lost while printing LLVM IR. I need to check how the instruction is dumped.

sampsyo commented 8 years ago

Very strange indeed! I've really never encountered a problem where the in-memory IR had data that was silently lost when dumping it to text or bitcode on disk.

EmmetZC commented 8 years ago

Hey, I think I've found the "culprit" for this mismatch. As you know, I'm new to this project, so all my tests are based on your example, nullness. Check out NullChecks.cpp:

Value *Ptr = nullptr;
if (auto *LI = dyn_cast<LoadInst>(&I)) {
  Ptr = LI->getPointerOperand();  // line 34
} else if (auto *SI = dyn_cast<StoreInst>(&I)) {
  Ptr = SI->getPointerOperand(); // line 36
}
...
...
if (Ptr) {
  if (AI.hasAnnotation(Ptr, "nullable")) { // line 42
    addCheck(*Ptr, I);
    modified = true;
  }
}

Lines 34 and 36 show that Ptr holds the pointer operand of the LoadInst/StoreInst, not the instruction itself. When the declaration is annotated at the lower level, as in Question 1, that operand is not annotated. So (a) the LLVM IR metadata shows the annotation info of the instruction, while (b) the hasAnnotation() call on line 42 checks the annotation info of the instruction's pointer operand. This explains Question 2 as well.

sampsyo commented 8 years ago

Aha, of course—thanks for sorting out my own thinking for me. :flushed: The point, of course, is to prevent loads through null pointers, not to prevent loads that produce null results.