correctcomputation / checkedc-clang

This is the primary development repository for 3C, a tool for automatically converting legacy C code to the Checked C extension of C, which aims to enforce spatial memory safety. This repository is a fork of Checked C's.
14 stars 5 forks source link

Unneeded Cast Placement #134

Closed aaronjeline closed 4 years ago

aaronjeline commented 4 years ago

Consider the following

void vsf_exit(const char *p_text);
void foo(void) {
  vsf_exit("this is a string");
}

Is converted into:

void vsf_exit(const char *p_text);
void foo(void) _Checked{
  vsf_exit(((const char *)"this is a string"));
}

I believe that since the string literal is turned into a checked type, CastPlacementVisitor decides the cast is necessary.

mwhicks1 commented 4 years ago

@jackastner I was thinking that adding a lower bound on string literals might fix this problem. Can you investigate?

john-h-kastner commented 4 years ago

I tried that when I was working on that other issue, and just changing the bound direction isn't enough to change anything.

With the lower bound, we get this output:

>>vsf_exit:p_text_0 = [Checked=WILD, PtrType=PTR]
str_1 = [Checked=WILD, PtrType=PTR]

Without it we get:

>>vsf_exit:p_text_0 = [Checked=WILD, PtrType=PTR]
str_1 = [Checked=WILD, PtrType=NTARR]

so using a lower bound, the string variable is made PTR rather NTARR, but this doesn't effect cast placement.

mwhicks1 commented 4 years ago

Maybe the issue is that you have to introduce a fresh VarAtom, and bound that, in the same way that we do for & ?

I am guessing that the argument to vsf_exit has a char * argument, but the string literal is being treated as checked, i.e., _Nt_Array_Ptr. Then the cast placement thing is kicking in. Maybe by using a fresh VarAtom for each string literal creates the possibility that its solution is WILD and so the cast visitor doesn't think it needs to add a cast?

john-h-kastner commented 4 years ago

I think I've finally gotten to the bottom of this. The problem comes from this call to getExprConstraintVars by CastPlacementVisitor where A is one of the arguments for a function call. https://github.com/plum-umd/checkedc-clang/blob/b4b3171f1acd130ddaa6231e70caa3dd4c0ffaf8/clang/lib/CConv/CastPlacement.cpp#L60-L61

This recomputes the constraint variables for the arguments to a function when deciding if there should be a cast. The problem is that getExprConstraintVars creates new constraint variables, so the constraint variables returned here are not the same variables that were used when solving the checked constraint graph.

Later on, the visitor checks if the constraint variable solutions for the arguments are checked or wild (a checked argument for wild parameter triggers cast insertion). Because the constraint variable was not used during solving, it still has it's initial value (checked), and the cast is always inserted if the parameter is wild.

To fix this, we need to avoid recomputing the constraint variables during a second call to getExprConstraintVars. This is already partially done with ProgramInfo storing constraints for variable declarations. I think this needs to be extended to store every constraint variable that is generated.

mwhicks1 commented 4 years ago

This makes a lot of sense. I thought that at some level this was already happening in getExprConstraintVars but maybe only when it's generating "temporary" constraint variables? I thought that it was looking in some map to see if a variable had been generated before.

john-h-kastner commented 4 years ago

There's a map that remembers constraints generated for variables, so variables are handled correctly. In fact, adding an intermediate variable gets rid of the extra cast.

void foo(char *bar);
void baz() {
  char *buz = "test";
  foo(buz);
}

Strings (and a few other cases) generate constraint variables without touching this map.

mwhicks1 commented 4 years ago

Are you referring to the Variables map in ProgramInfo or something else? Assuming "something else," then you are suggesting that that map should be used for this purpose too?

john-h-kastner commented 4 years ago

I was referring to Variables, but there is a second map ExprTmpConstraints in ConstraintResolver.cpp that might actually be more appropriate. I'm not quite clear on what that map is doing, but it seems to be what you were talking about with temporary constraints.

mwhicks1 commented 4 years ago

@Machiry added that one. I think it was meant for what you are describing and perhaps it needs to be extended for holding literals too.