panda-re / lava

LAVA: Large-scale Automated Vulnerability Addition

Lava4Java - support for other languages? #4

Open kbroughton opened 6 years ago

kbroughton commented 6 years ago

Could someone describe at a high level what would be required to support Java or C#? Are you aware of anyone working on this?

moyix commented 6 years ago

I don't know of anyone working on this. At a high level, the general approach (figure out where attacker-controlled data reaches in the program, then use that data to create bugs) should be applicable to Java or C#, but all of the machinery here is geared toward C/C++.

The basic ingredients are:

  1. A dynamic taint analysis system capable of tracking many labels (and sets of labels), so that it's possible to trace a use of some data back to specific file bytes.
  2. An analysis that identifies places in the code where that data could be used to cause bugs. In C/C++, we focus on uses of pointers because we want to cause memory safety bugs. In something like C# or Java you would want to identify other types of bugs that can be created.
  3. Some system for automatically rewriting source code to actually inject the bugs. In particular, you need to add code at the site of the DUA (see the paper for details on what that is) that copies the data somewhere, and then code at the attack point to retrieve that data, test it against some condition, and use it to trigger the bug. In the initial implementation of LAVA we used global variables to create the dataflow between the DUA and attack point, but these aren't available in Java/C#, so you'd have to devise some other means of getting the data from the DUA to the attack point. We have also (after the initial paper) implemented this by adding an extra parameter to functions.
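
To make ingredient 3 concrete, here is a rough sketch of what the extra-parameter variant might look like once it has been injected into Java code. This is not LAVA output and none of these names come from LAVA: `LavaData`, `LavaSketch`, `MAGIC`, `parseHeader`, and `readEntry` are all hypothetical, and the out-of-bounds array index is just one assumed stand-in for the "other types of bugs" mentioned in ingredient 2.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical carrier object, threaded through calls as the extra parameter.
final class LavaData {
    int slot; // holds the value copied out at the DUA site
}

class LavaSketch {
    // Hypothetical trigger value; little-endian bytes 0x61 0x76 0x61 0x6c.
    static final int MAGIC = 0x6c617661;

    // DUA site: the first four input bytes are attacker-controlled here.
    // Assumes input.length >= 4.
    static int parseHeader(byte[] input, LavaData lava) {
        int tag = ByteBuffer.wrap(input, 0, 4)
                            .order(ByteOrder.LITTLE_ENDIAN)
                            .getInt();
        lava.slot = tag;          // injected: copy the data somewhere
        return input.length - 4;  // original behaviour: payload length
    }

    // Attack point: retrieve the data, test it, and use it to trigger the bug.
    static int readEntry(int[] table, int index, LavaData lava) {
        // injected: push the index out of range only when the condition holds
        int corrupted = index + (lava.slot == MAGIC ? table.length : 0);
        return table[corrupted];
    }

    static int process(byte[] input) {
        LavaData lava = new LavaData();
        int payloadLen = parseHeader(input, lava);
        int[] table = {10, 20, 30, 40};
        return readEntry(table, payloadLen % table.length, lava);
    }

    public static void main(String[] args) {
        byte[] benign = {1, 2, 3, 4, 5, 6};
        System.out.println(process(benign));

        byte[] trigger = {0x61, 0x76, 0x61, 0x6c, 0, 0};
        try {
            process(trigger);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("injected bug triggered");
        }
    }
}
```

With the benign input the injected code has no visible effect and `process` returns the normal table entry; only an input whose first four bytes match `MAGIC` makes `readEntry` throw. A real tool would still need the taint analysis from ingredient 1 to pick which bytes to copy and where.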