wala / WALA

T.J. Watson Libraries for Analysis, with frontends for Java, Android, and JavaScript, and many common static program analyses
http://github.com/wala/WALA
Eclipse Public License 2.0

Clarification Needed on Nodes in JavaScript Call Graph Using Field-Based Algorithm #1375

Open flyboss opened 8 months ago

flyboss commented 8 months ago

Issue Description:

I am currently facing a challenge while generating a call graph for a JavaScript file using WALA (the T.J. Watson Libraries for Analysis). In the generated call graph, I observe the inclusion of certain nodes that do not correspond to call sites, and these nodes do not have any callees associated with them.

Test JavaScript Code:

var a = [];
var b = {};

Java Code:

import com.ibm.wala.cast.js.callgraph.fieldbased.FieldBasedCallGraphBuilder;
import com.ibm.wala.cast.js.translator.CAstRhinoTranslatorFactory;
import com.ibm.wala.cast.js.util.CallGraph2JSON;
import com.ibm.wala.cast.js.util.FieldBasedCGUtil;
import com.ibm.wala.ipa.callgraph.CallGraph;
import com.ibm.wala.util.CancelException;
import com.ibm.wala.util.NullProgressMonitor;
import com.ibm.wala.util.WalaException;

import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

public class tmp {
    public static void main(String[] args) throws WalaException, CancelException, IOException {
        Path path = Paths.get("hello.js");
        FieldBasedCGUtil f = new FieldBasedCGUtil(new CAstRhinoTranslatorFactory());
        FieldBasedCallGraphBuilder.CallGraphResult results = f.buildScriptDirCG(
                path, FieldBasedCGUtil.BuilderType.OPTIMISTIC_WORKLIST, new NullProgressMonitor(), false);
        CallGraph CG = results.getCallGraph();

        // try-with-resources closes the writer even if serialization or writing fails
        try (FileWriter myWriter1 = new FileWriter(new File("SCG_OPT.json"))) {
            myWriter1.write((new CallGraph2JSON(false, true)).serialize(CG));
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}

Output using the field-based (OPTIMISTIC_WORKLIST) algorithm:

{
  "hello.js@1:0-23": {
    "hello.js@1:8-10": [],
    "hello.js@2:20-22": []
  }
}

Inquiry:

Is there a way to filter out or exclude these non-call-site nodes from the call graph? These nodes, which do not represent function invocations, are adding unnecessary complexity to the call graph and do not serve any purpose for my analysis. I am looking for advice on how to adjust the settings or modify WALA to prevent these nodes from being included in the call graph. Any recommendations or insights on this matter would be highly appreciated.

Thank you for your assistance!

msridhar commented 6 months ago

Hi @flyboss, very sorry for my slow response. These call edges correspond to the array and object allocations occurring at line 1 and line 2, respectively. You might be right that in these cases, technically no functions are called (I'd have to read the ECMAScript spec carefully to be sure). I think what's going on is that, for uniformity, WALA models these literal expressions in its IR as calls to new Array and new Object. If you specifically want to filter out such calls corresponding to literals, my best suggestion would be to look for the Object and Array constructor methods as the target in the call graph, and then check the corresponding source location to see if it's a literal. Is that helpful? Sorry again for the delayed response.
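
The source-location check suggested above could be done as a post-processing step on the serialized call graph. The following is only a rough sketch in plain Java, not a WALA API. It assumes the call-site keys have the shape file@line:startOffset-endOffset with absolute character offsets into the file, which is inferred from the sample output in this issue (for example, 20-22 covers the {} on line 2) and not confirmed elsewhere. The class LiteralSiteFilter, its methods, and the foo() call site are hypothetical names made up for illustration. Because the sample output lists no targets for these sites, the sketch skips the check for Object/Array constructor targets and looks only at the source text.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of WALA.
class LiteralSiteFilter {
    // Assumed key shape (inferred from the sample output): file@line:startOffset-endOffset
    private static final Pattern SITE = Pattern.compile("^(.*)@(\\d+):(\\d+)-(\\d+)$");

    /** True if the source text covered by the key starts like an array or object literal. */
    static boolean isLiteralSite(String siteKey, String source) {
        Matcher m = SITE.matcher(siteKey);
        if (!m.matches()) {
            return false; // unknown key shape: keep the site
        }
        int start = Integer.parseInt(m.group(3));
        int end = Integer.parseInt(m.group(4));
        if (start >= end || end > source.length()) {
            return false; // offsets do not fit this source: keep the site
        }
        char first = source.charAt(start);
        return first == '[' || first == '{';
    }

    /** Returns a copy of the call-site map without the sites that look like literals. */
    static Map<String, List<String>> dropLiteralSites(Map<String, List<String>> sites, String source) {
        Map<String, List<String>> kept = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : sites.entrySet()) {
            if (!isLiteralSite(e.getKey(), source)) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // The two-line test file from this issue plus a hypothetical real call on line 3.
        String source = "var a = [];\nvar b = {};\nfoo();";
        Map<String, List<String>> sites = new LinkedHashMap<>();
        sites.put("hello.js@1:8-10", List.of());
        sites.put("hello.js@2:20-22", List.of());
        sites.put("hello.js@3:24-29", List.of()); // hypothetical call site for foo()
        System.out.println(dropLiteralSites(sites, source).keySet()); // prints [hello.js@3:24-29]
    }
}
```

Looking only at the first character is a heuristic. If the serialized graph does list targets, checking that the target is the Object or Array constructor first, as suggested above, would make the filter less likely to drop a genuine call.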