pascal-lab / Tai-e

An easy-to-learn/use static analysis framework for Java
https://tai-e.pascal-lab.net/docs/index.html
GNU Lesser General Public License v3.0

In the call graph, the call edges related to dynamic proxy are missing. #123

Open YunFy26 opened 1 month ago

YunFy26 commented 1 month ago

📝 Overall Description

### For the following demo

`Service.java`

```java
public interface Service {
    void doSomething();
}
```

`ServiceImpl.java`

```java
public class ServiceImpl implements Service {
    @Override
    public void doSomething() {
        System.out.println("Performing task in ServiceImpl...");
    }
}
```

`MyInvocationHandler.java`

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class MyInvocationHandler implements InvocationHandler {

    private final Object target;

    public MyInvocationHandler(Object target) {
        this.target = target;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        System.out.println("before method call...");
        // method invoke
        Object result = method.invoke(target, args);
        System.out.println("after method call...");
        return result;
    }

    public static Object getProxy(Object target) {
        return Proxy.newProxyInstance(
                target.getClass().getClassLoader(),
                target.getClass().getInterfaces(),
                new MyInvocationHandler(target)
        );
    }
}
```

`Main.java`

```java
public class Main {
    public static void main(String[] args) {
        ServiceImpl service = new ServiceImpl();
        Service proxy = (Service) MyInvocationHandler.getProxy(service);
        proxy.doSomething();
    }
}
```

`IR` of `Main.java`

```java
public static void main(java.lang.String[] r3) {
    org.example.proxy.ServiceImpl $r0;
    java.lang.Object $r1;
    org.example.proxy.Service r2;
    [0@L10] $r0 = new org.example.proxy.ServiceImpl;
    [1@L10] invokespecial $r0.<org.example.proxy.ServiceImpl: void <init>()>();
    [2@L11] $r1 = invokestatic <org.example.proxy.MyInvocationHandler: java.lang.Object getProxy(java.lang.Object)>($r0);
    [3@L11] r2 = (org.example.proxy.Service) $r1;
    [4@L12] invokeinterface r2.<org.example.proxy.Service: void doSomething()>();
    [5@L13] return;
}
```

The call graph is as follows:

digraph G {
  node [color=".3 .2 1.0",shape=box,style=filled];
  edge [];
  "0" [label="<java.lang.Class: java.lang.Class[] getInterfaces()>",];
  "1" [label="<java.lang.Class: java.lang.ClassLoader getClassLoader()>",];
  "2" [label="<org.example.proxy.MyInvocationHandler: java.lang.Object getProxy(java.lang.Object)>",];
  "3" [label="<java.lang.Object: java.lang.Class getClass()>",];
  "4" [label="<org.example.proxy.MyInvocationHandler: void <init>(java.lang.Object)>",];
  "5" [label="<org.example.proxy.ServiceImpl: void <init>()>",];
  "6" [label="<java.lang.Object: void <init>()>",];
  "7" [label="<org.example.Main: void main(java.lang.String[])>",];
  "8" [label="<java.lang.reflect.Proxy: java.lang.Object newProxyInstance(java.lang.ClassLoader,java.lang.Class[],java.lang.reflect.InvocationHandler)>",];
  "2" -> "8" [label="[6@L27] $r6 = invokestatic <java.lang.reflect.Proxy: java.lang.Object newProxyInstance(java.lang.ClassLoader,java.lang.Class[],java.lang.reflect.InvocationHandler)>($r2, $r4, $r5);",];
  "2" -> "4" [label="[5@L29] invokespecial $r5.<org.example.proxy.MyInvocationHandler: void <init>(java.lang.Object)>(r0);",];
  "2" -> "3" [label="[0@L28] $r1 = invokevirtual r0.<java.lang.Object: java.lang.Class getClass()>();",];
  "2" -> "1" [label="[1@L28] $r2 = invokevirtual $r1.<java.lang.Class: java.lang.ClassLoader getClassLoader()>();",];
  "2" -> "0" [label="[3@L29] $r4 = invokevirtual $r3.<java.lang.Class: java.lang.Class[] getInterfaces()>();",];
  "2" -> "3" [label="[2@L29] $r3 = invokevirtual r0.<java.lang.Object: java.lang.Class getClass()>();",];
  "4" -> "6" [label="[0@L12] invokespecial %this.<java.lang.Object: void <init>()>();",];
  "5" -> "6" [label="[0@L3] invokespecial %this.<java.lang.Object: void <init>()>();",];
  "7" -> "2" [label="[2@L11] $r1 = invokestatic <org.example.proxy.MyInvocationHandler: java.lang.Object getProxy(java.lang.Object)>($r0);",];
  "7" -> "5" [label="[1@L10] invokespecial $r0.<org.example.proxy.ServiceImpl: void <init>()>();",];
}

The call edge main → doSomething is missing.

In the actual runtime call sequence, the call to doSomething first reaches the invoke method of MyInvocationHandler, and doSomething is then called through reflection from within invoke.
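To make that runtime sequence concrete, here is a minimal, self-contained sketch that records the order of events when a method is called on a JDK dynamic proxy. The names `CallOrderDemo` and `Greeter` are hypothetical and only serve the illustration; they are not part of the demo above.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of the issue's demo project.
class CallOrderDemo {

    interface Greeter {
        void greet();
    }

    /** Returns the events recorded while calling greet() on a proxy. */
    static List<String> trace() {
        List<String> events = new ArrayList<>();
        // The "real" object that the handler delegates to.
        Greeter target = () -> events.add("target.greet");
        InvocationHandler handler = (proxy, method, args) -> {
            events.add("handler.invoke(" + method.getName() + ")");
            // Reflective call to the real object, as in MyInvocationHandler.
            return method.invoke(target, args);
        };
        Greeter proxy = (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] {Greeter.class},
                handler);
        proxy.greet();
        return events;
    }

    public static void main(String[] args) {
        System.out.println(trace()); // [handler.invoke(greet), target.greet]
    }
}
```

The recorded order corresponds to the chain main -> invoke -> doSomething that a sound call graph would need to contain.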

After completing the pointer analysis, I reviewed the results of the analysis.

solver.csManager.callSites includes:

<org.example.Main: void main(java.lang.String[])>[4@L12] invokeinterface r2.doSomething()

solver.csManager.ptrManager.vars.map includes the variable r2, but the pointsToSet of r2 is null, as shown in Figure-1.

At runtime, the type of r2 is jdk.proxy1.$Proxy0

public static void main(String[] args) {
    ServiceImpl service = new ServiceImpl();
    Service proxy = (Service) MyInvocationHandler.getProxy(service);
    System.out.println(proxy.getClass());   // class jdk.proxy1.$Proxy0
    proxy.doSomething();
}

Since $Proxy0 is generated at runtime, Tai-e is unable to identify the allocation site for this object. As a result, no object is mocked for it, which leads to the missing call edge. Is my understanding correct?
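As a small illustration of why there is no allocation site to find, the sketch below inspects a proxy object at runtime using only standard `java.lang.reflect.Proxy` methods. `ProxyClassInspection` and its nested `Service` interface are hypothetical stand-ins for the demo classes, and the printed class name depends on the JDK.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Hypothetical illustration class; the nested Service mirrors the demo interface.
class ProxyClassInspection {

    interface Service {
        void doSomething();
    }

    /** Creates a proxy for Service; no "new $ProxyN" appears anywhere in source. */
    static Object newProxy(InvocationHandler handler) {
        return Proxy.newProxyInstance(
                Service.class.getClassLoader(),
                new Class<?>[] {Service.class},
                handler);
    }

    public static void main(String[] args) {
        InvocationHandler handler = (proxy, method, methodArgs) -> null;
        Object proxy = newProxy(handler);
        Class<?> cls = proxy.getClass();
        // The class name is chosen by the JDK at runtime and may vary.
        System.out.println(cls.getName());
        // The class was generated by Proxy rather than compiled from source.
        System.out.println(Proxy.isProxyClass(cls));
        // Generated proxy classes extend java.lang.reflect.Proxy.
        System.out.println(cls.getSuperclass());
        // The handler passed to newProxyInstance is stored in the proxy object.
        System.out.println(Proxy.getInvocationHandler(proxy) == handler);
    }
}
```

The only statement in source code that produces the object is the call to `Proxy.newProxyInstance`, so an analysis that wants an abstract object for the proxy has to create it at that call site.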

According to #114:

Regarding mocking IR, Tai-e currently supports mocking IR within method at the statement level but does not support mocking an entire class. We will take this into consideration in the future.

Does this imply that Tai-e does not yet natively support method calls in dynamic proxy? If Tai-e supports handling method calls within proxy classes, what configurations should I modify?

Moreover, I have observed that solver.csManager.objManager.objMap contains the following (as shown in Figure-2):

{ConstantObj@5877} "ConstantObj{java.lang.Class: org.example.proxy.ServiceImpl.class}" -> {HybridHashMap@5878}  size = 1

Why is org.example.proxy.ServiceImpl.class considered a ConstantObj?



Additionally, in tai-e-analyses.yml, I set the value of handle-invokedynamic to true. Tai-e then output the IR of $Proxy0:

public final class jdk.proxy1.$Proxy0 extends java.lang.reflect.Proxy implements org.example.proxy.Service {

    ...

    public final void doSomething() {
        java.lang.reflect.InvocationHandler $r2;
        java.lang.reflect.Method $r1;
        null-type %nullconst;
        java.lang.Throwable $r5, $r3;
        java.lang.reflect.UndeclaredThrowableException $r4;
        [0@L-1] $r2 = %this.<java.lang.reflect.Proxy: java.lang.reflect.InvocationHandler h>;
        [1@L-1] $r1 = <jdk.proxy1.$Proxy0: java.lang.reflect.Method m3>;
        [2@L-1] invokeinterface $r2.<java.lang.reflect.InvocationHandler: java.lang.Object invoke(java.lang.Object,java.lang.reflect.Method,java.lang.Object[])>(%this, $r1, %nullconst);
        [3@L-1] return;
        [4@L-1] catch $r5;
        [5@L-1] throw $r5;
        [6@L-1] catch $r3;
        [7@L-1] $r4 = new java.lang.reflect.UndeclaredThrowableException;
        [8@L-1] invokespecial $r4.<java.lang.reflect.UndeclaredThrowableException: void <init>(java.lang.Throwable)>($r3);
        [9@L-1] throw $r4;

        try [0, 4), catch java.lang.Error at 4
        try [0, 4), catch java.lang.RuntimeException at 4
        try [0, 4), catch java.lang.Throwable at 6
    }

    ...

}

I have a few questions regarding this IR. Could you explain why the line number is shown as -1?

🎯 Expected Behavior

None

🐛 Current Behavior

None

🔄 Reproducible Example

No response

⚙️ Tai-e Arguments

Tai-e Options

```yaml
optionsFile: null
printHelp: false
classPath:
- ../Tai-e_Test/build/classes/java/main
appClassPath:
- ../Tai-e_Test/build/classes/java/main
mainClass: org.example.Main
inputClasses: []
javaVersion: 17
prependJVM: true
allowPhantom: true
worldBuilderClass: pascal.taie.frontend.soot.SootWorldBuilder
outputDir: output
preBuildIR: false
worldCacheMode: false
scope: APP
nativeModel: true
planFile: null
analyses:
  ir-dumper: ""
  cg: ""
  cfg: ""
  pta: "plugins:[pascal.taie.analysis.pta.plugin.CustomEntryPointPlugin]"
onlyGenPlan: false
keepResult:
- $KEEP-ALL
```

Tai-e Analysis Plan

```yaml
- id: ir-dumper
  options: {}
- id: pta
  options:
    cs: 1-obj
    only-app: true
    implicit-entries: false
    distinguish-string-constants: reflection
    merge-string-objects: true
    merge-string-builders: true
    merge-exception-objects: true
    handle-invokedynamic: true
    propagate-types:
    - reference
    advanced: null
    dump: false
    dump-ci: false
    dump-yaml: false
    expected-file: null
    reflection-inference: string-constant
    reflection-log: null
    taint-config: null
    taint-config-providers: []
    taint-interactive-mode: false
    plugins:
    - pascal.taie.analysis.pta.plugin.CustomEntryPointPlugin
    time-limit: -1
- id: cg
  options:
    algorithm: pta
    dump: true
    dump-methods: true
    dump-call-edges: true
- id: throw
  options:
    exception: explicit
    algorithm: intra
- id: cfg
  options:
    exception: explicit
    dump: true
```

📜 Tai-e Log

Tai-e Log

```
Writing log to /Users/yuntsy/My/Projects/Java/Tai-e/output/tai-e.log
java.version: 17.0.11
java.version.date: 2024-04-16
java.runtime.version: 17.0.11+7-LTS-207
java.vendor: Oracle Corporation
java.vendor.version: null
os.name: Mac OS X
os.version: 15.0.1
os.arch: aarch64
Tai-e Version: 0.5.1-SNAPSHOT
Tai-e Commit: 46448829b6c19ae414caea7b43bd7fb8792ac0a5
Writing analysis plan to /Users/yuntsy/My/Projects/Java/Tai-e/output/tai-e-plan.yml
WorldBuilder starts ...
10085 classes with 99482 methods in the world
WorldBuilder finishes, elapsed time: 1.62s
ir-dumper starts ...
Dumping IR in /Users/yuntsy/My/Projects/Java/Tai-e/output/tir
5 classes in scope (APP) of class analyses
ir-dumper finishes, elapsed time: 0.03s
pta starts ...
[Pointer analysis] elapsed time: 0.01s
-------------- Pointer analysis statistics: --------------
#var pointers: 12 (insens) / 12 (sens)
#objects: 5 (insens) / 5 (sens)
#var points-to: 9 (insens) / 9 (sens)
#static field points-to: 0 (sens)
#instance field points-to: 1 (sens)
#array points-to: 1 (sens)
#reachable methods: 9 (insens) / 10 (sens)
#call graph edges: 10 (insens) / 10 (sens)
----------------------------------------
pta finishes, elapsed time: 0.11s
cg starts ...
Call graph has 9 reachable methods and 10 edges
Dumping call graph to /Users/yuntsy/My/Projects/Java/Tai-e/output/call-graph.dot
Dumping reachable methods to /Users/yuntsy/My/Projects/Java/Tai-e/output/reachable-methods.txt
Dumping call edges to /Users/yuntsy/My/Projects/Java/Tai-e/output/call-edges.txt
cg finishes, elapsed time: 0.01s
throw starts ...
14 methods in scope (APP) of method analyses
throw finishes, elapsed time: 0.00s
cfg starts ...
Dumping CFGs in /Users/yuntsy/My/Projects/Java/Tai-e/output/cfg
cfg finishes, elapsed time: 0.01s
Tai-e finishes, elapsed time: 1.88s
```

ℹ️ Additional Information

No response

zhangt2333 commented 1 month ago

Thank you for taking the time to provide such detailed information. This seems to be a rather important issue, and we will look into it once we have time.

Before we investigate this issue further, we would like to conduct a user study to understand your experience with our GitHub Issue Template. Specifically, we want to determine if there are any organizational, descriptive or structural aspects of the template that make it difficult/undesirable for you to follow when submitting an issue.

YunFy26 commented 1 month ago

I apologize for not strictly adhering to the issue template format when submitting my issue. I’d like to explain the reason behind this.

When describing my example in the Overall Description, whether it’s for this issue or previous ones, I find it difficult to separate the Expected Behavior and Current Behavior from the Overall Description. When describing the issue, I always feel that placing Expected Behavior and Current Behavior as separate headings after the Overall Description creates a sense of “disconnection.” It feels like it disrupts the flow of the explanation.

Taking this submission as an example, I want to analyze the function calls related to dynamic proxies. I first provided a brief description in the title: “call edges related to dynamic proxy are missing.” Then, in the Overall Description, I started by offering a demo as a sample for analysis.

① Demo

Afterward, I presented the resulting call graph and explained the outcome of this analysis.

② The call edge main → doSomething is missing.

Next, I described the actual runtime call sequence:

③ In the actual runtime call sequence, before doSomething is called, the method invoke of MyInvocationHandler will be called, and then doSomething is called through reflection within the invoke method.

In this process:

① is the Reproducible Example

② is the Current Behavior

③ is the Expected Behavior (perhaps I didn’t describe it clearly enough; I should have included a call chain like main -> invoke -> doSomething as the Expected Behavior).

If I strictly followed the template, the structure would probably look like this: I would first describe the issue in the Overall Description, then follow with either a ③②① or ②③① format.

Personally, I believe that describing the entire process directly in the Overall Description makes it easier to follow and understand. Therefore, I placed everything in the Description section. In this case, if I were to follow the template strictly, it would result in redundant content. That’s why I filled in “None” for both Expected Behavior and Current Behavior.

In fact, to ensure that others could understand more easily, I revised the content and format multiple times before submitting. (However, looking at it again now, it seems I should have used symbols like “·” or “>” to better organize the structure.)

Regarding the issue template, I personally believe that Expected Behavior and Current Behavior could be subheadings under the Overall Description, but this is just my personal opinion. You may want to gather feedback from other users to make a more informed decision.

BryanHeBY commented 1 month ago

Hi YunFy26, I set the value of handle-invokedynamic to true, but I still can't find the IR for $Proxy. Could you please provide me with an environment where this IR output can be reproduced, including the JDK environment, tai-e configuration options, etc.? I noticed that you enabled a custom plugin, pascal.taie.analysis.pta.plugin.CustomEntryPointPlugin. Would this plugin affect the result?

As for the question "Why is org.example.proxy.ServiceImpl.class considered a ConstantObj?": it is a class literal, that is, an object of type java.lang.Class that represents the class, not the class itself.
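A minimal sketch of that distinction in plain Java (the names `ClassLiteralDemo` and its nested `ServiceImpl` are hypothetical stand-ins): the expression `ServiceImpl.class` evaluates to an ordinary heap object whose own type is `java.lang.Class`, and it is the same object that `getClass()` returns for instances of that class.

```java
// Hypothetical illustration class; the nested ServiceImpl stands in for the demo class.
class ClassLiteralDemo {

    static class ServiceImpl {
    }

    /** Returns the class literal, which is a value of type java.lang.Class. */
    static Class<?> literal() {
        return ServiceImpl.class;
    }

    public static void main(String[] args) {
        Class<?> c = literal();
        // The literal is itself an object, and its type is java.lang.Class.
        System.out.println(c.getClass() == Class.class);
        // It is the same object that getClass() yields for a ServiceImpl instance.
        System.out.println(c == new ServiceImpl().getClass());
    }
}
```

Both comparisons hold at runtime, which is why a pointer analysis can reasonably treat the literal as a constant object, much like a string literal.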

YunFy26 commented 1 month ago

@BryanHeBY Apologies for mistakenly assuming that the value of handle-invokedynamic affected the IR output of $Proxy0.

In this repo, after running ./gradlew build, I navigated to build/classes/java/main and executed:

java -Djdk.proxy.ProxyGenerator.saveGeneratedFiles=true -cp . org.example.Main

This caused the bytecode file of the dynamic proxy class to be saved in build/classes/java/main/jdk/proxy1/$Proxy0.class, leading it to be recognized as an application class and subsequently loaded into Tai-e World. As a result, when executing ir-dumper, the IR for $Proxy0 is output as well.

This is unrelated to the missing call edges in method invocations within dynamic proxy classes.

I apologize for my limited expertise, which may have caused inconvenience to the Tai-e team members. I also sincerely appreciate the Tai-e team for addressing my questions.

orlies commented 1 month ago

Tai-e currently does not support dynamic proxies. A dynamic proxy generates the bytecode of the proxy class as a byte[] on first use, then loads the proxy class from that bytecode, and finally accesses it via reflection. The semantics of generating the proxy class bytecode are relatively complex, and a byte[] is difficult to handle with static analysis. Additionally, Tai-e does not currently support dynamic class loading. In summary, Tai-e does not support this kind of static analysis at this time.

However, the behavior of a dynamic proxy is not complex. A proxy class is generated from the input interfaces. It holds an InvocationHandler, and all of the interfaces' methods are delegated to that InvocationHandler (by the way, this does not involve invokedynamic). We can use Tai-e's plugin system to handle the behavior of dynamic proxies in the pointer analysis fairly easily.
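As a rough illustration of that delegation, the following sketch hand-writes what a generated proxy class does for a one-method interface. It is a simplification under stated assumptions: a real generated class extends `java.lang.reflect.Proxy` and keeps the handler in the inherited field `h`, and the names `HandWrittenProxyDemo` and `ServiceProxy` are hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.UndeclaredThrowableException;

// Hypothetical illustration; a simplified stand-in for a generated $ProxyN class.
class HandWrittenProxyDemo {

    interface Service {
        void doSomething();
    }

    static final class ServiceProxy implements Service {
        // Looked up once, like the static Method fields of a generated proxy class.
        private static final Method DO_SOMETHING;

        static {
            try {
                DO_SOMETHING = Service.class.getMethod("doSomething");
            } catch (NoSuchMethodException e) {
                throw new ExceptionInInitializerError(e);
            }
        }

        private final InvocationHandler h;

        ServiceProxy(InvocationHandler h) {
            this.h = h;
        }

        @Override
        public void doSomething() {
            try {
                // Every interface method forwards to the handler.
                h.invoke(this, DO_SOMETHING, null);
            } catch (RuntimeException | Error e) {
                throw e;
            } catch (Throwable t) {
                throw new UndeclaredThrowableException(t);
            }
        }
    }

    /** Calls doSomething() through the hand-written proxy and returns a log. */
    static String run() {
        StringBuilder log = new StringBuilder();
        Service real = () -> log.append("real;");
        Service proxy = new ServiceProxy((p, m, a) -> {
            log.append("invoke:").append(m.getName()).append(';');
            return m.invoke(real, a);
        });
        proxy.doSomething();
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // invoke:doSomething;real;
    }
}
```

The shape matches the IR of `$Proxy0.doSomething` shown earlier in the thread: load the handler, load the `Method` object, call `invoke`, and wrap undeclared checked exceptions.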

Here, I can provide you with two methods for reference:

Method 1

After generating the proxy class files (.class files) at runtime (using -Djdk.proxy.ProxyGenerator.saveGeneratedFiles=true), let Tai-e bypass the code that generates the proxy class (that is, skip the body of Proxy.newProxyInstance) and use the generated proxy class directly. The plugin code is as follows:

Plugin code

`CustomModel` manages the initialization of the concrete plugin

```java
public class CustomModel extends CompositePlugin {

    @Override
    public void setSolver(Solver solver) {
        addPlugin(new IRProxyModel(solver));
    }
}
```

`IRProxyModel` models the semantics of `Proxy`'s `newProxyInstance` method through generating IR

```java
public class IRProxyModel extends IRModelPlugin {

    private final Set<JClass> proxyClasses;

    private final Type invocationHandlerType;

    IRProxyModel(Solver solver) {
        super(solver);
        proxyClasses = getProxyClasses();
        invocationHandlerType = typeSystem.getClassType("java.lang.reflect.InvocationHandler");
    }

    // Collect the pre-generated proxy classes (jdk.proxyN.$ProxyM) loaded into the world.
    private Set<JClass> getProxyClasses() {
        JClass proxy = hierarchy.getClass("java.lang.reflect.Proxy");
        String regex = "jdk\\.proxy\\d+\\.\\$Proxy\\d+";
        Predicate<String> pattern = Pattern.compile(regex).asMatchPredicate();
        return hierarchy.getAllSubclassesOf(proxy).stream()
                .filter(c -> pattern.test(c.getName()))
                .collect(Collectors.toSet());
    }

    @InvokeHandler(signature = "<java.lang.reflect.Proxy: java.lang.Object newProxyInstance(java.lang.ClassLoader,java.lang.Class[],java.lang.reflect.InvocationHandler)>")
    public List<Stmt> newProxyInstance(Invoke invoke) {
        List<Stmt> stmts = new ArrayList<>();
        Var result = invoke.getResult();
        Var invocationHandler = invoke.getInvokeExp().getArg(2);
        if (result != null) {
            for (var proxy : proxyClasses) {
                for (var method : proxy.getDeclaredMethods()) {
                    if (!method.isConstructor()) {
                        continue;
                    }
                    // Use the constructor that takes a single InvocationHandler.
                    if (method.getParamCount() == 1
                            && method.getParamType(0).equals(invocationHandlerType)) {
                        stmts.add(new New(invoke.getContainer(), result,
                                new NewInstance(proxy.getType())));
                        stmts.add(new Invoke(invoke.getContainer(),
                                new InvokeSpecial(method.getRef(), result,
                                        List.of(invocationHandler))));
                    }
                }
            }
        }
        return stmts;
    }
}
```


You need to activate the plugin by declaring it in the configuration file. You also need to enable reflection analysis (because dynamic proxies use reflection) and disable the only-app option (because the <init> method of Proxy must be analyzed). The resulting call graph is as follows (only the part from main to doSomething):

Partial call graph

```txt
...
"122" [label="",];
"13887" [label="",];
"14293" [label="",];
"14857" [label="",];
...
"14293" -> "13887" [label="[4@L8] invokeinterface r2.();",];
"13887" -> "122" [label="[2@L-1] invokeinterface $r2.(%this, $r1, %nullconst);",];
"122" -> "14857" [label="[4@L17] $r5 = invokevirtual method.($r4, args);",];
...
```


The plugin models Proxy.newProxyInstance (the API that creates dynamic proxy objects) with a piece of IR that instantiates every pre-generated proxy class. After that, Tai-e uses these proxy objects directly in the analysis. Note that instantiating all proxy classes at every call to newProxyInstance may reduce precision, but Tai-e imposes some type-based constraints during object propagation (for example, an object whose actual type is A cannot be cast to type B). You can also model the API more precisely yourself by using its remaining parameters.

This method requires running the program in advance to generate all proxy classes, which is not that 'static'.

Method 2

Since the logic of the generated proxy code is not that difficult (the proxy class simply delegates the actual operations to the InvocationHandler), we can model the semantics of this delegation directly. The plugin code is as follows:

Plugin code

`CustomModel` manages the initialization of the concrete plugin

```java
public class CustomModel extends CompositePlugin {

    @Override
    public void setSolver(Solver solver) {
        addPlugin(new SemanticProxyModel(solver));
    }
}
```

`SemanticProxyModel` models the semantics of delegation

```java
public class SemanticProxyModel extends AnalysisModelPlugin {

    private static final Logger logger = LogManager.getLogger(SemanticProxyModel.class);

    private static final Descriptor PROXY_DESC = () -> "ProxyObj";

    private static final Descriptor REFLECTION_DESC = () -> "ReflectionMetaObj";

    private static final Descriptor ARGS_ARRAY_DESC = () -> "Object[Args]Obj";

    private final JClass object;

    private final JClass proxy;

    private final Set<MethodRef> proxiedMethods;

    SemanticProxyModel(Solver solver) {
        super(solver);
        object = Objects.requireNonNull(hierarchy.getJREClass(ClassNames.OBJECT));
        proxy = Objects.requireNonNull(hierarchy.getJREClass("java.lang.reflect.Proxy"));
        // Object methods that a proxy forwards to its InvocationHandler.
        proxiedMethods = Set.of(
                Objects.requireNonNull(object.getDeclaredMethod("hashCode")).getRef(),
                Objects.requireNonNull(object.getDeclaredMethod("equals")).getRef(),
                Objects.requireNonNull(object.getDeclaredMethod("toString")).getRef());
    }

    @Override
    public void onStart() {
        handlers.keySet().forEach(solver::addIgnoredMethod);
    }

    @InvokeHandler(signature = "<java.lang.reflect.Proxy: java.lang.Object newProxyInstance(java.lang.ClassLoader,java.lang.Class[],java.lang.reflect.InvocationHandler)>", argIndexes = {2})
    public void newProxyInstance(Context context, Invoke invoke, PointsToSet invocationHandler) {
        Var result = invoke.getResult();
        if (result != null) {
            invocationHandler.forEach(csObj -> {
                // generate special mock object for newProxyInstance
                Obj obj = heapModel.getMockObj(PROXY_DESC, csObj, NullType.NULL);
                CSMethod csMethod = csManager.getCSMethod(context, invoke.getContainer());
                Context heapContext = selector.selectHeapContext(csMethod, obj);
                solver.addVarPointsTo(context, result, heapContext, obj);
            });
        }
    }

    @Override
    public void onUnresolvedCall(CSObj recv, Context context, Invoke invoke) {
        if (!CSObjs.hasDescriptor(recv, PROXY_DESC)) {
            return;
        }
        MethodRef method = invoke.getMethodRef();
        JClass clazz = method.getDeclaringClass();
        if (clazz.equals(proxy)
                || (clazz.equals(object) && !proxiedMethods.contains(method))) {
            // the method is directly called
            CSCallSite csCallSite = csManager.getCSCallSite(context, invoke);
            JMethod callee = method.resolve();
            Context calleeContext = selector.selectContext(csCallSite, recv, callee);
            CSMethod csCallee = csManager.getCSMethod(calleeContext, callee);
            solver.addCallEdge(new Edge<>(CallGraphs.getCallKind(invoke), csCallSite, csCallee));
            solver.addVarPointsTo(calleeContext, callee.getIR().getThis(), recv);
        } else {
            // the method is actually delegated to InvocationHandler.invoke method
            CSObj invocationHandler = (CSObj) recv.getObject().getAllocation();
            JMethod callee = ((ClassType) invocationHandler.getObject().getType())
                    .getJClass().getDeclaredMethod("invoke");
            if (callee == null) {
                logger.warn("No invoke method for " + invocationHandler.getObject().getType());
                return;
            }
            CSCallSite csCallSite = csManager.getCSCallSite(context, invoke);
            Context calleeContext = selector.selectContext(csCallSite, invocationHandler, callee);
            CSMethod csCallee = csManager.getCSMethod(calleeContext, callee);
            solver.addCallEdge(new ProxyCallEdge(csCallSite, csCallee, recv));
            solver.addVarPointsTo(calleeContext, callee.getIR().getThis(), invocationHandler);
        }
    }

    @Override
    public void onNewCallEdge(Edge<CSCallSite, CSMethod> edge) {
        if (edge instanceof ProxyCallEdge proxyEdge) {
            // create arguments for InvocationHandler.invoke
            CSMethod csCallee = edge.getCallee();
            Context callerCtx = edge.getCallSite().getContext();
            Invoke callSite = edge.getCallSite().getCallSite();
            Context calleeCtx = csCallee.getContext();
            JMethod callee = csCallee.getMethod();
            // the parameters of invoke belong to the callee, so they are
            // addressed in the callee's context (like `this` above)
            // pass the first argument, which is the proxy object
            solver.addVarPointsTo(calleeCtx, callee.getIR().getParam(0),
                    proxyEdge.getProxyObj());
            // pass the second argument, which is the reflection method
            JMethod method = callSite.getMethodRef().resolve();
            Obj methodObj = heapModel.getMockObj(REFLECTION_DESC, method,
                    typeSystem.getClassType(ClassNames.METHOD));
            Context mObjContext = selector.selectHeapContext(proxyEdge.getCallee(), methodObj);
            solver.addVarPointsTo(calleeCtx, callee.getIR().getParam(1),
                    mObjContext, methodObj);
            // pass the third argument, which is args in Object[]
            Type objs = typeSystem.getArrayType(typeSystem.getType(ClassNames.OBJECT), 1);
            Obj argsObj = heapModel.getMockObj(ARGS_ARRAY_DESC, callSite, objs);
            Context argsObjContext = selector.selectHeapContext(proxyEdge.getCallee(), argsObj);
            ArrayIndex arrayIdx = csManager.getArrayIndex(
                    csManager.getCSObj(argsObjContext, argsObj));
            callSite.getInvokeExp().getArgs().forEach(v -> {
                // the actual arguments are variables of the caller
                CSVar csVar = csManager.getCSVar(callerCtx, v);
                PointsToSet pts = solver.getPointsToSetOf(csVar);
                solver.addPointsTo(arrayIdx, pts);
            });
            solver.addVarPointsTo(calleeCtx, callee.getIR().getParam(2),
                    argsObjContext, argsObj);
            // pass results to LHS variable
            Var lhs = callSite.getResult();
            if (lhs != null) {
                CSVar csLHS = csManager.getCSVar(callerCtx, lhs);
                for (Var ret : callee.getIR().getReturnVars()) {
                    CSVar csRet = csManager.getCSVar(calleeCtx, ret);
                    solver.addPFGEdge(csRet, csLHS, FlowKind.RETURN);
                }
            }
        }
    }
}
```

`ProxyCallEdge` is the special call edge handled by the `SemanticProxyModel` plugin

```java
public class ProxyCallEdge extends OtherEdge<CSCallSite, CSMethod> {

    private final CSObj proxyObj;

    public ProxyCallEdge(CSCallSite callSite, CSMethod callee, CSObj proxyObj) {
        super(callSite, callee);
        this.proxyObj = proxyObj;
    }

    public CSObj getProxyObj() {
        return proxyObj;
    }
}
```


You need to activate the plugin by declaring it in the configuration file. You also need to enable reflection analysis (because dynamic proxies use reflection). You can set the only-app option to true. The resulting call graph is as follows, with the chain Main.main -> MyInvocationHandler.invoke -> ServiceImpl.doSomething:

Call graph

```dot
digraph G {
  node [color=".3 .2 1.0",shape=box,style=filled];
  edge [];
  "0" [label="",];
  "1" [label="()>",];
  "2" [label="()>",];
  "3" [label="",];
  "4" [label="",];
  "5" [label="",];
  "6" [label="",];
  "7" [label="",];
  "8" [label="",];
  "9" [label="",];
  "10" [label="()>",];
  "11" [label="()>",];
  "12" [label="(java.lang.Object)>",];
  "13" [label="",];
  "0" -> "13" [label="[0@L24] $r1 = invokevirtual r0.();",];
  "0" -> "3" [label="[1@L24] $r2 = invokevirtual $r1.();",];
  "0" -> "12" [label="[5@L25] invokespecial $r5.(java.lang.Object)>(r0);",];
  "0" -> "6" [label="[3@L25] $r4 = invokevirtual $r3.();",];
  "0" -> "13" [label="[2@L25] $r3 = invokevirtual r0.();",];
  "0" -> "5" [label="[6@L23] $r6 = invokestatic ($r2, $r4, $r5);",];
  "8" -> "0" [label="[2@L7] $r1 = invokestatic ($r0);",];
  "8" -> "10" [label="[1@L6] invokespecial $r0.()>();",];
  "8" -> "9" [label="[4@L8] invokeinterface r2.();",];
  "9" -> "7" [label="[4@L17] $r5 = invokevirtual method.($r4, args);",];
  "9" -> "4" [label="[4@L17] $r5 = invokevirtual method.($r4, args);",];
  "10" -> "2" [label="[0@L1] invokespecial %this.()>();",];
  "12" -> "2" [label="[0@L9] invokespecial %this.()>();",];
}
```


This plugin models Proxy.newProxyInstance through its semantics. It generates a special MockObj, and Tai-e calls back into the plugin whenever a method is invoked on that object (for convenience, the mock object is given the null type for propagation). For methods that need to be proxied, the plugin adds a special call edge and creates the arguments for the call to InvocationHandler.invoke.
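The split between proxied and directly called methods follows the documented behavior of `java.lang.reflect.Proxy`: of the methods declared in `java.lang.Object`, only `hashCode`, `equals`, and `toString` are dispatched to the handler. The sketch below checks this at runtime; `ObjectMethodDispatchDemo` and its nested `Service` are hypothetical names used only for illustration.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Tai-e or the issue's demo.
class ObjectMethodDispatchDemo {

    interface Service {
        void doSomething();
    }

    /** Returns the names of the methods that reached the InvocationHandler. */
    static List<String> handledMethods() {
        List<String> seen = new ArrayList<>();
        InvocationHandler handler = (proxy, method, args) -> {
            seen.add(method.getName());
            // Return values must match the return type of the invoked method.
            switch (method.getName()) {
                case "hashCode":
                    return 42;
                case "equals":
                    return proxy == args[0];
                case "toString":
                    return "proxy";
                default:
                    return null; // doSomething() returns void
            }
        };
        Object p = Proxy.newProxyInstance(
                Service.class.getClassLoader(),
                new Class<?>[] {Service.class},
                handler);
        p.hashCode();                // delegated to the handler
        p.equals(p);                 // delegated to the handler
        p.toString();                // delegated to the handler
        p.getClass();                // final in Object, never reaches the handler
        ((Service) p).doSomething(); // interface method, delegated
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(handledMethods()); // [hashCode, equals, toString, doSomething]
    }
}
```

This mirrors the `proxiedMethods` set in `SemanticProxyModel`: calls to the three listed `Object` methods and to interface methods become `ProxyCallEdge`s, while the remaining `Object` methods are resolved as ordinary calls.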

In reality, the proxy object is an instance of a class that implements the proxied interfaces. To improve this plugin further, you can handle the propagation of the object according to the interfaces argument passed to Proxy.newProxyInstance. However, Tai-e currently does not support the interface-related reflection APIs, so you would need to implement that part yourself. Tai-e also lacks a convenient way to customize object propagation, so this may require relatively complex modifications.

As for the question "Could you explain why the line number is shown as -1?": -1 means that the IR statement does not correspond to a line in the source code, which is the case here because the whole class $Proxy0 is automatically generated.

YunFy26 commented 4 weeks ago

Thank you for providing such a detailed solution, and apologies for the delayed response. I’ll proceed with handling the situation based on your suggestions. Also, I must add, Tai-e is truly a powerful and user-friendly analysis framework!