Open YunFy26 opened 1 month ago
Thank you for taking the time to provide such detailed information. This seems to be a rather important issue, and we'll look into it as soon as we have time.
Before we investigate this issue further, we would like to conduct a user study to understand your experience with our GitHub Issue Template. Specifically, we want to determine if there are any organizational, descriptive or structural aspects of the template that make it difficult/undesirable for you to follow when submitting an issue.
I apologize for not strictly adhering to the issue template format when submitting my issue. I’d like to explain the reason behind this.
When describing my example in the Overall Description, whether it’s for this issue or previous ones, I find it difficult to separate the Expected Behavior and Current Behavior from the Overall Description. When describing the issue, I always feel that placing Expected Behavior and Current Behavior as separate headings after the Overall Description creates a sense of “disconnection.” It feels like it disrupts the flow of the explanation.
Taking this submission as an example, I want to analyze the function calls related to dynamic proxies. I first provided a brief description in the title: “call edges related to dynamic proxy are missing.” Then, in the Overall Description, I started by offering a demo as a sample for analysis.
① Demo
Afterward, I presented the resulting call graph and explained the outcome of this analysis.
② The call edge `main` → `doSomething` is missing.
Next, I described the actual runtime call sequence:
③ In the actual runtime call sequence, before `doSomething` is called, the `invoke` method of `MyInvocationHandler` will be called, and then `doSomething` is called through reflection within the `invoke` method.
In this process:
① is the Reproducible Example
② is the Current Behavior
③ is the Expected Behavior (Perhaps I didn’t describe it clearly enough. I should have included a call chain like `main` -> `invoke` -> `doSomething` as the Expected Behavior.)
If I strictly followed the template, the structure would probably look like this: I would first describe the issue in the Overall Description, then follow with either a ③②① or a ②③① format.
Personally, I believe that describing the entire process directly in the Overall Description makes it easier to follow and understand. Therefore, I placed everything in the Description section. In this case, if I were to follow the template strictly, it would result in redundant content. That’s why I filled in “None” for both Expected Behavior and Current Behavior.
In fact, to ensure that others could understand more easily, I revised the content and format multiple times before submitting. (However, looking at it again now, it seems I should have used symbols like “·” or “>” to better organize the structure.)
Regarding the issue template, I personally believe that Expected Behavior and Current Behavior could be subheadings under the Overall Description, but this is just my personal opinion. You may want to gather feedback from other users to make a more informed decision.
Hi YunFy26, I set the value of `handle-invokedynamic` to `true`, but I still can't find the IR for `$Proxy`. Could you please provide me with an environment where this IR output can be reproduced, including the JDK environment, Tai-e configuration options, etc.? I noticed that you enabled a custom plugin, `pascal.taie.analysis.pta.plugin.CustomEntryPointPlugin`. Would this plugin affect the result?
As for the question, "Why is `org.example.proxy.ServiceImpl.class` considered a `ConstantObj`?", it's because `ServiceImpl.class` is a class literal, that is, an object of type `java.lang.Class`, not the class itself.
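The distinction between a class and its class literal can be checked in plain Java. The following is only a minimal sketch: `ClassLiteralDemo` and the nested `ServiceImpl` are placeholder names for illustration, not the classes from the demo project, and it says nothing about how Tai-e represents such constants internally.

```java
public class ClassLiteralDemo {
    // Placeholder class standing in for org.example.proxy.ServiceImpl.
    static class ServiceImpl {}

    /** Returns the runtime type name of the value produced by a class literal. */
    static String typeOfClassLiteral() {
        // The literal evaluates to an object; its own type is java.lang.Class.
        Object literal = ServiceImpl.class;
        return literal.getClass().getName();
    }

    public static void main(String[] args) {
        // Type of the literal itself, as opposed to the type it describes.
        System.out.println(typeOfClassLiteral());
        // The literal is the same object that instances report via getClass().
        System.out.println(ServiceImpl.class == new ServiceImpl().getClass());
    }
}
```

Because the literal is a compile-time constant that evaluates to an object, it is plausible for an analysis to model it as a constant object rather than as an ordinary allocation.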
@BryanHeBY Apologies for mistakenly assuming that the value of `handle-invokedynamic` affected the IR output of `$Proxy0`.
In this repo, after running `./gradlew build`, I navigated to `build/classes/java/main` and executed:
`java -Djdk.proxy.ProxyGenerator.saveGeneratedFiles=true -cp . org.example.Main`
This caused the bytecode file of the dynamic proxy class to be saved in `build/classes/java/main/jdk/proxy1/$Proxy0.class`, leading it to be recognized as an application class and subsequently loaded into the Tai-e `World`. As a result, when executing `ir-dumper`, the IR for `$Proxy0` is output as well.
This is unrelated to the missing call edges in method invocations within dynamic proxy classes.
I apologize for my limited expertise, which may have caused inconvenience to the Tai-e team members. I also sincerely appreciate the Tai-e team for addressing my questions.
Tai-e currently does not support dynamic proxies. A dynamic proxy generates the bytecode for the proxy class as a `byte[]` upon first use, then loads the proxy class from that bytecode, and finally accesses it via reflection. The semantics of generating the proxy class bytecode are relatively complex, and `byte[]` is difficult to handle with static analysis. Additionally, Tai-e does not currently support dynamic class loading. In summary, Tai-e does not support such static analysis at this time.
However, the behavior of a dynamic proxy is not complex. A proxy class is generated based on the input interfaces. It holds an `InvocationHandler`, and all of the interfaces' methods are delegated to that `InvocationHandler` (by the way, this does not involve `invokedynamic`). We can use Tai-e's plugin system to easily handle the behavior of dynamic proxies in the pointer analysis.
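The delegation described above can be illustrated with a hand-written stand-in for the generated class. This is only a sketch under stated assumptions: `ManualProxy` is a hypothetical class written for illustration (the real `$Proxy0` is generated by the JDK and differs in detail), and `TRACE` is a helper added here to record the call order.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.UndeclaredThrowableException;
import java.util.ArrayList;
import java.util.List;

public class HandWrittenProxyDemo {
    // Illustration-only helper that records the order of calls.
    static final List<String> TRACE = new ArrayList<>();

    interface Service { void doSomething(); }

    static class ServiceImpl implements Service {
        @Override public void doSomething() { TRACE.add("ServiceImpl.doSomething"); }
    }

    static class MyInvocationHandler implements InvocationHandler {
        private final Object target;
        MyInvocationHandler(Object target) { this.target = target; }
        @Override public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            TRACE.add("MyInvocationHandler.invoke");
            return method.invoke(target, args); // reflective call to the real target
        }
    }

    /** Hypothetical hand-written equivalent of a generated proxy class. */
    static class ManualProxy implements Service {
        private static final Method DO_SOMETHING;
        static {
            try {
                DO_SOMETHING = Service.class.getMethod("doSomething");
            } catch (NoSuchMethodException e) {
                throw new ExceptionInInitializerError(e);
            }
        }
        private final InvocationHandler h; // the handler every call is delegated to
        ManualProxy(InvocationHandler h) { this.h = h; }

        @Override public void doSomething() {
            try {
                h.invoke(this, DO_SOMETHING, null); // no arguments, so args is null
            } catch (RuntimeException | Error e) {
                throw e;
            } catch (Throwable t) {
                throw new UndeclaredThrowableException(t);
            }
        }
    }

    /** Runs the demo and returns the recorded call order. */
    static List<String> run() {
        TRACE.clear();
        Service proxy = new ManualProxy(new MyInvocationHandler(new ServiceImpl()));
        proxy.doSomething();
        return new ArrayList<>(TRACE);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this sketch the call on the proxy first reaches the handler's `invoke` and only then the target's `doSomething`, which is the chain a dynamic proxy model needs to reproduce in the call graph.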
Here, I can provide you with two methods for reference:
After generating the proxy class files (`.class` files) at runtime (using `-Djdk.proxy.ProxyGenerator.saveGeneratedFiles=true`), let Tai-e bypass the code that generates the proxy class (skip the code in `Proxy.newProxyInstance`) and directly use the generated proxy class. The specific plugin code is as follows:
You need to activate the plugin by declaring it in the configuration file. You also need to enable the reflection analysis (because dynamic proxy uses reflection) and disable the `only-app` option (because you need to analyze the `<init>` method for `Proxy`). The call graph is as follows (only the part from `main` to `doSomething`):
The plugin models `Proxy.newProxyInstance` (the API for generating dynamic proxy objects) through a piece of code (IR) that creates objects of all proxy classes (using the pre-generated proxy class files). After that, Tai-e will directly use these proxy class objects for analysis. Note that creating objects of all proxy classes at every call to `newProxyInstance` may indeed reduce precision, but Tai-e will impose some constraints during object propagation based on the type of the object (for example, an object that is actually of type A cannot be cast to type B). You can also achieve higher-precision modeling yourself by using the remaining parameters of the API.
This method requires running the program in advance to generate all proxy classes, which is not that 'static'.
Since the logic of the generated dynamic proxy code is not that difficult (the proxy class simply delegates to the `InvocationHandler` for the actual operations), we can model the semantics of such delegation. The specific plugin code is as follows:
You need to activate the plugin by declaring it in the configuration file. You also need to enable the reflection analysis (because dynamic proxy uses reflection). You can set the `only-app` option to `true`. The call graph is as follows, `Main.main` -> `MyInvocationHandler.invoke` -> `ServiceImpl.doSomething`:
This plugin models `Proxy.newProxyInstance` through its semantics. This plugin generates a special `MockObj`, and Tai-e will use this plugin to handle method calls when attempting to invoke methods on that object (and for convenience, the mock object is modeled as a null type for propagation). For methods that need to be proxied, the plugin will generate a special call edge and create parameters to invoke the `InvocationHandler.invoke` method.
In reality, the proxy object is an instance of a class that implements the proxied interfaces. To further improve this plugin, you can specially handle the propagation of the object through the interfaces parameter when calling the `Proxy.newProxyInstance` method. However, Tai-e currently does not support interface-related reflection APIs, so you would need to implement this in the plugin yourself. At the same time, Tai-e does not have good customization methods for object propagation, which may require relatively complex modifications.
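The runtime type of a proxy object can be inspected with the standard `java.lang.reflect.Proxy` API. The sketch below assumes a placeholder interface `Service` and a no-op handler; it only shows what the JDK reports about the proxy's type and is not a model of how a Tai-e plugin would propagate it.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Arrays;

public class ProxyTypeDemo {
    // Placeholder interface standing in for the proxied interface of the demo.
    public interface Service { void doSomething(); }

    /** Creates a proxy for Service whose handler does nothing. */
    static Object newProxy() {
        InvocationHandler noop = (proxy, method, args) -> null;
        return Proxy.newProxyInstance(
                Service.class.getClassLoader(),
                new Class<?>[] {Service.class},
                noop);
    }

    public static void main(String[] args) {
        Object p = newProxy();
        // The proxy object can be used wherever the interface type is expected.
        System.out.println(p instanceof Service);
        // Proxy classes extend java.lang.reflect.Proxy.
        System.out.println(p.getClass().getSuperclass().getName());
        // The interfaces passed to newProxyInstance are implemented by the proxy class.
        System.out.println(Arrays.toString(p.getClass().getInterfaces()));
        System.out.println(Proxy.isProxyClass(p.getClass()));
    }
}
```

A more precise model could use this relationship: the interfaces argument determines which interface-typed variables the proxy object may flow into.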
As for the question, "Could you explain why the line number is shown as -1?", -1 means this IR does not correspond to a line in the source code, which is the case since the whole class `$Proxy0` is automatically generated.
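A similar lack of source line information can be observed at runtime, independently of Tai-e. The following sketch assumes a placeholder interface `Service` and inspects the stack frame that belongs to the generated proxy class; it relies on the proxy frame being visible in the stack trace, which is how proxies usually appear in ordinary Java stack traces.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class ProxyLineNumberDemo {
    // Placeholder interface standing in for the proxied interface of the demo.
    public interface Service { void doSomething(); }

    /**
     * Calls a method through a dynamic proxy and returns the line number that
     * the stack trace records for the proxy's frame, or Integer.MIN_VALUE if
     * no frame of the proxy class was found.
     */
    static int proxyFrameLineNumber() {
        int[] line = {Integer.MIN_VALUE};
        InvocationHandler handler = (proxy, method, args) -> {
            String proxyClassName = proxy.getClass().getName();
            for (StackTraceElement frame : new Throwable().getStackTrace()) {
                if (frame.getClassName().equals(proxyClassName)) {
                    line[0] = frame.getLineNumber();
                    break;
                }
            }
            return null;
        };
        Service s = (Service) Proxy.newProxyInstance(
                Service.class.getClassLoader(),
                new Class<?>[] {Service.class},
                handler);
        s.doSomething();
        return line[0];
    }

    public static void main(String[] args) {
        // A negative value means no source line is available for the frame.
        System.out.println(proxyFrameLineNumber());
    }
}
```

`StackTraceElement.getLineNumber()` returns a negative number when the information is unavailable, which matches the idea that generated code has no corresponding source line.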
Thank you for providing such a detailed solution, and apologies for the delayed response. I’ll proceed with handling the situation based on your suggestions. Also, I must add, Tai-e is truly a powerful and user-friendly analysis framework!
Overall Description
### For the following demo
`Service.java`

```java
public interface Service {
    void doSomething();
}
```

`ServiceImpl.java`

```java
public class ServiceImpl implements Service {
    @Override
    public void doSomething() {
        System.out.println("Performing task in ServiceImpl...");
    }
}
```

`MyInvocationHandler.java`

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class MyInvocationHandler implements InvocationHandler {
    private final Object target;

    public MyInvocationHandler(Object target) {
        this.target = target;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        System.out.println("before method call...");
        // method invoke
        Object result = method.invoke(target, args);
        System.out.println("after method call...");
        return result;
    }

    public static Object getProxy(Object target) {
        return Proxy.newProxyInstance(
                target.getClass().getClassLoader(),
                target.getClass().getInterfaces(),
                new MyInvocationHandler(target)
        );
    }
}
```

`Main.java`

```java
public class Main {
    public static void main(String[] args) {
        ServiceImpl service = new ServiceImpl();
        Service proxy = (Service) MyInvocationHandler.getProxy(service);
        proxy.doSomething();
    }
}
```

`IR` of `Main.java`

```java
public static void main(java.lang.String[] r3) {
    org.example.proxy.ServiceImpl $r0;
    java.lang.Object $r1;
    org.example.proxy.Service r2;
    [0@L10] $r0 = new org.example.proxy.ServiceImpl;
    [1@L10] invokespecial $r0.
```

The `call-graph` is as follows:
The call edge `main` → `doSomething` is missing.
In the actual runtime call sequence, before `doSomething` is called, the `invoke` method of `MyInvocationHandler` will be called, and then `doSomething` is called through reflection within the `invoke` method.
After completing the pointer analysis, I reviewed the results of the analysis.
`solver.csManager.callSites` includes:
`solver.csManager.ptrManager.vars.map` includes var `r2`, but the pointsToSet of `r2` is `null`, as shown in Figure-1.
At runtime, the type of `r2` is `jdk.proxy1.$Proxy0`.
Since `$Proxy0` is generated at runtime, Tai-e is unable to identify the allocation site for this object. So no object is mocked, which results in the missing call edge. Is my understanding correct?
According to #114:
Does this imply that Tai-e does not yet natively support method calls in dynamic proxy? If Tai-e supports handling method calls within proxy classes, what configurations should I modify?
Moreover, I have observed that `solver.csManager.objManager.objMap` contains: (as shown in Figure-2)
Why is `org.example.proxy.ServiceImpl.class` considered a `ConstantObj`?
Additionally, in `tai-e-analyses.yml`, I set the value of `handle-invokedynamic` to `true`. Tai-e output the IR of `$Proxy`:
I have a few questions regarding this IR. Could you explain why the line number is shown as `-1`?
Expected Behavior
None
Current Behavior
None
Reproducible Example
No response
Tai-e Arguments
Tai-e Options

```yaml
optionsFile: null
printHelp: false
classPath:
- ../Tai-e_Test/build/classes/java/main
appClassPath:
- ../Tai-e_Test/build/classes/java/main
mainClass: org.example.Main
inputClasses: []
javaVersion: 17
prependJVM: true
allowPhantom: true
worldBuilderClass: pascal.taie.frontend.soot.SootWorldBuilder
outputDir: output
preBuildIR: false
worldCacheMode: false
scope: APP
nativeModel: true
planFile: null
analyses:
  ir-dumper: ""
  cg: ""
  cfg: ""
  pta: "plugins:[pascal.taie.analysis.pta.plugin.CustomEntryPointPlugin]"
onlyGenPlan: false
keepResult:
- $KEEP-ALL
```

Tai-e Analysis Plan

```yaml
- id: ir-dumper
  options: {}
- id: pta
  options:
    cs: 1-obj
    only-app: true
    implicit-entries: false
    distinguish-string-constants: reflection
    merge-string-objects: true
    merge-string-builders: true
    merge-exception-objects: true
    handle-invokedynamic: true
    propagate-types:
    - reference
    advanced: null
    dump: false
    dump-ci: false
    dump-yaml: false
    expected-file: null
    reflection-inference: string-constant
    reflection-log: null
    taint-config: null
    taint-config-providers: []
    taint-interactive-mode: false
    plugins:
    - pascal.taie.analysis.pta.plugin.CustomEntryPointPlugin
    time-limit: -1
- id: cg
  options:
    algorithm: pta
    dump: true
    dump-methods: true
    dump-call-edges: true
- id: throw
  options:
    exception: explicit
    algorithm: intra
- id: cfg
  options:
    exception: explicit
    dump: true
```

Tai-e Log

```
Writing log to /Users/yuntsy/My/Projects/Java/Tai-e/output/tai-e.log
java.version: 17.0.11
java.version.date: 2024-04-16
java.runtime.version: 17.0.11+7-LTS-207
java.vendor: Oracle Corporation
java.vendor.version: null
os.name: Mac OS X
os.version: 15.0.1
os.arch: aarch64
Tai-e Version: 0.5.1-SNAPSHOT
Tai-e Commit: 46448829b6c19ae414caea7b43bd7fb8792ac0a5
Writing analysis plan to /Users/yuntsy/My/Projects/Java/Tai-e/output/tai-e-plan.yml
WorldBuilder starts ...
10085 classes with 99482 methods in the world
WorldBuilder finishes, elapsed time: 1.62s
ir-dumper starts ...
Dumping IR in /Users/yuntsy/My/Projects/Java/Tai-e/output/tir
5 classes in scope (APP) of class analyses
ir-dumper finishes, elapsed time: 0.03s
pta starts ...
[Pointer analysis] elapsed time: 0.01s
-------------- Pointer analysis statistics: --------------
#var pointers: 12 (insens) / 12 (sens)
#objects: 5 (insens) / 5 (sens)
#var points-to: 9 (insens) / 9 (sens)
#static field points-to: 0 (sens)
#instance field points-to: 1 (sens)
#array points-to: 1 (sens)
#reachable methods: 9 (insens) / 10 (sens)
#call graph edges: 10 (insens) / 10 (sens)
----------------------------------------
pta finishes, elapsed time: 0.11s
cg starts ...
Call graph has 9 reachable methods and 10 edges
Dumping call graph to /Users/yuntsy/My/Projects/Java/Tai-e/output/call-graph.dot
Dumping reachable methods to /Users/yuntsy/My/Projects/Java/Tai-e/output/reachable-methods.txt
Dumping call edges to /Users/yuntsy/My/Projects/Java/Tai-e/output/call-edges.txt
cg finishes, elapsed time: 0.01s
throw starts ...
14 methods in scope (APP) of method analyses
throw finishes, elapsed time: 0.00s
cfg starts ...
Dumping CFGs in /Users/yuntsy/My/Projects/Java/Tai-e/output/cfg
cfg finishes, elapsed time: 0.01s
Tai-e finishes, elapsed time: 1.88s
```

Additional Information
No response