Open aascorreia opened 3 weeks ago
Hi @aascorreia.
Conceptually, method `increment` performs `READ` and `WRITE` operations on objects rather than variables.
Here `c1` and `c2` are variables, while `NewObj{<classes.Counter: void main(java.lang.String[])>[0@L5] new classes.Counter}` and `NewObj{<classes.Counter: void main(java.lang.String[])>[2@L6] new classes.Counter}` are objects.
Since objects can be passed around and assigned to variables with different names, the two `Counter` objects' names aren't really `c1` and `c2`.
Tai-e prefixes abstract objects with their allocation site (i.e., the method where the allocation happens).
As you can see, those two `Counter` objects are allocated in the method `void main(String[])`.
A possible solution to your problem would be roughly like this:
- For each `Load` instruction `x = y.f` in method `increment`, record a `READ` operation on the objects in `pts(y)` (possibly using a map with `Obj` as its key), where `pts(y)` refers to the points-to set of the variable `y`.
- For each `Store` instruction `x.f = y`, record a `WRITE` operation on the objects in `pts(x)`.
- For each variable (such as `c1` and `c2` in method `void main(String[])`), scan the points-to set of the variable, look up which operations are performed on the objects in it, and store those operations back to a map whose keys are variables.

Thank you for shedding light on the difference between objects and variables. I did not consider that aspect of objects at first and can now understand why `Obj` does not store the name of either `c1` or `c2`.
However, I am a bit confused with Step 4. For reference, here is the code that is being executed after analysis is done, as I believe Steps 2 and 3 have already been accomplished to some extent.
    PointerAnalysisResultImpl result = World.get().getResult("pta");
    Collection<CSVar> csVars = result.getCSVars();
    FieldAccessMap ptaInfo = new FieldAccessMap();
    if (!csVars.isEmpty())
        for (CSVar var : csVars) {
            for (Obj obj : result.getPointsToSet(var.getVar()))
                System.out.println(var.getVar().getName() + "=> " + obj);
            System.out.println("-".repeat(100));
            if (!var.getVar().getLoadFields().isEmpty())
                for (LoadField lField : var.getVar().getLoadFields())
                    ptaInfo.recordAccess(
                        lField.getFieldAccess().getFieldRef().getName(),
                        var.getVar().getMethod().getName(),
                        AccessType.READ
                    );
            if (!var.getVar().getStoreFields().isEmpty())
                for (StoreField sField : var.getVar().getStoreFields())
                    if (!sField.getRValue().getMethod().getSignature().contains("<init>"))
                        ptaInfo.recordAccess(
                            sField.getFieldAccess().getFieldRef().getName(),
                            var.getVar().getMethod().getName(),
                            AccessType.WRITE
                        );
        }
    ptaInfo.printAccessMap();
FieldAccessMap is what holds the map I initially mentioned.
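The `FieldAccessMap` class itself is not shown in this thread. As a rough, hypothetical sketch, such a map could look like the following. The class body, the `Access` helper, the `render` method, and the output format are all assumptions inferred from the calls `recordAccess(key, methodName, accessType)` and `printAccessMap()` above, not the poster's actual code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the FieldAccessMap used above; not the poster's actual class. */
public class FieldAccessMapSketch {

    /** Kind of field access, mirroring AccessType.READ / AccessType.WRITE in the snippet. */
    public enum AccessType { READ, WRITE }

    /** One recorded access: the method it happens in and whether it reads or writes. */
    public static final class Access {
        final String method;
        final AccessType type;

        Access(String method, AccessType type) {
            this.method = method;
            this.type = type;
        }

        @Override
        public String toString() {
            return "<" + method + ", " + type + ">";
        }
    }

    // Insertion-ordered so that printing follows the order of recording.
    private final Map<String, List<Access>> accesses = new LinkedHashMap<>();

    /** Appends an access record under the given key (a field name in the snippet above). */
    public void recordAccess(String key, String method, AccessType type) {
        accesses.computeIfAbsent(key, k -> new ArrayList<>()).add(new Access(method, type));
    }

    /** Renders one "key: [records]" line per key. */
    public String render() {
        StringBuilder sb = new StringBuilder();
        accesses.forEach((key, list) -> sb.append(key).append(": ").append(list).append('\n'));
        return sb.toString();
    }

    public void printAccessMap() {
        System.out.print(render());
    }

    public static void main(String[] args) {
        FieldAccessMapSketch map = new FieldAccessMapSketch();
        // Accesses made through two different receivers both land under the same field-name key.
        map.recordAccess("counter", "increment", AccessType.READ);
        map.recordAccess("counter", "increment", AccessType.WRITE);
        map.recordAccess("counter", "increment", AccessType.READ);
        map.recordAccess("counter", "increment", AccessType.WRITE);
        map.printAccessMap();
    }
}
```

Because the key is only the field name, accesses made through different receiver objects end up in one list, which is the merging the question is about.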
I believe this is what you want.
    if (!var.getVar().getLoadFields().isEmpty()) {
        for (LoadField lField : var.getVar().getLoadFields()) {
            for (Obj obj : result.getPointsToSet(var.getVar())) {
                ptaInfo.recordAccess(
                    obj + lField.getFieldAccess().getFieldRef().getName(),
                    var.getVar().getMethod().getName(),
                    AccessType.READ
                );
            }
        }
    }
    if (!var.getVar().getStoreFields().isEmpty()) {
        for (StoreField sField : var.getVar().getStoreFields()) {
            if (!sField.getRValue().getMethod().getSignature().contains("<init>")) {
                for (Obj obj : result.getPointsToSet(var.getVar())) {
                    ptaInfo.recordAccess(
                        obj + sField.getFieldAccess().getFieldRef().getName(),
                        var.getVar().getMethod().getName(),
                        AccessType.WRITE
                    );
                }
            }
        }
    }

The output will be something like this:
NewObj{<Counter: void main(java.lang.String[])>[2@L4] new Counter}counter: [<increment, READ>, <increment, WRITE>, <increment, READ>, <increment, WRITE>]
NewObj{<Counter: void main(java.lang.String[])>[0@L3] new Counter}counter: [<increment, READ>, <increment, WRITE>, <increment, READ>, <increment, WRITE>]
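The remaining step, mapping object-keyed records back to variables such as `c1` and `c2`, can be sketched in plain Java. This is only an illustration under stated assumptions: strings stand in for Tai-e's `Var` and `Obj`, and the names `VariableAccessSketch`, `accessesByVariable`, and the object labels `o1`/`o2` are hypothetical. In a real analysis the points-to sets would come from `result.getPointsToSet(...)`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class VariableAccessSketch {

    /**
     * For every variable, collects the accesses recorded on each object in its points-to set.
     *
     * @param pointsTo variable name -> labels of the abstract objects it may point to
     * @param byObject object label  -> accesses recorded on that object
     */
    public static Map<String, List<String>> accessesByVariable(
            Map<String, Set<String>> pointsTo,
            Map<String, List<String>> byObject) {
        Map<String, List<String>> byVariable = new LinkedHashMap<>();
        pointsTo.forEach((variable, objs) -> {
            List<String> merged = new ArrayList<>();
            for (String obj : objs) {
                // Objects without any recorded access contribute nothing.
                merged.addAll(byObject.getOrDefault(obj, List.of()));
            }
            byVariable.put(variable, merged);
        });
        return byVariable;
    }

    public static void main(String[] args) {
        // Hypothetical points-to sets: each variable points to one abstract object.
        Map<String, Set<String>> pointsTo = new LinkedHashMap<>();
        pointsTo.put("c1", Set.of("o1"));
        pointsTo.put("c2", Set.of("o2"));

        // Hypothetical per-object records, as an object-keyed map would hold them.
        Map<String, List<String>> byObject = new LinkedHashMap<>();
        byObject.put("o1", List.of("<increment, READ>", "<increment, WRITE>"));
        byObject.put("o2", List.of("<increment, READ>", "<increment, WRITE>"));

        accessesByVariable(pointsTo, byObject)
                .forEach((variable, accesses) -> System.out.println(variable + ": " + accesses));
    }
}
```

If a variable may point to several objects, the records of all of them are merged under that variable, so the result is only as precise as the points-to sets it is built from.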
I see. Given your answer, though, I'm assuming there really is no way to know the name of a variable since, from what I understood, `var.getVar().getName()` retrieves a reference to memory rather than an actual name, and `NewObj` is referencing the object itself.
I was hoping that, even if Tai-e cannot provide that bit of information (c1 and c2 as names), it would be possible to implement a new plugin, or use an existing one, for that effect.
I haven't fully understood what you are discussing.
> `var.getVar().getName()` retrieves a reference to memory rather than an actual name,

This is incorrect. If `var` is a `CSVar` object, then `var.getVar()` returns a `Var` object. A `CSVar` simply means 'a `Var` with a `Context`'. Naturally, `var.getVar().getName()` retrieves the name of the variable.
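To make that relationship concrete, here is a simplified model in plain Java. `SimpleVar`, `SimpleCSVar`, and the context strings are hypothetical stand-ins, not Tai-e's actual classes. The only point is that the name lives on the variable, and the context-sensitive wrapper merely pairs that variable with a context.

```java
public class CSVarSketch {

    /** Hypothetical stand-in for a program variable; the name belongs here. */
    public static final class SimpleVar {
        private final String name;

        public SimpleVar(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }
    }

    /** Hypothetical stand-in for a context-sensitive variable: a variable paired with a context. */
    public static final class SimpleCSVar {
        private final String context;
        private final SimpleVar var;

        public SimpleCSVar(String context, SimpleVar var) {
            this.context = context;
            this.var = var;
        }

        public String getContext() {
            return context;
        }

        public SimpleVar getVar() {
            return var;
        }
    }

    public static void main(String[] args) {
        SimpleVar c = new SimpleVar("c");
        // The same variable wrapped with two different (made-up) contexts.
        SimpleCSVar first = new SimpleCSVar("ctx-A", c);
        SimpleCSVar second = new SimpleCSVar("ctx-B", c);
        System.out.println(first.getVar().getName());          // prints "c"
        System.out.println(second.getVar().getName());         // prints "c"
        System.out.println(first.getVar() == second.getVar()); // prints "true"
    }
}
```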
In fact, the code snippet below is already capable of outputting results with variable names:
    PointerAnalysisResultImpl result = World.get().getResult("pta");
    Collection<CSVar> csVars = result.getCSVars();
    if (!csVars.isEmpty())
        for (CSVar var : csVars) {
            for (Obj obj : result.getPointsToSet(var.getVar()))
                System.out.println(var.getVar().getName() + " => " + obj);
            System.out.println("-".repeat(100));
        }
The execution result on my local environment is:
temp$1 => NewObj{<Counter: void main(java.lang.String[])>[3@L4] new Counter}
----------------------------------------------------------------------------------------------------
%this => NewObj{<Counter: void main(java.lang.String[])>[0@L3] new Counter}
%this => NewObj{<Counter: void main(java.lang.String[])>[3@L4] new Counter}
----------------------------------------------------------------------------------------------------
%this => NewObj{<Counter: void main(java.lang.String[])>[0@L3] new Counter}
%this => NewObj{<Counter: void main(java.lang.String[])>[3@L4] new Counter}
----------------------------------------------------------------------------------------------------
c1 => NewObj{<Counter: void main(java.lang.String[])>[0@L3] new Counter}
----------------------------------------------------------------------------------------------------
args => EntryPointObj{alloc=MethodParam{<Counter: void main(java.lang.String[])>/0},type=java.lang.String[] in <Counter: void main(java.lang.String[])>}
----------------------------------------------------------------------------------------------------
c => NewObj{<Counter: void main(java.lang.String[])>[0@L3] new Counter}
c => NewObj{<Counter: void main(java.lang.String[])>[3@L4] new Counter}
----------------------------------------------------------------------------------------------------
c => NewObj{<Counter: void main(java.lang.String[])>[0@L3] new Counter}
c => NewObj{<Counter: void main(java.lang.String[])>[3@L4] new Counter}
----------------------------------------------------------------------------------------------------
temp$0 => NewObj{<Counter: void main(java.lang.String[])>[0@L3] new Counter}
----------------------------------------------------------------------------------------------------
c2 => NewObj{<Counter: void main(java.lang.String[])>[3@L4] new Counter}
Additional note: It seems you are using `.class` files as input for Tai-e. If you want to see variable names from the source code, don't forget to add the `-g` parameter in your compilation command.
I can assure you that I had added the `-g` parameter to the compilation command, as this is the script I am using for compilation before performing any analysis:

    cd src
    javac -g -d ../bin -cp ../bin/tai-e-all-0.5.1-SNAPSHOT.jar classes/*.java *.java
    cd ../

Even the IDE is correctly set up to generate debugging information during compilation.
The results I was getting when providing `.class` files as input, even after verifying that `-g` is present, are the following:
r2 => EntryPointObj{alloc=MethodParam{<classes.Counter: void main(java.lang.String[])>/0},type=java.lang.String[] in <classes.Counter: void main(java.lang.String[])>}
----------------------------------------------------------------------------------------------------
%this => NewObj{<classes.Counter: void main(java.lang.String[])>[0@L5] new classes.Counter}
%this => NewObj{<classes.Counter: void main(java.lang.String[])>[2@L6] new classes.Counter}
----------------------------------------------------------------------------------------------------
%this => NewObj{<classes.Counter: void main(java.lang.String[])>[0@L5] new classes.Counter}
%this => NewObj{<classes.Counter: void main(java.lang.String[])>[2@L6] new classes.Counter}
----------------------------------------------------------------------------------------------------
r0 => NewObj{<classes.Counter: void main(java.lang.String[])>[0@L5] new classes.Counter}
r0 => NewObj{<classes.Counter: void main(java.lang.String[])>[2@L6] new classes.Counter}
----------------------------------------------------------------------------------------------------
r0 => NewObj{<classes.Counter: void main(java.lang.String[])>[0@L5] new classes.Counter}
r0 => NewObj{<classes.Counter: void main(java.lang.String[])>[2@L6] new classes.Counter}
----------------------------------------------------------------------------------------------------
$r0 => NewObj{<classes.Counter: void main(java.lang.String[])>[0@L5] new classes.Counter}
----------------------------------------------------------------------------------------------------
$r1 => NewObj{<classes.Counter: void main(java.lang.String[])>[2@L6] new classes.Counter}
Now, given your comment:

> It seems you are using `.class` files as input for Tai-e.

I attempted to use `src` as the classpath for analysis rather than `bin`, and managed to get output similar to yours. Thank you!
Overall Description
Hello!
I'm currently performing Tai-e's PTA over a test class Counter with the following options:
The class itself looks like this:
The goal in mind is to gather information regarding the objects invoking the various class methods, identifying the fields involved in read and/or write operations.
By iterating through LoadField and StoreField statements for each variable provided by PointerAnalysisResultImpl.getVars(), I am able to access the field references that are subject to read and write operations, respectively. This information is then used to populate a map whose keys correspond to said field references' names, and values store objects representing the variable's access information (method name and access type).
When running Tai-e's PTA over Counter, I get the following information:
Since both c1 and c2 call increment, it makes sense that two instances of read and write operations are captured. The issue is that it would be ideal to separate these two instances into distinct map keys such that:
While it is possible to obtain the points-to set of a given variable using PointerAnalysisResultImpl.getPointsToSet(Var), I cannot seem to find a way to "resolve" the retrieved Obj objects (example shown below) to get the corresponding object names that are present in the code (c1 and c2 in this case).
Is it feasible to obtain this information, or does the framework not allow it? Should I be using other analysis options or plugins?
Additionally, if Counter's field was instead a reference to another class which contains the counter that is modified in increment, would I have to call getPointsToSet recursively to get to c1.ref.counter, for example?
Thank you for your time.
Expected Behavior
Printing my map should output to the console:
Current Behavior
Because I cannot obtain the actual object names currently, the console displays: