ponder-lab / Hybridize-Functions-Refactoring

Refactorings for optimizing imperative TensorFlow clients for greater efficiency.
Eclipse Public License 2.0

Objects whose classes are declared in a different file are missing call graph nodes #311

Closed: khatchad closed this issue 7 months ago

khatchad commented 8 months ago

The call graph has no node for this function:

https://github.com/ponder-lab/samples/blob/39f7644391e664244b45c90868c804abad923eb3/tensorflow_padding/my_layers.py#L12-L19

But it's called here:

https://github.com/ponder-lab/samples/blob/39f7644391e664244b45c90868c804abad923eb3/tensorflow_padding/tensorflow_padding.py#L23C22-L23C22

khatchad commented 8 months ago

In the pointer analysis (PA), I'm seeing:

[Node: <Code body of function Lscript tensorflow_padding.py> Context: CallStringContext: [ com.ibm.wala.FakeRootClass.fakeRootMethod()V@38 ], v306] --> []

Variable v306 refers to Padding2D, so its points-to set is empty.

khatchad commented 7 months ago

It looks like "field C" isn't found, i.e., it has no points-to set. C is a class, not a field, but since it's referenced as B.C, the analysis may be treating it as a field.

khatchad commented 7 months ago

But if I use the from x import Y syntax, which isn't a field read, the class still isn't found.
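
For reference, the two import forms being compared can be sketched in plain Python. The names `B`, `C`, and `g` follow the IR dumps in this thread; building the module in memory with `types.ModuleType` is only an assumption made so that the snippet is self-contained, and `B_SOURCE` is a hypothetical stand-in for the real file.

```python
import sys
import types

# Hypothetical stand-in for B.py, built in memory instead of read from disk.
B_SOURCE = """
class C:
    pass

def g():
    return "g called"
"""
module_b = types.ModuleType("B")
exec(B_SOURCE, module_b.__dict__)
sys.modules["B"] = module_b

# Form 1: qualified access reads C as an attribute of the module object.
import B
obj1 = B.C()

# Form 2: from-import binds the same module attribute to a local name.
from B import C, g
obj2 = C()

# Both forms resolve to the very same class object.
same_class = B.C is C
```

At run time both forms look `C` up on the module object, so a static analysis that models the module as an object with fields would plausibly need `C` to be present there in either case.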

khatchad commented 7 months ago

It looks like classes aren't added as fields of the script, but functions are.

callees of node Lscript B.py : []

IR of node 2, context CallStringContext: [ com.ibm.wala.FakeRootClass.fakeRootMethod()V@2 ]
<Code body of function Lscript B.py>

90   v241 = new <PythonLoader,Lscript B.py/C>@90B.py [1:0] -> [2:8] [241=[C]]
92   global:global script B.py/C = v241      B.py [1:0] -> [2:8] [241=[C]]
93   v243 = new <PythonLoader,Lscript B.py/g>@93<no information> [243=[g]]
94   global:global script B.py/g = v243      <no information> [243=[g]]
95   putfield v1.< PythonLoader, LRoot, g, <PythonLoader,LRoot> > = v243<no information> [243=[g]]

Instruction 95 above adds function g to the script as a field, but there is no corresponding instruction that adds class C.

Later, the importing file tries to load the imported names from the fields of the imported script. For the function:

90   v4 = global:global script B.py          A.py [1:0] -> [15:3] [4=[g]]
91   v5 = fieldref v4.v244:#g                A.py [1:0] -> [15:3] [5=[g]4=[g]]
92   lexical:g@Lscript A.py = v5             A.py [1:0] -> [15:3] [5=[g]]

In the PA, we have:

[Node: <Code body of function Lscript A.py> Context: CallStringContext: [ com.ibm.wala.FakeRootClass.fakeRootMethod()V@4 ], v5] --> [SITE_IN_NODE{<Code body of function Lscript B.py>:Lscript B.py/g in CallStringContext: [ com.ibm.wala.FakeRootClass.fakeRootMethod()V@2 ]}]

For the class:

95   v19 = global:global script B.py         A.py [1:0] -> [15:3] [19=[C]]
96   v245 = fieldref v19.v246:#C             A.py [1:0] -> [15:3] [245=[C]19=[C]]
97   lexical:C@Lscript A.py = v245           A.py [1:0] -> [15:3] [245=[C]]

In the PA, we have:

[Node: <Code body of function Lscript A.py> Context: CallStringContext: [ com.ibm.wala.FakeRootClass.fakeRootMethod()V@4 ], v245] --> []

The weird thing is that the PA does have points-to sets for both of the corresponding globals:

[<field global script B.py/g>] --> [SITE_IN_NODE{<Code body of function Lscript B.py>:Lscript B.py/g in CallStringContext: [ com.ibm.wala.FakeRootClass.fakeRootMethod()V@2 ]}]
[<field global script B.py/C>] --> [SITE_IN_NODE{<Code body of function Lscript B.py>:Lscript B.py/C in CallStringContext: [ com.ibm.wala.FakeRootClass.fakeRootMethod()V@2 ]}]

Later, in the calling context, we see:

0   v4 = lexical:g@Lscript A.py              A.py [10:4] -> [10:5]
1   v2 = invokeFunction < PythonLoader, LCodeBody, do()LRoot; > v4 @1 exception:v5A.py [10:4] -> [10:7]
...
4   v12 = lexical:C@Lscript A.py             A.py [12:4] -> [12:5]
5   v10 = invokeFunction < PythonLoader, LCodeBody, do()LRoot; > v12 @5 exception:v13A.py [12:4] -> [12:7]

In the PA:

[Node: <Code body of function Lscript A.py/f> Context: CallStringContext: [ script A.py.do()LRoot;@108 ], v4] --> [SITE_IN_NODE{<Code body of function Lscript B.py>:Lscript B.py/g in CallStringContext: [ com.ibm.wala.FakeRootClass.fakeRootMethod()V@2 ]}]

[Node: <Code body of function Lscript A.py/f> Context: CallStringContext: [ script A.py.do()LRoot;@108 ], v12] --> []
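
The IR above suggests a two-file setup roughly like the one below. This is a hypothetical reconstruction: only the names `A.py`, `B.py`, `C`, `g`, and `f` come from the dumps, while the file contents, the `run_repro` helper, and the use of a temporary directory are assumptions for illustration.

```python
import pathlib
import subprocess
import sys
import tempfile

# Guessed contents of the defining script: one class and one function.
B_PY = '''class C:
    pass


def g():
    pass
'''

# Guessed contents of the importing script: f calls both imported names.
A_PY = '''from B import C, g


def f():
    g()
    C()
    return "ok"


print(f())
'''


def run_repro() -> str:
    """Write both files to a temporary directory and run A.py with the current interpreter."""
    with tempfile.TemporaryDirectory() as tmp:
        root = pathlib.Path(tmp)
        (root / "B.py").write_text(B_PY)
        (root / "A.py").write_text(A_PY)
        result = subprocess.run(
            [sys.executable, "A.py"],
            cwd=root,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()
```

Python itself resolves both `g` and `C` without trouble here, which is the behavior the points-to sets for v4 and v12 would ideally both reflect.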

I wonder if we should add a field write for the class in the defining script, like:

90   v241 = new <PythonLoader,Lscript B.py/C>@90B.py [1:0] -> [2:8] [241=[C]]
92   global:global script B.py/C = v241      B.py [1:0] -> [2:8] [241=[C]]
95   putfield v1.< PythonLoader, LRoot, C, <PythonLoader,LRoot> > = v241<no information> [241=[C]]
93   v243 = new <PythonLoader,Lscript B.py/g>@93<no information> [243=[g]]
94   global:global script B.py/g = v243      <no information> [243=[g]]
95   putfield v1.< PythonLoader, LRoot, g, <PythonLoader,LRoot> > = v243<no information> [243=[g]]
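
The effect of the proposed field write can be illustrated with a toy points-to model. This is a sketch under stated assumptions and not WALA's API or algorithm: the `analyze` function, the `write_class_field` flag, and the string names for abstract objects are all hypothetical.

```python
from collections import defaultdict


def analyze(write_class_field: bool) -> dict:
    """Toy model: return the points-to sets seen by the importing script."""
    globals_pts = defaultdict(set)  # global name -> abstract objects
    fields_pts = defaultdict(set)   # (script object, field name) -> abstract objects
    script_b = "script B.py"

    # Defining script: allocate C and g and store them in globals.
    globals_pts["B.py/C"].add("site:B.py/C")
    globals_pts["B.py/g"].add("site:B.py/g")

    # Existing behavior: only the function is also written as a field.
    fields_pts[(script_b, "g")] |= globals_pts["B.py/g"]

    # Proposed change: write the class as a field of the script as well.
    if write_class_field:
        fields_pts[(script_b, "C")] |= globals_pts["B.py/C"]

    # Importing script: each imported name is modeled as a field read.
    return {
        name: set(fields_pts.get((script_b, name), set()))
        for name in ("g", "C")
    }


without_write = analyze(write_class_field=False)
with_write = analyze(write_class_field=True)
```

In this model the global for `C` is populated in both runs, yet the importing side only sees `C` once the field write exists, which mirrors the mismatch between the global and fieldref points-to sets above.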

khatchad commented 7 months ago

But why isn't this needed for classes defined in the same file? In that case, we never ask the script for a field corresponding to the class.