wala / ML

Eclipse Public License 2.0

Missing the print built-in function summary #93

Closed (khatchad closed 11 months ago)

khatchad commented 11 months ago

Description

Consider the following example input program:

# A.py
def f(x):
  print("Traced with: " + str(x))

f(1)

This is the IR of f():

Node: <Code body of function Lscript A.py/f> Context: CallStringContext: [ script A.py.do()LRoot;@97 ]
<Code body of function Lscript A.py/f>
CFG:
BB0[-1..-2]
    -> BB1
BB1[0..2]
    -> BB2
    -> BB3
BB2[3..4]
    -> BB3
BB3[-1..-2]
Instructions:
BB0
BB1
0   v5 = lexical:print@Lscript A.py          A.py [2:2] -> [2:7]
1   v10 = lexical:str@Lscript A.py           A.py [2:26] -> [2:29]
2   v8 = invokeFunction < PythonLoader, LCodeBody, do()LRoot; > v10,v2 @2 exception:v11    A.py [2:26] -> [2:32] [2=[x]]
BB2
3   v6 = binaryop(add) v7:#Traced with:  , v8    A.py [2:8] -> [2:32]
4   v3 = invokeFunction < PythonLoader, LCodeBody, do()LRoot; > v5,v6 @4 exception:v12    A.py [2:2] -> [2:33]
BB3

Above, v5 is the print() function, while v10 is the str() function. However, in the pointer analysis, I am seeing the following:

  [Node: <Code body of function Lscript A.py/f> Context: CallStringContext: [ script A.py.do()LRoot;@97 ], v5] ->
  [Node: <Code body of function Lscript A.py/f> Context: CallStringContext: [ script A.py.do()LRoot;@97 ], v10] ->
     [com.ibm.wala.cast.python.ipa.summaries.BuiltinFunctions$BuiltinFunction@1724e9f5]

The points-to set for v5 in f() of A.py is empty, while the points-to set of v10 contains the built-in.
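For reference, both names are ordinary built-ins at runtime, so there is no obvious language-level reason for the analysis to treat them differently. The following is a small sanity check in plain Python; the helper name `is_builtin_function` is made up for illustration and has nothing to do with WALA:

```python
import builtins


def is_builtin_function(name: str) -> bool:
    """Return True if `name` is a callable defined in Python's builtins module."""
    obj = getattr(builtins, name, None)
    return callable(obj)


# Both names used in f() come from the builtins module.
for name in ("print", "str"):
    print(name, is_builtin_function(name))
```

Both lookups report `True`, which suggests the asymmetry in the points-to sets comes from the analysis's summaries rather than from the input program.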

Regression

I believe that print() needs a built-in function summary, added alongside the existing ones here:

https://github.com/wala/ML/blob/1b1ffac127c0c8f48a11d2c661b71450c9d60ce9/com.ibm.wala.cast.python/source/com/ibm/wala/cast/python/ipa/summaries/BuiltinFunctions.java#L274-L289
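To illustrate the suspected cause, here is a toy model in Python of how a name-to-summary table could produce the points-to sets observed above. It is only a sketch under assumptions: `make_summaries` and `points_to` are hypothetical helpers, the table contents are guesses based on the output in this issue, and none of this mirrors the actual code in BuiltinFunctions.java.

```python
# Toy model only: these names are hypothetical and are not WALA's API.
from typing import Dict, Iterable, Set


def make_summaries(names: Iterable[str]) -> Dict[str, str]:
    """Map each summarized built-in name to a stand-in summary object."""
    return {name: f"BuiltinFunction<{name}>" for name in names}


def points_to(lexical_name: str, summaries: Dict[str, str]) -> Set[str]:
    """Points-to set of a variable defined by a lexical read of `lexical_name`."""
    obj = summaries.get(lexical_name)
    return {obj} if obj is not None else set()


# Assumed state matching the report: str is summarized, print is not.
before = make_summaries(["str"])
# Assumed state after the proposed fix: print is summarized as well.
after = make_summaries(["str", "print"])

print("v5 before:", points_to("print", before))   # empty, as in the report
print("v10 before:", points_to("str", before))
print("v5 after:", points_to("print", after))
```

In this model, a name without a table entry resolves to an empty points-to set, which matches what is reported for v5; adding the entry gives v5 a summary object just like v10.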