wala / ML

Eclipse Public License 2.0
25 stars 17 forks source link

Losing tensors in lists #136

Open khatchad opened 9 months ago

khatchad commented 9 months ago

Related to https://github.com/wala/ML/issues/89.

We actually mention this problem in #89:

import tensorflow as tf

def add(a, b):
  return a + b

list = list()

list.append(tf.ones([1, 2]))
list.append(tf.ones([2, 2]))

for element in list:
    c = add(element, element)

Gets us:

Oct 11, 2023 2:51:26 PM com.ibm.wala.cast.python.ml.client.PythonTensorAnalysisEngine getDataflowSources
INFO: Added dataflow source [Node: synthetic < PythonLoader, Ltensorflow/functions/ones, do()LRoot; > Context: CallStringContext: [ script tf2_test_tensor_list3.py.do()LRoot;@106 ], v5]:[Empty].
Oct 11, 2023 2:51:26 PM com.ibm.wala.cast.python.ml.client.PythonTensorAnalysisEngine getDataflowSources
INFO: Added dataflow source [Node: synthetic < PythonLoader, Ltensorflow/functions/ones, do()LRoot; > Context: CallStringContext: [ script tf2_test_tensor_list3.py.do()LRoot;@99 ], v5]:[Empty].
Oct 11, 2023 2:51:26 PM com.ibm.wala.cast.python.ml.test.TestTensorflowModel testTf2
INFO: Tensor analysis: answer:
[Node: <Code body of function Lscript tf2_test_tensor_list3.py> Context: CallStringContext: [ com.ibm.wala.FakeRootClass.fakeRootMethod()V@2 ], v264][{[D:Symbolic,n, D:Compound,[D:Constant,28, D:Constant,28]] of pixel}]
[Ret-V:Node: synthetic < PythonLoader, Ltensorflow/functions/ones, do()LRoot; > Context: CallStringContext: [ script tf2_test_tensor_list3.py.do()LRoot;@106 ]][{[D:Symbolic,n, D:Compound,[D:Constant,28, D:Constant,28]] of pixel}]
[Node: <Code body of function Lscript tf2_test_tensor_list3.py> Context: CallStringContext: [ com.ibm.wala.FakeRootClass.fakeRootMethod()V@2 ], v252][{[D:Symbolic,n, D:Compound,[D:Constant,28, D:Constant,28]] of pixel}]
[Ret-V:Node: synthetic < PythonLoader, Ltensorflow/functions/ones, do()LRoot; > Context: CallStringContext: [ script tf2_test_tensor_list3.py.do()LRoot;@99 ]][{[D:Symbolic,n, D:Compound,[D:Constant,28, D:Constant,28]] of pixel}]
[Node: synthetic < PythonLoader, Ltensorflow/functions/ones, do()LRoot; > Context: CallStringContext: [ script tf2_test_tensor_list3.py.do()LRoot;@106 ], v5][{[D:Symbolic,n, D:Compound,[D:Constant,28, D:Constant,28]] of pixel}]
[Node: synthetic < PythonLoader, Ltensorflow/functions/ones, do()LRoot; > Context: CallStringContext: [ script tf2_test_tensor_list3.py.do()LRoot;@99 ], v5][{[D:Symbolic,n, D:Compound,[D:Constant,28, D:Constant,28]] of pixel}]

While we could employ a similar fix for #89, I don't think it's ideal. First, a list is not a dataset. Second, I am seeing examples where @tf.function is used on functions that take lists of tensors as arguments, which makes sense because of the trace type:

https://github.com/aymericdamien/TensorFlow-Examples/blob/6dcbe14649163814e72a22a999f20c5e247ce988/tensorflow_v2/notebooks/6_Hardware/multigpu_training.ipynb#L204-L218

khatchad commented 9 months ago

I don't know if the trace type of datasets would have it make sense to pass them as arguments to hybrid functions, but I haven't seen this done.