wala / ML

Eclipse Public License 2.0

Supporting Python 3 causes tensor dimensions not to be calculated #42

Open khatchad opened 1 year ago

khatchad commented 1 year ago

Replacing com.ibm.wala.cast.python.jython.test with com.ibm.wala.cast.python.jython3.test as a dependency in com.ibm.wala.cast.python.ml.test/pom.xml causes com.ibm.wala.cast.python.ml.test.TestNeuroImageExamples.testEx1CG to fail. Specifically, tensor dimensions aren't calculated because Jython 3 does not perform a constant propagation that Jython performs. Found during https://github.com/ponder-lab/ML/issues/4#issuecomment-1558310839. See https://github.com/ponder-lab/ML/issues/4#issuecomment-1559961372 for more details.

khatchad commented 1 year ago

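Looks like constant propagation isn't working for Jython 3. The argument to set_shape stays an unevaluated 40 * 40 * 40 expression, whereas Jython folds it to the constant 64000: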

Test output when using Jython 3

no exceptions for CALL
  OBJECT_REF
    VAR
      "image"
    "set_shape"
  EMPTY
  OBJECT_LITERAL
    NEW
      "list"
    "0"
    BINARY_EXPR
      "*"
      BINARY_EXPR
        "*"
        "40"
        "40"
      "40"

Test output when using Jython

no exceptions for CALL
  OBJECT_REF
    VAR
      "image"
    "set_shape"
  EMPTY
  OBJECT_LITERAL
    NEW
      "list"
    "0"
    "64000"
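
The difference between the two dumps can be illustrated with a minimal constant-folding sketch. It is not WALA's implementation: judging by the stack traces in this thread, WALA's ConstantFoldingRewriter delegates evaluation to a Jython interpreter, while this sketch assumes a hypothetical tuple-based tree and evaluates with Python's operator module.

```python
import operator

# Hypothetical stand-in for the printed CAst nodes (an assumption, not WALA's API):
# a node is either a plain int constant or ("BINARY_EXPR", op, left, right).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold(node):
    """Recursively replace BINARY_EXPR subtrees over constants with their value."""
    if not (isinstance(node, tuple) and node[0] == "BINARY_EXPR"):
        return node  # constants and other node kinds pass through unchanged
    _, op, left, right = node
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int) and op in OPS:
        return OPS[op](left, right)
    # At least one operand is not a constant, so keep the expression node.
    return ("BINARY_EXPR", op, left, right)


# The unfolded tree from the Jython 3 run: (40 * 40) * 40
unfolded = ("BINARY_EXPR", "*", ("BINARY_EXPR", "*", 40, 40), 40)
folded = fold(unfolded)  # the single constant 64000, as in the Jython run
```

When the evaluation step is unavailable, the tree keeps its BINARY_EXPR nodes, which matches the Jython 3 output.
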

khatchad commented 1 year ago

Attaching the complete output files: jython.txt and jython3.txt.

khatchad commented 1 year ago

Looks like Jython 3 fails to initialize its interpreter, so the constant folder has nothing to evaluate expressions with. I'm seeing these two stack traces:

java.io.FileNotFoundException: src/resources/frozen_importlib/_frozen_importlib.class (No such file or directory)
    at java.base/java.io.FileInputStream.open0(Native Method)
    at java.base/java.io.FileInputStream.open(FileInputStream.java:216)
    at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
    at org.python.core.PySystemState.doInitialize(PySystemState.java:1152)
    at org.python.core.PySystemState.initialize(PySystemState.java:1007)
    at org.python.core.PySystemState.initialize(PySystemState.java:962)
    at org.python.core.PySystemState.initialize(PySystemState.java:957)
    at org.python.core.PySystemState.initialize(PySystemState.java:952)
    at org.python.core.PySystemState.initialize(PySystemState.java:948)
    at org.python.core.ThreadStateMapping.getThreadState(ThreadStateMapping.java:32)
    at org.python.core.Py.getThreadState(Py.java:1793)
    at org.python.core.Py.getThreadState(Py.java:1789)
    at org.python.core.Py.getSystemState(Py.java:1809)
    at org.python.util.PythonInterpreter.<init>(PythonInterpreter.java:103)
    at org.python.util.PythonInterpreter.<init>(PythonInterpreter.java:92)
    at org.python.util.PythonInterpreter.<init>(PythonInterpreter.java:69)
    at com.ibm.wala.cast.python.util.Python3Interpreter.getInterp(Python3Interpreter.java:14)
    at com.ibm.wala.cast.python.loader.Python3Loader$4$1.eval(Python3Loader.java:78)
    at com.ibm.wala.cast.ir.translator.ConstantFoldingRewriter.copyNodes(ConstantFoldingRewriter.java:44)
    at com.ibm.wala.cast.ir.translator.ConstantFoldingRewriter.copyNodes(ConstantFoldingRewriter.java:39)
    at com.ibm.wala.cast.ir.translator.ConstantFoldingRewriter.copyNodes(ConstantFoldingRewriter.java:23)
    at com.ibm.wala.cast.tree.rewrite.CAstRewriter.copyChildrenArray(CAstRewriter.java:142)
    at com.ibm.wala.cast.tree.rewrite.CAstRewriter.copyChildrenArrayAndTargets(CAstRewriter.java:149)
    at com.ibm.wala.cast.ir.translator.ConstantFoldingRewriter.copyNodes(ConstantFoldingRewriter.java:77)
    at com.ibm.wala.cast.ir.translator.ConstantFoldingRewriter.copyNodes(ConstantFoldingRewriter.java:23)
    at com.ibm.wala.cast.tree.rewrite.CAstRewriter.copyChildrenArray(CAstRewriter.java:142)
    at com.ibm.wala.cast.tree.rewrite.CAstRewriter.copyChildrenArrayAndTargets(CAstRewriter.java:149)
    at com.ibm.wala.cast.ir.translator.ConstantFoldingRewriter.copyNodes(ConstantFoldingRewriter.java:77)
    at com.ibm.wala.cast.ir.translator.ConstantFoldingRewriter.copyNodes(ConstantFoldingRewriter.java:23)
    at com.ibm.wala.cast.tree.rewrite.CAstRewriter.copyChildrenArray(CAstRewriter.java:142)
    at com.ibm.wala.cast.tree.rewrite.CAstRewriter.copyChildrenArrayAndTargets(CAstRewriter.java:149)
    at com.ibm.wala.cast.ir.translator.ConstantFoldingRewriter.copyNodes(ConstantFoldingRewriter.java:77)
    at com.ibm.wala.cast.ir.translator.ConstantFoldingRewriter.copyNodes(ConstantFoldingRewriter.java:23)
    at com.ibm.wala.cast.tree.rewrite.CAstRewriter.rewrite(CAstRewriter.java:396)
    at com.ibm.wala.cast.tree.rewrite.CAstRewriter.rewrite(CAstRewriter.java:454)
    at com.ibm.wala.cast.tree.rewrite.CAstRewriter.copyChildren(CAstRewriter.java:368)
    at com.ibm.wala.cast.tree.rewrite.CAstRewriter$1.newChildren(CAstRewriter.java:441)
    at com.ibm.wala.cast.tree.rewrite.CAstRewriter$2.getAllScopedEntities(CAstRewriter.java:480)
    at com.ibm.wala.cast.tree.rewrite.CAstRewriter$2.getScopedEntities(CAstRewriter.java:470)
    at com.ibm.wala.cast.tree.visit.CAstVisitor.visit(CAstVisitor.java:983)
    at com.ibm.wala.cast.tree.visit.CAstVisitor.visit(CAstVisitor.java:673)
    at com.ibm.wala.cast.tree.visit.CAstVisitor.visitChildren(CAstVisitor.java:488)
    at com.ibm.wala.cast.tree.visit.CAstVisitor.visitAllChildren(CAstVisitor.java:497)
    at com.ibm.wala.cast.tree.visit.CAstVisitor.visit(CAstVisitor.java:568)
    at com.ibm.wala.cast.tree.visit.CAstVisitor.visitEntities(CAstVisitor.java:240)
    at com.ibm.wala.cast.ir.translator.ExposedNamesCollector.run(ExposedNamesCollector.java:73)
    at com.ibm.wala.cast.ir.translator.AstTranslator.translate(AstTranslator.java:5522)
    at com.ibm.wala.cast.ir.translator.AstTranslator.translate(AstTranslator.java:5517)
    at com.ibm.wala.cast.loader.CAstAbstractModuleLoader.init(CAstAbstractModuleLoader.java:129)
    at com.ibm.wala.cast.python.loader.PythonLoader.init(PythonLoader.java:58)
    at com.ibm.wala.cast.loader.SingleClassLoaderFactory.getLoader(SingleClassLoaderFactory.java:39)
    at com.ibm.wala.ipa.cha.ClassHierarchy.<init>(ClassHierarchy.java:270)
    at com.ibm.wala.ipa.cha.ClassHierarchy.<init>(ClassHierarchy.java:203)
    at com.ibm.wala.ipa.cha.SeqClassHierarchyFactory.make(SeqClassHierarchyFactory.java:52)
    at com.ibm.wala.cast.python.client.PythonAnalysisEngine.buildClassHierarchy(PythonAnalysisEngine.java:131)
    at com.ibm.wala.client.AbstractAnalysisEngine.defaultCallGraphBuilder(AbstractAnalysisEngine.java:278)
    at com.ibm.wala.cast.python.ml.test.TestPythonMLCallGraphShape.checkTensorOps(TestPythonMLCallGraphShape.java:100)
    at com.ibm.wala.cast.python.ml.test.TestNeuroImageExamples.testEx1CG(TestNeuroImageExamples.java:20)
    at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
    at java.base/java.lang.reflect.Method.invoke(Method.java:577)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:93)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:40)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:529)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:756)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:452)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:210)

java.lang.NullPointerException: Cannot invoke "org.python.core.PyObject.invoke(String, org.python.core.PyObject, org.python.core.PyObject)" because "sys.importlib" is null
    at org.python.core.imp.import_next(imp.java:735)
    at org.python.core.imp.import_first(imp.java:770)
    at org.python.core.imp.load(imp.java:616)
    at org.python.core.Py.importSiteIfSelected(Py.java:1922)
    at org.python.util.PythonInterpreter.<init>(PythonInterpreter.java:114)
    at org.python.util.PythonInterpreter.<init>(PythonInterpreter.java:92)
    at org.python.util.PythonInterpreter.<init>(PythonInterpreter.java:69)
    at com.ibm.wala.cast.python.util.Python3Interpreter.getInterp(Python3Interpreter.java:14)
    at com.ibm.wala.cast.python.loader.Python3Loader$4$1.eval(Python3Loader.java:78)
    at com.ibm.wala.cast.ir.translator.ConstantFoldingRewriter.copyNodes(ConstantFoldingRewriter.java:44)
    at com.ibm.wala.cast.ir.translator.ConstantFoldingRewriter.copyNodes(ConstantFoldingRewriter.java:39)
    at com.ibm.wala.cast.ir.translator.ConstantFoldingRewriter.copyNodes(ConstantFoldingRewriter.java:23)
    at com.ibm.wala.cast.tree.rewrite.CAstRewriter.copyChildrenArray(CAstRewriter.java:142)
    at com.ibm.wala.cast.tree.rewrite.CAstRewriter.copyChildrenArrayAndTargets(CAstRewriter.java:149)
    at com.ibm.wala.cast.ir.translator.ConstantFoldingRewriter.copyNodes(ConstantFoldingRewriter.java:77)
    at com.ibm.wala.cast.ir.translator.ConstantFoldingRewriter.copyNodes(ConstantFoldingRewriter.java:23)
    at com.ibm.wala.cast.tree.rewrite.CAstRewriter.copyChildrenArray(CAstRewriter.java:142)
    at com.ibm.wala.cast.tree.rewrite.CAstRewriter.copyChildrenArrayAndTargets(CAstRewriter.java:149)
    at com.ibm.wala.cast.ir.translator.ConstantFoldingRewriter.copyNodes(ConstantFoldingRewriter.java:77)
    at com.ibm.wala.cast.ir.translator.ConstantFoldingRewriter.copyNodes(ConstantFoldingRewriter.java:23)
    at com.ibm.wala.cast.tree.rewrite.CAstRewriter.copyChildrenArray(CAstRewriter.java:142)
    at com.ibm.wala.cast.tree.rewrite.CAstRewriter.copyChildrenArrayAndTargets(CAstRewriter.java:149)
    at com.ibm.wala.cast.ir.translator.ConstantFoldingRewriter.copyNodes(ConstantFoldingRewriter.java:77)
    at com.ibm.wala.cast.ir.translator.ConstantFoldingRewriter.copyNodes(ConstantFoldingRewriter.java:23)
    at com.ibm.wala.cast.tree.rewrite.CAstRewriter.rewrite(CAstRewriter.java:396)
    at com.ibm.wala.cast.tree.rewrite.CAstRewriter.rewrite(CAstRewriter.java:454)
    at com.ibm.wala.cast.tree.rewrite.CAstRewriter.copyChildren(CAstRewriter.java:368)
    at com.ibm.wala.cast.tree.rewrite.CAstRewriter$1.newChildren(CAstRewriter.java:441)
    at com.ibm.wala.cast.tree.rewrite.CAstRewriter$2.getAllScopedEntities(CAstRewriter.java:480)
    at com.ibm.wala.cast.tree.rewrite.CAstRewriter$2.getScopedEntities(CAstRewriter.java:470)
    at com.ibm.wala.cast.tree.visit.CAstVisitor.visit(CAstVisitor.java:983)
    at com.ibm.wala.cast.tree.visit.CAstVisitor.visit(CAstVisitor.java:673)
    at com.ibm.wala.cast.tree.visit.CAstVisitor.visitChildren(CAstVisitor.java:488)
    at com.ibm.wala.cast.tree.visit.CAstVisitor.visitAllChildren(CAstVisitor.java:497)
    at com.ibm.wala.cast.tree.visit.CAstVisitor.visit(CAstVisitor.java:568)
    at com.ibm.wala.cast.tree.visit.CAstVisitor.visitEntities(CAstVisitor.java:240)
    at com.ibm.wala.cast.ir.translator.ExposedNamesCollector.run(ExposedNamesCollector.java:73)
    at com.ibm.wala.cast.ir.translator.AstTranslator.translate(AstTranslator.java:5522)
    at com.ibm.wala.cast.ir.translator.AstTranslator.translate(AstTranslator.java:5517)
    at com.ibm.wala.cast.loader.CAstAbstractModuleLoader.init(CAstAbstractModuleLoader.java:129)
    at com.ibm.wala.cast.python.loader.PythonLoader.init(PythonLoader.java:58)
    at com.ibm.wala.cast.loader.SingleClassLoaderFactory.getLoader(SingleClassLoaderFactory.java:39)
    at com.ibm.wala.ipa.cha.ClassHierarchy.<init>(ClassHierarchy.java:270)
    at com.ibm.wala.ipa.cha.ClassHierarchy.<init>(ClassHierarchy.java:203)
    at com.ibm.wala.ipa.cha.SeqClassHierarchyFactory.make(SeqClassHierarchyFactory.java:52)
    at com.ibm.wala.cast.python.client.PythonAnalysisEngine.buildClassHierarchy(PythonAnalysisEngine.java:131)
    at com.ibm.wala.client.AbstractAnalysisEngine.defaultCallGraphBuilder(AbstractAnalysisEngine.java:278)
    at com.ibm.wala.cast.python.ml.test.TestPythonMLCallGraphShape.checkTensorOps(TestPythonMLCallGraphShape.java:100)
    at com.ibm.wala.cast.python.ml.test.TestNeuroImageExamples.testEx1CG(TestNeuroImageExamples.java:20)
    at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
    at java.base/java.lang.reflect.Method.invoke(Method.java:577)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:93)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:40)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:529)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:756)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:452)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:210)

khatchad commented 1 year ago

Looks like the file that the following line tries to load, _frozen_importlib.class, is missing from jython-dev.jar:

https://github.com/juliandolby/jython3/blob/d54c09336693dfee71db258ca56c7339b81b844d/src/org/python/core/PySystemState.java#L1152

It's in the repository, however: https://github.com/juliandolby/jython3/blob/master/src/resources/frozen_importlib/_frozen_importlib.class.
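
One way to double-check whether a jar contains the class file is to list its entries, since a jar is a zip archive. The sketch below uses Python's standard zipfile module. The jar location and the entry path inside it are assumptions, so the search matches on the file name suffix rather than a full path.

```python
import zipfile


def jar_contains(jar_path, entry):
    """Return True if the archive has an entry named exactly `entry`."""
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


def find_entries(jar_path, suffix):
    """Return every entry name in the archive that ends with `suffix`."""
    with zipfile.ZipFile(jar_path) as jar:
        return [name for name in jar.namelist() if name.endswith(suffix)]


# Example call (the jar path is hypothetical; adjust it to the local build):
# print(find_entries("jython-dev.jar", "_frozen_importlib.class"))
```

An empty list from find_entries would be consistent with the file being absent from the jar.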

khatchad commented 1 year ago

Related to #33 and https://github.com/ponder-lab/ML/issues/4. With #33, perhaps the issue was not an inconsistency but rather that Jython was superseding Jython 3. With https://github.com/ponder-lab/ML/issues/4, parsing f-strings requires Jython 3 (f-strings are a Python 3 feature). But Jython 3 is not yet working, hence #42. The conclusion for now is that http://github.com/wala/ML currently supports only Python 2 (see https://github.com/wala/ML/commit/b4b6e8e5d2fa378a12b3a8934b2b49f9948af905).
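
To illustrate why f-strings need a Python 3 parser, here is a small sketch using CPython's standard ast module (not Jython, and not the parser WALA uses). F-strings were added in Python 3.6, and a Python 3 parser represents them as JoinedStr nodes. The source snippet is made up for the example.

```python
import ast

# A made-up snippet containing one f-string.
source = 'name = "tensor"\nprint(f"shape of {name}")\n'

tree = ast.parse(source)

# Collect the f-string nodes found anywhere in the parsed module.
fstrings = [node for node in ast.walk(tree) if isinstance(node, ast.JoinedStr)]
```

A Python 2 grammar has no rule for the f prefix, so the same source is rejected there as a syntax error.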

Unfortunately, https://github.com/wala/ML/tree/master is in an inconsistent state at the moment: it can now parse Python 3 (including f-strings), but because the Python 3 interpreter isn't working, constant propagation doesn't run, which means that tensor shape analysis may be inaccurate.

khatchad commented 1 year ago

BTW, https://github.com/ponder-lab/ML/issues/4 is an issue in our fork. It was created there in response to a build failure, at a time when the upstream repo did not yet have a build.