ponder-lab / Optimize-Java-8-Streams-Refactoring

Refactorings for optimizing Java 8 stream client code for greater parallelism and efficiency.
http://cuny.is/streams
Eclipse Public License 1.0
8 stars 7 forks source link

False positive regarding missing terminal operation when a stream is returned from a method #103

Closed khatchad closed 6 years ago

khatchad commented 6 years ago

The tool erroneously reports that a stream lacks a terminal operation when one exists. The situation is where the stream is returned from a method. The caller invokes a terminal operation on the returned stream.

-- Error Log from JUnit --
Class: edu.cuny.hunter.streamrefactoring.ui.tests.ConvertStreamToParallelRefactoringTest
Method: testNonInternalAPI
Actual: null
Expected: null
Stack Trace:
junit.framework.AssertionFailedError: expected:<[15]> but was:<[13]>
    at junit.framework.Assert.fail(Assert.java:57)
    at junit.framework.Assert.failNotEquals(Assert.java:329)
    at junit.framework.Assert.assertEquals(Assert.java:78)
    at junit.framework.Assert.assertEquals(Assert.java:86)
    at junit.framework.TestCase.assertEquals(TestCase.java:253)
    at edu.cuny.hunter.streamrefactoring.ui.tests.ConvertStreamToParallelRefactoringTest.helper(ConvertStreamToParallelRefactoringTest.java:302)
    at edu.cuny.hunter.streamrefactoring.ui.tests.ConvertStreamToParallelRefactoringTest.testNonInternalAPI(ConvertStreamToParallelRefactoringTest.java:431)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at junit.framework.TestCase.runTest(TestCase.java:176)
    at junit.framework.TestCase.runBare(TestCase.java:141)
    at junit.framework.TestResult$1.protect(TestResult.java:122)
    at junit.framework.TestResult.runProtected(TestResult.java:142)
    at junit.framework.TestResult.run(TestResult.java:125)
    at junit.framework.TestCase.run(TestCase.java:129)
    at junit.extensions.TestDecorator.basicRun(TestDecorator.java:23)
    at junit.extensions.TestSetup$1.protect(TestSetup.java:23)
    at junit.framework.TestResult.runProtected(TestResult.java:142)
    at junit.extensions.TestSetup.run(TestSetup.java:27)
    at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestReference.run(JUnit3TestReference.java:121)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:539)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:761)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:461)
    at org.eclipse.pde.internal.junit.runtime.RemotePluginTestRunner.main(RemotePluginTestRunner.java:181)
    at org.eclipse.pde.internal.junit.runtime.PlatformUITestHarness.lambda$0(PlatformUITestHarness.java:43)
    at org.eclipse.swt.widgets.RunnableLock.run(RunnableLock.java:37)
    at org.eclipse.swt.widgets.Synchronizer.runAsyncMessages(Synchronizer.java:182)
    at org.eclipse.swt.widgets.Display.runAsyncMessages(Display.java:4497)
    at org.eclipse.swt.widgets.Display.readAndDispatch(Display.java:4110)
    at org.eclipse.e4.ui.internal.workbench.swt.PartRenderingEngine$5.run(PartRenderingEngine.java:1150)
    at org.eclipse.core.databinding.observable.Realm.runWithDefault(Realm.java:336)
    at org.eclipse.e4.ui.internal.workbench.swt.PartRenderingEngine.run(PartRenderingEngine.java:1039)
    at org.eclipse.e4.ui.internal.workbench.E4Workbench.createAndRunUI(E4Workbench.java:153)
    at org.eclipse.ui.internal.Workbench.lambda$3(Workbench.java:680)
    at org.eclipse.core.databinding.observable.Realm.runWithDefault(Realm.java:336)
    at org.eclipse.ui.internal.Workbench.createAndRunWorkbench(Workbench.java:594)
    at org.eclipse.ui.PlatformUI.createAndRunWorkbench(PlatformUI.java:148)
    at org.eclipse.ui.internal.ide.application.IDEApplication.start(IDEApplication.java:151)
    at org.eclipse.pde.internal.junit.runtime.NonUIThreadTestApplication.runApp(NonUIThreadTestApplication.java:52)
    at org.eclipse.pde.internal.junit.runtime.UITestApplication.runApp(UITestApplication.java:43)
    at org.eclipse.pde.internal.junit.runtime.NonUIThreadTestApplication.start(NonUIThreadTestApplication.java:46)
    at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:196)
    at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:134)
    at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:104)
    at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:388)
    at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:243)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:653)
    at org.eclipse.equinox.launcher.Main.basicRun(Main.java:590)
    at org.eclipse.equinox.launcher.Main.run(Main.java:1499)
    at org.eclipse.equinox.launcher.Main.main(Main.java:1472)
khatchad commented 6 years ago

This may have something to do with setting the entry point method. Currently, it's set to the enclosing method, i.e., the method obtaining a stream from the Stream API. I'm unsure if that is correct.

khatchad commented 6 years ago

Another option would be to stick with the intraprocedural analysis (limit it only to the current method) but somehow not fail when the stream "escapes" the current method. Escape analysis may be useful. WALA also supports (seemingly) escape analysis (http://wala.sourceforge.net/javadocs/trunk/com/ibm/wala/escape/package-summary.html).

khatchad commented 6 years ago

Doesn't seem related to the lack of a correct entry point.

khatchad commented 6 years ago

Related to #100

khatchad commented 6 years ago

Currently, we cannot tell when either there is no terminal operation or the method just ends. It might have something to do with just analyzing the declaring method for terminal operations. We probably have to be smarter.

In fact, I am wondering if other attributes, like execution mode, span multiple methods.

If we must lose track on the stream once it leaves the method, we can use escape analysis to figure out when that happens and fail with a special precondition. That way, we can distinguish the situations.

yiming-tang-cs commented 6 years ago

We have 2 options to fix this problem.

Op1. Fix it and let the test cases pass. This is my aim. I divide my tasks into 2 parts. Firstly, I use Escape analysis to find which method has a reference that will escape. Then, if I find the method, I let the project ignore the process of requiring terminal operations.

OP2. Fail the test cases with a suitable reason. The users should know the failing reason is that a stream is returned from a method instead of requiring a terminal operation. We need to use Escape analysis to check whether the method has a reference that will escape and if yes, throw an exception.

khatchad commented 6 years ago
khatchad commented 6 years ago

As discussed, the implementation details of op1 below are not yet know. We need to investigate the issue more (see the checkboxes).

On Thu, 2017-11-30 at 16:24 +0000, Grace Tang wrote:

We have 2 options to fix this problem.

Op1. Fix it and let the test cases pass. This is my aim. I divide my tasks into 2 parts. Firstly, I use Escape analysis to find which method has a reference that will escape. Then, if I find the method, I let the project ignore the process of requiring terminal operations.

OP2. Fail the test cases with a suitable reason. The users should know the failing reason is that a stream is returned from a method instead of requiring a terminal operation. We need to use Escape analysis to check whether the method has a reference that will escape and if yes, throw an exception.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://github.com/ponder-lab/Java-8-Stream-Refactoring/issues/103#issuecomment-348240380, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AB9DP8dy5N8uugZ0kMj4F60bu5p7_5ziks5s7taKgaJpZM4QeCIa.

khatchad commented 6 years ago

img_20171130_122031146

khatchad commented 6 years ago

@saledouble When do you expect to have the test cases ready (see https://github.com/ponder-lab/Java-8-Stream-Refactoring/issues/103#issuecomment-348246469)?

yiming-tang-cs commented 6 years ago

Did you mean "Make test cases exploring several combinations of terminal and intermediate operations. "?

khatchad commented 6 years ago

Yes. Basically, is the problem only for terminal operations or do intermediate operations also have the same problem?

yiming-tang-cs commented 6 years ago

I can give you today. Please give me some time to finish the test case for no Entry Point now.

khatchad commented 6 years ago

Test case for what? What is no entry point? You mean #119? If so, please get to that one only if you have time. It's a lower priority.

yiming-tang-cs commented 6 years ago

As for the test case testNonInternalAPI, how many instance of Stream should be created? It seems that there is one instance is created now and its enclosingMethodDeclaration is m() where the stream is created.

khatchad commented 6 years ago

In general intermediate operations return new streams (see https://docs.oracle.com/javase/8/docs/api/java/util/stream/package-summary.html#StreamOps). But, some may return the same stream with modifications, e.g., see https://docs.oracle.com/javase/8/docs/api/java/util/stream/BaseStream.html#parallel--.

On Fri, 2017-12-01 at 18:52 +0000, Grace Tang wrote:

As for the test case testNonInternalAPI, how many instance of Stream should be created? It seems that there is one instance is created now and its enclosingMethodDeclaration is m() where the stream is created.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://github.com/ponder-lab/Java-8-Stream-Refactoring/issues/103#issuecomment-348577644, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AB9DP9zfR096ILdXpmaySWVi25BsIkbgks5s8EsJgaJpZM4QeCIa.

khatchad commented 6 years ago

Again, please be specific. Do you mean instances of edu.cuny.hunter.streamrefactoring.core.analysis.Stream?

On Fri, 2017-12-01 at 18:52 +0000, Grace Tang wrote:

As for the test case testNonInternalAPI, how many instance of Stream should be created? It seems that there is one instance is created now and its enclosingMethodDeclaration is m() where the stream is created.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://github.com/ponder-lab/Java-8-Stream-Refactoring/issues/103#issuecomment-348577644, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AB9DP9zfR096ILdXpmaySWVi25BsIkbgks5s8EsJgaJpZM4QeCIa.

--

Raffi Khatchadourian Assistant Professor, Computer Science, Hunter College Doctoral Faculty of the Graduate School and University Center's PhD Program in Computer Science City University of New York 695 Park Avenue Room 1090-H, Hunter North Building New York, NY 10065 P: (212) 650-3988 F: (212) 772-5219 raffi.khatchadourian@hunter.cuny.edumailto:raffi.khatchadourian@hunter.cuny.edu http://cuny.is/khatchad

yiming-tang-cs commented 6 years ago

Again, please be specific. Do you mean instances of edu.cuny.hunter.streamrefactoring.core.analysis.Stream?

Yes. Sorry.

khatchad commented 6 years ago

My guess is that the problem has something to do with this: https://github.com/ponder-lab/Java-8-Stream-Refactoring/blob/25f026c52309d267cc21fce66bfffe99c45153c6/edu.cuny.hunter.streamrefactoring.core/src/edu/cuny/hunter/streamrefactoring/core/analysis/StreamStateMachine.java#L280-L284

Perhaps we need to be examining terminal operations that are outside the enclosing method?

yiming-tang-cs commented 6 years ago

getEnclosingMethodNode(); can return the method node for the method where the stream is created.

The example:

class A {

    Stream<Object> m() {
        Stream<Object> stream = new ArrayList<>().parallelStream().sorted();
        return stream;
    }

    @EntryPoint
    void n() {
        Stream<Object> s = m();
        s.distinct().count();
    }
}

The result : image

Perhaps we need to be examining terminal operations that are outside the enclosing method?

Yes. I think not only for terminal operations, but also for terminal operations and intermediate operations outside the enclosing method.

khatchad commented 6 years ago

One thing we could do is instead of just examining call sites in the enclosing method node, we could examine call sites for all nodes in the graph.

khatchad commented 6 years ago

Or we can be smarter about it and use the predecessors and successors of the enclosing method node. But, the problem there is that the stream may be stored as a field.

khatchad commented 6 years ago

Yes but the code in question is specifically dealing with terminal operations. I am wondering if fixing just the terminal operations will suffice.

Raffi Khatchadourian Assistant Professor, Computer Science, Hunter College Doctoral Faculty of the Graduate School and University Center's PhD Program in Computer Science City University of New York 695 Park Avenue Room HN 1090-H New York, NY 10065 P: (212) 650-3988 F: (212) 772-5219 raffi.khatchadourian@hunter.cuny.edu http://cuny.is/khatchad

On Dec 1, 2017 6:29 PM, Grace Tang notifications@github.com wrote:

getEnclosingMethodNode(); can return the method node for the method where the stream is created.

The example:

class A {

    Stream<Object> m() {
            Stream<Object> stream = new ArrayList<>().parallelStream().sorted();
            return stream;
    }

    @EntryPoint
    void n() {
            Stream<Object> s = m();
            s.distinct().count();
    }

}

The result : [image]https://user-images.githubusercontent.com/10117031/33507881-455ae3d8-d6c5-11e7-8cdf-b5fd74c8569b.png

Perhaps we need to be examining terminal operations that are outside the enclosing method?

Yes. I think not only for terminal operations, but also for terminal operations and intermediate operations outside the enclosing method.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://github.com/ponder-lab/Java-8-Stream-Refactoring/issues/103#issuecomment-348641620, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AB9DP1uvk1SapJOYkspKeLnVTmF_gqjdks5s8IvNgaJpZM4QeCIa.

yiming-tang-cs commented 6 years ago

Sorry, I was struggling in homework in this weekend.

One thing we could do is instead of just examining call sites in the enclosing method node, we could examine call sites for all nodes in the graph.

This one can work. I added the call sites in the method with terminal operations and the terminal operations can be found. I need to add all to see what happens.

khatchad commented 6 years ago

It is begining to occur to me that perhaps we have not covered call graphs adequately. Please read the following background links:

  1. https://en.wikipedia.org/wiki/Call_graph
  2. https://github.com/wala/WALA/wiki/Call-Graph
khatchad commented 6 years ago

Still need perf improvements (slicing).

khatchad commented 6 years ago

Looks like we have currently added at least 30 minutes to the tests: master Merge branch 'saledouble-issue_103_new' Raffi Khatchadourian avatar Raffi Khatchadourian

196 passed

6b0061c 1 hr 20 min 50 sec 3 days ago

master Merge pull request #123 from ponder-lab/issue_120 Raffi Khatchadourian avatar Raffi Khatchadourian

187 passed

25f026c 43 min 57 sec 18 days ago

khatchad commented 6 years ago

Though, we did add a few more tests.

yiming-tang-cs commented 6 years ago

Still need perf improvements (slicing).

Sorry for the late reply. The code below may work.

if (node.getMethod() instanceof SyntheticMethod || node.getMethod() instanceof ShrikeCTMethod) 
yiming-tang-cs commented 6 years ago

Sorry, I should look into slicing API as you mentioned in Asana. I read some descriptions of slicing API and find that doling slicing is more complicated than what I wrote above. As such, why do you choose scling API?

khatchad commented 6 years ago

How does that code help us?

khatchad commented 6 years ago
  1. Why do you think it is complicated?
  2. Can you share the link to the documentation you read?
  3. Why stray away from something just because it is complicated? Do you have a better solution?
  4. We should try to use the slicing API because we need to slice the graph.
yiming-tang-cs commented 6 years ago

Why do you think it is complicated? Can you share the link to the documentation you read?

http://wala.sourceforge.net/wiki/index.php/UserGuide:Slicer . I was reading the code example proved in the link. The method about slice has several operations and additional space to store the collection of statement but the code I wrote above just has one line and no extra space. That is why I think it is more complicated than the code I proved above.

yiming-tang-cs commented 6 years ago

Do you have a better solution?

Can we add

if (node.getMethod() instanceof SyntheticMethod || node.getMethod() instanceof ShrikeCTMethod) 

at the beginning of iteration for cgNodes?

yiming-tang-cs commented 6 years ago

If yes, I skip the loop. If no, I do nothing.

yiming-tang-cs commented 6 years ago

Why stray away from something just because it is complicated?

I mean it is more complicated than the code I wrote. It may not complicated.

khatchad commented 6 years ago

Please use the wiki on the GitHub site. The sourceforge one may be old.

On Tue, 2017-12-19 at 16:50 +0000, Grace Tang wrote:

Why do you think it is complicated? Can you share the link to the documentation you read?

http://wala.sourceforge.net/wiki/index.php/UserGuide:Slicer . I was reading the code example proved in the link. The method about slice has several operations and additional space to store the collection of statement but the code I wrote above just has one line and no extra space. That is why I think it is more complicated than the code I proved above.

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHubhttps://github.com/ponder-lab/Java-8-Stream-Refactoring/issues/103#issuecomment-352817955, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AB9DP8e1-MGBKxjvBauQvj6z5NXHyoljks5tB-lJgaJpZM4QeCIa.

khatchad commented 6 years ago

But you did not provide an explanation on how the code you wrote helps with our problem.

On Tue, 2017-12-19 at 16:50 +0000, Grace Tang wrote:

Why do you think it is complicated? Can you share the link to the documentation you read?

http://wala.sourceforge.net/wiki/index.php/UserGuide:Slicer . I was reading the code example proved in the link. The method about slice has several operations and additional space to store the collection of statement but the code I wrote above just has one line and no extra space. That is why I think it is more complicated than the code I proved above.

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHubhttps://github.com/ponder-lab/Java-8-Stream-Refactoring/issues/103#issuecomment-352817955, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AB9DP8e1-MGBKxjvBauQvj6z5NXHyoljks5tB-lJgaJpZM4QeCIa.

khatchad commented 6 years ago

Please list the GitHub wiki link here. Thanks.

On Tue, 2017-12-19 at 16:50 +0000, Grace Tang wrote:

Why do you think it is complicated? Can you share the link to the documentation you read?

http://wala.sourceforge.net/wiki/index.php/UserGuide:Slicer . I was reading the code example proved in the link. The method about slice has several operations and additional space to store the collection of statement but the code I wrote above just has one line and no extra space. That is why I think it is more complicated than the code I proved above.

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHubhttps://github.com/ponder-lab/Java-8-Stream-Refactoring/issues/103#issuecomment-352817955, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AB9DP8e1-MGBKxjvBauQvj6z5NXHyoljks5tB-lJgaJpZM4QeCIa.

khatchad commented 6 years ago

How does this help?

On Tue, 2017-12-19 at 08:51 -0800, Grace Tang wrote:

Do you have a better solution?

Can we add

if (node.getMethod() instanceof SyntheticMethod || node.getMethod() instanceof ShrikeCTMethod)

at the beginning of iteration for cgNodes?

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHubhttps://github.com/ponder-lab/Java-8-Stream-Refactoring/issues/103#issuecomment-352818367, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AB9DP4WIk3OtQ8v4jLYVQU8ys8qX1Fukks5tB-mWgaJpZM4QeCIa.

khatchad commented 6 years ago

But what does it mean when the answer is "yes" and what does it mean when the answer is "no?"

On Tue, 2017-12-19 at 16:52 +0000, Grace Tang wrote:

If yes, I skip the loop. If no, I do nothing.

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHubhttps://github.com/ponder-lab/Java-8-Stream-Refactoring/issues/103#issuecomment-352818880, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AB9DP_RFQQMKpyrc83YvxbvKUoDRWBjsks5tB-nqgaJpZM4QeCIa.

yiming-tang-cs commented 6 years ago

https://github.com/wala/WALA/wiki/Slicer

khatchad commented 6 years ago

Perfect!

On Tue, 2017-12-19 at 09:47 -0800, Grace Tang wrote:

https://github.com/wala/WALA/wiki/Slicer

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHubhttps://github.com/ponder-lab/Java-8-Stream-Refactoring/issues/103#issuecomment-352834407, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AB9DPyLwaV70Jn6nyKJb1vcCUfczJVdrks5tB_bCgaJpZM4QeCIa.

khatchad commented 6 years ago

So, since we are only slicing the graph according to client and library code, you're right, we are not going to need all of the features of this API right now.

yiming-tang-cs commented 6 years ago

So, since we are only slicing the graph according to client and library code, you're right, we are not going to need all of the features of this API right now.

We still need this API?

khatchad commented 6 years ago

@saledouble I like your initial idea in https://github.com/ponder-lab/Java-8-Stream-Refactoring/issues/103#issuecomment-352618358, however, I cannot understand your condition. To me, the condition doesn't say anything about client and library code, but perhaps I am missing an explanation.

khatchad commented 6 years ago

Basically, for now, we can just filter the nodes we consider based on their package names. In this case, you want to explore the API of the call graph node to see if there is a way to get the package name of the declaring class of the method it represents. Make sense?

khatchad commented 6 years ago

We still need this API?

You need to use the call graph API for now.

khatchad commented 6 years ago

This API http://wala.sourceforge.net/javadocs/trunk/com/ibm/wala/ipa/callgraph/CGNode.html#getMethod() looks interesting.

On Tue, 2017-12-19 at 10:03 -0800, Grace Tang wrote:

So, since we are only slicing the graph according to client and library code, you're right, we are not going to need all of the features of this API right now.

We still need this API?

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHubhttps://github.com/ponder-lab/Java-8-Stream-Refactoring/issues/103#issuecomment-352838779, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AB9DP2lc9kc8OQ0vtD-oDqoclJcQMgrNks5tB_qJgaJpZM4QeCIa.

yiming-tang-cs commented 6 years ago

I find the condition from: https://github.com/wala/WALA/blob/d29e278291dc0b93db6ef128291c1d19cf6c8079/com.ibm.wala.core/src/com/ibm/wala/ipa/callgraph/propagation/cfa/FallbackContextInterpreter.java#L63-L74

I searched it last night, but I did not find to many information about this. I find a book Aliasing in Object-Oriented Programming mentioned the synthetic method and how it works. " Note that an entrypoint method may rely on initialization being performed before it is invoked, e.g., the creation of the String[] array parameter for a main method. In WALA [69], such behavior is modeled in a synthetic “fake root method” that serves as a single root for the call graph and contains invocations of the real entrypoints."

There are more than 9000 cgNodes in our project, so I cannot check them one by one. But most of them have ShrikeCTMethod.