eclipse-sirius / sirius-web

Sirius Web: open-source low-code platform to define custom web applications supporting your specific visual languages
https://eclipse.dev/sirius/sirius-web.html
Eclipse Public License 2.0

Some issue may prevent the disposal of some representation events processors #1911

Open sbegaudeau opened 1 year ago

sbegaudeau commented 1 year ago

While experimenting with a view-based diagram description and a project that uses it, I managed to trigger this exception:

2023-04-12 16:55:44.177  WARN 56359 --- [pool-2-thread-1] o.e.s.c.g.w.h.StartMessageHandler        : java.io.IOException: The current thread was interrupted

java.io.IOException: java.io.IOException: The current thread was interrupted
    at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.sendMessageBlock(WsRemoteEndpointImplBase.java:327) ~[tomcat-embed-websocket-9.0.68.jar:9.0.68]
    at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.sendMessageBlock(WsRemoteEndpointImplBase.java:254) ~[tomcat-embed-websocket-9.0.68.jar:9.0.68]
    at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.sendPartialString(WsRemoteEndpointImplBase.java:227) ~[tomcat-embed-websocket-9.0.68.jar:9.0.68]
    at org.apache.tomcat.websocket.WsRemoteEndpointBasic.sendText(WsRemoteEndpointBasic.java:49) ~[tomcat-embed-websocket-9.0.68.jar:9.0.68]
    at org.springframework.web.socket.adapter.standard.StandardWebSocketSession.sendTextMessage(StandardWebSocketSession.java:215) ~[spring-websocket-5.3.23.jar:5.3.23]
    at org.springframework.web.socket.adapter.AbstractWebSocketSession.sendMessage(AbstractWebSocketSession.java:106) ~[spring-websocket-5.3.23.jar:5.3.23]
    at org.eclipse.sirius.components.graphql.ws.handlers.IWebSocketMessageHandler.send(IWebSocketMessageHandler.java:38) ~[sirius-components-graphql-2023.3.2.jar:2023.3.2]
    at org.eclipse.sirius.components.graphql.ws.handlers.StartMessageHandler.lambda$subscribe$0(StartMessageHandler.java:109) ~[sirius-components-graphql-2023.3.2.jar:2023.3.2]
    at reactor.core.publisher.LambdaSubscriber.onNext(LambdaSubscriber.java:160) ~[reactor-core-3.4.24.jar:3.4.24]
    at graphql.execution.reactive.CompletionStageMappingPublisher$CompletionStageSubscriber.lambda$whenNextFinished$0(CompletionStageMappingPublisher.java:97) ~[graphql-java-18.3.jar:na]
    at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863) ~[na:na]
    at java.base/java.util.concurrent.CompletableFuture.uniWhenCompleteStage(CompletableFuture.java:887) ~[na:na]
    at java.base/java.util.concurrent.CompletableFuture.whenComplete(CompletableFuture.java:2357) ~[na:na]
    at java.base/java.util.concurrent.CompletableFuture.whenComplete(CompletableFuture.java:144) ~[na:na]
    at graphql.execution.reactive.CompletionStageMappingPublisher$CompletionStageSubscriber.onNext(CompletionStageMappingPublisher.java:85) ~[graphql-java-18.3.jar:na]
    at reactor.core.publisher.StrictSubscriber.onNext(StrictSubscriber.java:89) ~[reactor-core-3.4.24.jar:3.4.24]
    at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:129) ~[reactor-core-3.4.24.jar:3.4.24]
    at reactor.core.publisher.FluxPublishOn$PublishOnSubscriber.runAsync(FluxPublishOn.java:440) ~[reactor-core-3.4.24.jar:3.4.24]
    at reactor.core.publisher.FluxPublishOn$PublishOnSubscriber.run(FluxPublishOn.java:527) ~[reactor-core-3.4.24.jar:3.4.24]
    at reactor.core.scheduler.WorkerTask.call(WorkerTask.java:84) ~[reactor-core-3.4.24.jar:3.4.24]
    at reactor.core.scheduler.WorkerTask.call(WorkerTask.java:37) ~[reactor-core-3.4.24.jar:3.4.24]
    at org.springframework.security.concurrent.DelegatingSecurityContextCallable.call(DelegatingSecurityContextCallable.java:84) ~[spring-security-core-5.7.4.jar:5.7.4]
    at java.base/java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:317) ~[na:na]
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java) ~[na:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[na:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[na:na]
    at java.base/java.lang.Thread.run(Thread.java:1623) ~[na:na]
Caused by: java.io.IOException: The current thread was interrupted
    at org.apache.tomcat.util.net.NioChannel.checkInterruptStatus(NioChannel.java:236) ~[tomcat-embed-core-9.0.68.jar:9.0.68]
    at org.apache.tomcat.util.net.NioChannel.write(NioChannel.java:146) ~[tomcat-embed-core-9.0.68.jar:9.0.68]
    at org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper$NioOperationState.run(NioEndpoint.java:1680) ~[tomcat-embed-core-9.0.68.jar:9.0.68]
    at org.apache.tomcat.util.net.SocketWrapperBase$OperationState.start(SocketWrapperBase.java:1070) ~[tomcat-embed-core-9.0.68.jar:9.0.68]
    at org.apache.tomcat.util.net.SocketWrapperBase.vectoredOperation(SocketWrapperBase.java:1489) ~[tomcat-embed-core-9.0.68.jar:9.0.68]
    at org.apache.tomcat.util.net.SocketWrapperBase.write(SocketWrapperBase.java:1415) ~[tomcat-embed-core-9.0.68.jar:9.0.68]
    at org.apache.tomcat.util.net.SocketWrapperBase.write(SocketWrapperBase.java:1386) ~[tomcat-embed-core-9.0.68.jar:9.0.68]
    at org.apache.tomcat.websocket.server.WsRemoteEndpointImplServer.doWrite(WsRemoteEndpointImplServer.java:93) ~[tomcat-embed-websocket-9.0.68.jar:9.0.68]
    at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.writeMessagePart(WsRemoteEndpointImplBase.java:512) ~[tomcat-embed-websocket-9.0.68.jar:9.0.68]
    at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.sendMessageBlock(WsRemoteEndpointImplBase.java:314) ~[tomcat-embed-websocket-9.0.68.jar:9.0.68]
    ... 26 common frames omitted
sbegaudeau commented 1 year ago

After this exception, some representations are no longer disposed: the thread that would have disposed them is now dead, so the editing context event processor is never disposed either. As a consequence, that editing context event processor never sees any update to the view / domain models on the server.
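
To illustrate the failure mode described above, here is a minimal Java sketch of a cleanup step that is skipped when the send before it fails, and of a `try`/`finally` variant that still runs it. All names (`DisposalSketch`, `handle`, the `send` and `dispose` steps) are hypothetical stand-ins and not Sirius Web code; whether the real handler is structured this way is an assumption, not something confirmed here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins: none of these names come from Sirius Web.
final class DisposalSketch {

    /**
     * Runs a "send" step followed by a "dispose" step and records what happened.
     * When sendFails is true, the send step throws, mimicking the interrupted write.
     */
    static List<String> handle(boolean guardWithFinally, boolean sendFails) {
        List<String> events = new ArrayList<>();
        Runnable send = () -> {
            if (sendFails) {
                throw new IllegalStateException("The current thread was interrupted");
            }
            events.add("sent");
        };
        Runnable dispose = () -> events.add("disposed");

        try {
            if (guardWithFinally) {
                try {
                    send.run();
                } finally {
                    // Runs even when send.run() throws.
                    dispose.run();
                }
            } else {
                send.run();
                // Never reached when send.run() throws.
                dispose.run();
            }
        } catch (IllegalStateException exception) {
            events.add("failed");
        }
        return events;
    }
}
```

With a failing send, `DisposalSketch.handle(false, true)` records only `failed`, while `DisposalSketch.handle(true, true)` records `disposed` and then `failed`. This only shows the general pattern; it does not claim to be the actual fix.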

pcdavid commented 1 year ago

I can't reproduce it every time, but the following steps still trigger the stack trace quite often:

  1. Start from an empty server (probably not necessary, but easier to analyze).
  2. Import created-before.zip. It's a basic project with a "Robot Flow". Select the "Robot" element and note that all the expected representations can be created on it.
  3. Go back to the projects list (do not keep "created-before" open in a tab/window) and import studio.zip. It contains a View with an empty diagram definition which applies to flow::System.
  4. Go back to the projects list and import created-after.zip. Except for the name, it's identical to "created-before" and just provided for convenience.
  5. In "created-after", select the "Robot" system element. The "NewDiag" (defined in the studio) should be available. Create it. Sometimes the stack trace appears at this point, but most of the time in my tests there was still no error here.
  6. Go back to the projects list and re-open "created-before" (it should have been closed for at least 6 seconds at this point, the longer the better).
  7. Select the "Robot" system element. The "NewDiag" (defined in the studio) should be available. Create it. Here I get the stack quite regularly, but not always.
pcdavid commented 1 year ago

I can sometimes get into a state where a Tree (from the explorer) is not properly "unsubscribed" from the subscription manager even though I (the end user) have closed the project. As a result, the tree event processor is never disposed and, by cascade, the editing context event processor itself is never disposed/closed, and hence never properly reloaded.

This can happen without the exception above even occurring. I'm not sure yet, but maybe the exception is a symptom after the fact, and the root cause occurs before.
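
The cascade described above can be sketched roughly as follows. This is a simplified, hypothetical model (`CascadeSketch` and its methods are invented for illustration and are not the Sirius Web subscription manager): each representation keeps a subscriber count, and the editing context is only disposed once no representation has subscribers left. A single missed `unsubscribe` is then enough to keep the whole editing context alive.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the cascade; not the actual Sirius Web classes.
final class CascadeSketch {
    // Representation id -> number of active subscribers.
    private final Map<String, Integer> subscribers = new HashMap<>();
    private boolean editingContextDisposed = false;

    void subscribe(String representationId) {
        subscribers.merge(representationId, 1, Integer::sum);
    }

    void unsubscribe(String representationId) {
        // Returning null from the remapping function removes the entry,
        // which stands for disposing that representation's event processor.
        subscribers.computeIfPresent(representationId,
                (id, count) -> count > 1 ? count - 1 : null);
        // The editing context is only disposed once every representation is gone.
        if (subscribers.isEmpty()) {
            editingContextDisposed = true;
        }
    }

    boolean isEditingContextDisposed() {
        return editingContextDisposed;
    }

    int activeRepresentations() {
        return subscribers.size();
    }
}
```

In this model, subscribing to a tree and a diagram and then unsubscribing only from the diagram leaves one active representation and an editing context that is never disposed, which matches the symptom where the project is never reloaded.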

pcdavid commented 1 year ago

I think I have reproducible steps for the stack trace, but it seems to be a different case than the one from my previous comment (where an editing context is never released/disposed, and thus never reloaded).

  1. Create a project which takes a significant time to load. For example, create 3 or 4 "Big Guy" flow models in it.
  2. Go back to the projects list, and wait for it to be released.
  3. Click to re-open it, and as soon as the project edition page becomes visible (with the explorer still empty as the editing context has not yet loaded), go back to the project list.

This systematically reproduces the stack trace for me, but still the project/editing context gets released properly after a while.
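
One plausible reading of these steps is that leaving the page cancels the subscription while the server is still loading, and the worker thread then tries to send with its interrupt flag set. The sketch below reproduces that pattern in isolation. It is an assumption, not a confirmed diagnosis: `InterruptedSendSketch`, `write` and `reproduce` are hypothetical, and `write` only imitates a write that checks the interrupt flag first, as the `checkInterruptStatus` frame in the stack trace suggests.

```java
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical reproduction of the pattern; not Sirius Web or Tomcat code.
final class InterruptedSendSketch {

    // Stand-in for a write that refuses to run on an interrupted thread.
    static void write(String payload) throws IOException {
        if (Thread.interrupted()) {
            throw new IOException("The current thread was interrupted");
        }
    }

    static String reproduce() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch loading = new CountDownLatch(1);
        CountDownLatch finished = new CountDownLatch(1);
        AtomicReference<String> outcome = new AtomicReference<>("sent");

        Future<?> subscription = executor.submit(() -> {
            try {
                loading.countDown();
                try {
                    // Simulates a slow project load: blocks until interrupted.
                    new CountDownLatch(1).await();
                } catch (InterruptedException exception) {
                    // Restore the flag, as well-behaved code usually does.
                    Thread.currentThread().interrupt();
                }
                // The send happens on a thread whose interrupt flag is set.
                write("payload");
            } catch (IOException exception) {
                outcome.set(exception.getMessage());
            } finally {
                finished.countDown();
            }
        });

        try {
            loading.await();
            // The user leaves the page: the subscription is cancelled mid-load.
            subscription.cancel(true);
            finished.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException exception) {
            Thread.currentThread().interrupt();
            outcome.set("aborted");
        } finally {
            executor.shutdownNow();
        }
        return outcome.get();
    }
}
```

Here `InterruptedSendSketch.reproduce()` returns the message of the `IOException` raised by the send. In this sketch the worker still reaches its `finally` block, which would be consistent with the editing context being released properly afterwards.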