JetBrains / compose-multiplatform

Compose Multiplatform, a modern UI framework for Kotlin that makes building performant and beautiful user interfaces easy and enjoyable.
https://jetbrains.com/lp/compose-multiplatform
Apache License 2.0
16k stars 1.16k forks source link

IllegalStateException accessing state in non-AWT thread #4546

Closed mgroth0 closed 5 months ago

mgroth0 commented 6 months ago

Describe the bug

It is unclear how robustly Compose for Desktop allows for accessing state from non-UI threads. Often, it just works. But other times, it does not.

The compose snapshot system is supposed to be thread-safe. However, Swing/AWT is designed for all UI operations to occur on a single thread. My abstract understanding, which may be flawed, is that Desktop Compose creates a "safety layer" so that users can access state from any thread, and the underlying library will ensure that all AWT/Swing operations occur on the UI thread. I cannot find documentation on this, but this is how it seems to work 90-99% of the time. Occasionally, however, it just doesn't work and we get an exception like:

Click here to expand stack trace

``` Caused by: java.lang.IllegalStateException: Method should be called from AWT event dispatch thread at org.jetbrains.skiko.SkiaLayer.needRedraw(SkiaLayer.awt.kt:517)at androidx.compose.ui.scene.skia.WindowSkiaLayerComponent.onComposeInvalidation(WindowSkiaLayerComponent.desktop.kt:110) at androidx.compose.ui.scene.ComposeSceneMediator.onComposeInvalidation(ComposeSceneMediator.desktop.kt:465) at androidx.compose.ui.scene.ComposeContainer$createComposeScene$1.invoke(ComposeContainer.desktop.kt:254) at androidx.compose.ui.scene.ComposeContainer$createComposeScene$1.invoke(ComposeContainer.desktop.kt:254) at androidx.compose.ui.scene.BaseComposeScene.invalidateIfNeeded(BaseComposeScene.skiko.kt:100) at androidx.compose.ui.scene.BaseComposeScene$snapshotInvalidationTracker$1.invoke(BaseComposeScene.skiko.kt:57) at androidx.compose.ui.scene.BaseComposeScene$snapshotInvalidationTracker$1.invoke(BaseComposeScene.skiko.kt:57) at androidx.compose.ui.node.CommandList.add(SnapshotInvalidationTracker.skiko.kt:130) at androidx.compose.ui.node.SnapshotInvalidationTracker$snapshotObserver$1.invoke(SnapshotInvalidationTracker.skiko.kt:75) at androidx.compose.ui.node.SnapshotInvalidationTracker$snapshotObserver$1.invoke(SnapshotInvalidationTracker.skiko.kt:71)at androidx.compose.runtime.snapshots.SnapshotStateObserver.sendNotifications(SnapshotStateObserver.kt:83) at androidx.compose.runtime.snapshots.SnapshotStateObserver.access$sendNotifications(SnapshotStateObserver.kt:43) at androidx.compose.runtime.snapshots.SnapshotStateObserver$applyObserver$1.invoke(SnapshotStateObserver.kt:50) at androidx.compose.runtime.snapshots.SnapshotStateObserver$applyObserver$1.invoke(SnapshotStateObserver.kt:48) at androidx.compose.runtime.snapshots.SnapshotKt.advanceGlobalSnapshot(Snapshot.kt:1816) at androidx.compose.runtime.snapshots.SnapshotKt.advanceGlobalSnapshot(Snapshot.kt:1831) at androidx.compose.runtime.snapshots.SnapshotKt.access$advanceGlobalSnapshot(Snapshot.kt:1) at androidx.compose.runtime.snapshots.GlobalSnapshot.notifyObjectsInitialized$runtime(Snapshot.kt:1360) at androidx.compose.runtime.snapshots.Snapshot$Companion.notifyObjectsInitialized(Snapshot.kt:567) at androidx.compose.runtime.DerivedSnapshotState.currentRecord(DerivedState.kt:246)at androidx.compose.runtime.DerivedSnapshotState.getValue(DerivedState.kt:269) at matt.launch.proctool.ProcessToolCli$run$1$1$1$1.invokeSuspend(proctool.kt:141) at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33) at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:104) at kotlinx.coroutines.internal.LimitedDispatcher$Worker.run(LimitedDispatcher.kt:111) at kotlinx.coroutines.scheduling.TaskImpl.run(Tasks.kt:99) at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:584) at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:811) at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:715) at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:702) ```

This stack trace is from a real application, not a test or a reproducer. I have spent a long time trying to create a reproducer, but I have to give up because I just cannot cause this error to happen again.

If I keep running the application and trying to repeat the steps that caused the above error, I cannot reproduce it.

Furthermore, I have tried to reproduce this by creating a minimal reproducer. I think that this error is caused by something having to do with accessing state from a different thread, so I tried to create a reproducer for this.

Click here to see (failed) reproducer code

```kotlin import androidx.compose.foundation.layout.Column import androidx.compose.material3.Button import androidx.compose.material3.Text import androidx.compose.runtime.LaunchedEffect import androidx.compose.runtime.derivedStateOf import androidx.compose.runtime.getValue import androidx.compose.runtime.mutableStateOf import androidx.compose.runtime.setValue import androidx.compose.ui.window.Window import androidx.compose.ui.window.application import kotlinx.coroutines.Dispatchers import kotlinx.coroutines.delay import kotlinx.coroutines.launch import kotlinx.coroutines.withContext import kotlin.random.Random import kotlin.time.Duration.Companion.milliseconds fun main() { var number by mutableStateOf(0) var number2 by mutableStateOf(500) val text by derivedStateOf { number2++ number.toString() } val text2 by derivedStateOf { number2.toString() } var jitterBase by mutableStateOf(null) val slowDerived by derivedStateOf { Thread.sleep(200) number.toString() } var slowDerivedDest by mutableStateOf("") application { Window( onCloseRequest = ::exitApplication ) { LaunchedEffect(Unit) { withContext(Dispatchers.IO) { launch { while (true) { delay(100.milliseconds) number = text.toInt() + 1 jitterBase?.let { window.setLocation(it + Random.nextInt(-5, 5), window.y) } } } launch { while (true) { delay(100.milliseconds) slowDerivedDest = slowDerived } } } } Column { Text(text) Text(text2) Button( onClick = { jitterBase = if (jitterBase == null) window.x else null } ) { Text("Toggle Jitter") } Text(slowDerivedDest) } } } } ```

This code failed to reproduce the bug. Even though I modify UI state at a very fast rate from multiple non-UI threads, I get no exceptions. Therefore, this is a deeper issue. Whatever caused the above exception is one issue, but possibly a smaller and more contained one. A possibly more meta issue here is that Desktop Compose users cannot predict when accessing state from non-UI threads is bad and when it is not.

I have tried to dig into the machinery. I read through some of the implementations along the stack trace in the exception above. I also tried to use a debugger. I was able to gather a bit of hopefully useful information, but it is not much.

I learned that a lot of the code in the deepest layers (SwingSkiaLayerComponent, ComposeSceneMediator, for example) seems to usually run at regular intervals to render frames. This frame rendering always occurs on the UI thread.

However, the compose notification machinery somehow allows non-UI threads to "leak" into these same operations which usually only run in the UI frame-rendering loop.

One potentially important observation I made is that in my reproducer, when I placed a breakpoint in Snapshot.kt#advanceGlobalSnapshot at the line that says val observers = applyObservers, I found that this line never executed in my reproducer. This means that whenever the global snapshot was advanced in my reproducer, the global snapshot had not been modified. If you look closely at the stacktrace from when the exception was thrown, you will see that the code entered this block, and therefore the global snapshot had been modified. I could not figure out how to reproduce the particular situation of calling "advanceGlobalSnapshot" with a modified global snapshot state, but perhaps this has something to do with the issue.

Another potentially important observation I made is that the error "Method should be called from AWT event dispatch thread" is called from skiko code, not on awt code. SkiaLayer.needsRedraw is the precise method that threw an exception. I was curious about exactly what UI operation was being done if the thread were to continue, so I looked into Redrawer.needRedraw. needRedraw has many implementations, but I checked MetalRedrawer just as an example. I see it calls frameDispatcher.scheduleFrame, and all that this scheduleFrame function does is send a signal to another coroutine which likely is drawing frames in the proper frame.

Therefore, if my understanding is correct, at least for the MetalRedrawer implementation the exception thrown from SkiaLayer.needsRedraw probably isn't even neccesarry! Maybe the exception can just be removed? Of course, I suppose it must have been added for a reason so maybe there are other implementations of Redrawer that actually must call needRedraw on the AWT thread. But if that is the case, I wonder if the solution here could possibly be to just ensure that every implementation of Redrawer is safe to call needRedraw from any thread, as it seems to be with MetalRedrawer.

To conclude: It is very unclear and unpredictable when and where it is safe to modify UI state from different threads. It is impractical to ensure every single UI state access happens from a UI thread, especially when we are at the same time trying so hard to keep heavy work off the UI thread so that the UI runs smooth. The fact that accessing UI state from non-UI threads works 95-99% of the time seems to imply it should be 100%. Otherwise, if the design decision for this library is that it will never be fully safe to access UI state from non-UI threads, then maybe this library should strictly enforce that everywhere by throwing exceptions such that my reproducer above does fail. Another possible approach towards solving this is to implement more fail-fast behavior so that if errors like the one I had ever pop up, there is a more clear cause.

Affected platforms Deskop

Versions

Expected behavior

m-sasha commented 6 months ago

I think this is just a bug; there's no deep issue here.

BaseComposeScene.invalidate is not guaranteed to run on the main thread, therefore ComposeSceneMediator.onComposeInvalidation needs to check and schedule its code to run on the main thread if the current thread is different.

@igordmn ?

alexzhirkevich commented 6 months ago

Same happens here - #4472

okushnikov commented 2 months ago

Please check the following ticket on YouTrack for follow-ups to this issue. GitHub issues will be closed in the coming weeks.