Open mvicsokolova opened 1 year ago
Your question looks like a false dichotomy. May we improve the current text format now and introduce new ones in the future?
In your example, CoroutineDumpFormat.JSON
should not be an enumeration, but an implementation of an interface (similar to Dispather.IO
), in such case we don't have to define and provide now a dump "JEP 425 compatible", and we leave open the door for third-party implementations (text+ASCII color, YAML format, binary format...).
Thank you for your comment! That’s true, just improving text format of the dump is a possible option.
Do you have any suggestions on what may make the text format more readable and convenient for troubleshooting?
The main (ignored) issue for me is the duplication on a large dump. The coroutine's main goal is the ability to run thousands (or millions) of coroutines, unfortunately reading thousands (or millions) of stack traces is really demanding. Moreover, many of these are identical and are duplicated many times.
See this trivial example:
Coroutines dump 2023/01/18 09:52:01
Coroutine BlockingCoroutine{Active}@68ff4cf8, state: RUNNING
at java.base/java.lang.Thread.getStackTrace(Thread.java:1602)
at kotlinx.coroutines.debug.internal.DebugProbesImpl.enhanceStackTraceWithThreadDumpImpl(DebugProbesImpl.kt:339)
at kotlinx.coroutines.debug.internal.DebugProbesImpl.dumpCoroutinesSynchronized(DebugProbesImpl.kt:294)
at kotlinx.coroutines.debug.internal.DebugProbesImpl.dumpCoroutines(DebugProbesImpl.kt:266)
at kotlinx.coroutines.debug.DebugProbes.dumpCoroutines(DebugProbes.kt:144)
at kotlinx.coroutines.debug.DebugProbes.dumpCoroutines$default(DebugProbes.kt:144)
at MainKt$main$1.invokeSuspend(Main.kt:21)
(Coroutine creation stacktrace)
at kotlin.coroutines.intrinsics.IntrinsicsKt__IntrinsicsJvmKt.createCoroutineUnintercepted(IntrinsicsJvm.kt:122)
at kotlinx.coroutines.intrinsics.CancellableKt.startCoroutineCancellable(Cancellable.kt:30)
at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:58)
at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source)
at MainKt.main(Main.kt:14)
at MainKt.main(Main.kt)
Coroutine StandaloneCoroutine{Active}@2fd9881b, state: SUSPENDED
at MainKt$main$1$1$1.invokeSuspend(Main.kt:18)
(Coroutine creation stacktrace)
at kotlin.coroutines.intrinsics.IntrinsicsKt__IntrinsicsJvmKt.createCoroutineUnintercepted(IntrinsicsJvm.kt:122)
at kotlinx.coroutines.intrinsics.CancellableKt.startCoroutineCancellable(Cancellable.kt:30)
at kotlinx.coroutines.BuildersKt__Builders_commonKt.launch$default(Builders.common.kt:47)
at kotlinx.coroutines.BuildersKt.launch$default(Unknown Source)
at MainKt$main$1.invokeSuspend(Main.kt:17)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:664)
Coroutine StandaloneCoroutine{Active}@70944db3, state: SUSPENDED
at MainKt$main$1$1$1.invokeSuspend(Main.kt:18)
(Coroutine creation stacktrace)
at kotlin.coroutines.intrinsics.IntrinsicsKt__IntrinsicsJvmKt.createCoroutineUnintercepted(IntrinsicsJvm.kt:122)
at kotlinx.coroutines.intrinsics.CancellableKt.startCoroutineCancellable(Cancellable.kt:30)
at kotlinx.coroutines.BuildersKt__Builders_commonKt.launch$default(Builders.common.kt:47)
at kotlinx.coroutines.BuildersKt.launch$default(Unknown Source)
at MainKt$main$1.invokeSuspend(Main.kt:17)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:664)
Coroutine StandaloneCoroutine{Active}@276103c5, state: SUSPENDED
at MainKt$main$1$1$1.invokeSuspend(Main.kt:18)
(Coroutine creation stacktrace)
at kotlin.coroutines.intrinsics.IntrinsicsKt__IntrinsicsJvmKt.createCoroutineUnintercepted(IntrinsicsJvm.kt:122)
at kotlinx.coroutines.intrinsics.CancellableKt.startCoroutineCancellable(Cancellable.kt:30)
at kotlinx.coroutines.BuildersKt__Builders_commonKt.launch$default(Builders.common.kt:47)
at kotlinx.coroutines.BuildersKt.launch$default(Unknown Source)
at MainKt$main$1.invokeSuspend(Main.kt:17)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:664)
Collapsing coroutines with the same stack is really useful, the dump is shorter and more readable.
Coroutines dump 2023/01/18 09:52:01
Coroutine BlockingCoroutine{Active}@68ff4cf8, state: RUNNING
at java.base/java.lang.Thread.getStackTrace(Thread.java:1602)
at kotlinx.coroutines.debug.internal.DebugProbesImpl.enhanceStackTraceWithThreadDumpImpl(DebugProbesImpl.kt:339)
at kotlinx.coroutines.debug.internal.DebugProbesImpl.dumpCoroutinesSynchronized(DebugProbesImpl.kt:294)
at kotlinx.coroutines.debug.internal.DebugProbesImpl.dumpCoroutines(DebugProbesImpl.kt:266)
at kotlinx.coroutines.debug.DebugProbes.dumpCoroutines(DebugProbes.kt:144)
at kotlinx.coroutines.debug.DebugProbes.dumpCoroutines$default(DebugProbes.kt:144)
at MainKt$main$1.invokeSuspend(Main.kt:21)
(Coroutine creation stacktrace)
at kotlin.coroutines.intrinsics.IntrinsicsKt__IntrinsicsJvmKt.createCoroutineUnintercepted(IntrinsicsJvm.kt:122)
at kotlinx.coroutines.intrinsics.CancellableKt.startCoroutineCancellable(Cancellable.kt:30)
at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:58)
at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source)
at MainKt.main(Main.kt:14)
at MainKt.main(Main.kt)
Coroutine StandaloneCoroutine{Active}@2fd9881b, state: SUSPENDED, and
Coroutine StandaloneCoroutine{Active}@70944db3, state: SUSPENDED, and
Coroutine StandaloneCoroutine{Active}@276103c5, state: SUSPENDED
at MainKt$main$1$1$1.invokeSuspend(Main.kt:18)
(Coroutine creation stacktrace)
at kotlin.coroutines.intrinsics.IntrinsicsKt__IntrinsicsJvmKt.createCoroutineUnintercepted(IntrinsicsJvm.kt:122)
at kotlinx.coroutines.intrinsics.CancellableKt.startCoroutineCancellable(Cancellable.kt:30)
at kotlinx.coroutines.BuildersKt__Builders_commonKt.launch$default(Builders.common.kt:47)
at kotlinx.coroutines.BuildersKt.launch$default(Unknown Source)
at MainKt$main$1.invokeSuspend(Main.kt:17)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:664)
Other issues regarding the size of the dump: is it a good idea to dump some megabytes of data into the log file? Should a developer copy-paste the relevant data from the log? Is it a reasonable practice? Looking at JEP 425 is a good idea, but I cannot exclude that developer generate a dump in a separate file instead to write it into the log (like a core dump), so a JSON, YAML, or a binary exporter (JSON+GZIP, ProtoBuf+GZIP) can be useful. So my previous comment: making a dump pluggable (like serialization) should avoid putting too much logic in this library. For this use case, also integration with JFR can be helpful, but maybe this is another task.
From https://openjdk.org/jeps/425:
The thread dump is another popular tool for troubleshooting applications written in the thread-per-request style. Unfortunately, the JDK's traditional thread dump, obtained with jstack or jcmd, presents a flat list of threads. This is suitable for dozens or hundreds of platform threads, but is unsuitable for thousands or millions of virtual threads. Accordingly, we will not extend traditional thread dumps to include virtual threads, but will rather introduce a new kind of thread dump in jcmd to present virtual threads alongside platform threads, all grouped in a meaningful way. Richer relationships among threads can be shown when programs use structured concurrency.
Maybe it makes sense to look at what jcmd
does or even repeat their format for better tooling compatibility. For example, they include the structured concurrency relationships in the output.
For now, jcmd
supports only JSON as a better visualisation format for lots of virtual threads.
(Check out the jcmd thread dump demo).
For example, here is the structured thread dump: root StructuredTaskScope
and 2 child scopes that run 1000 virtual threads each (stacktraces and the list of threads are collapsed).
jcmd pid Thread.dump_to_file -format=json dump.json
Introduction
Currently
kotlinx-coroutines-debug
provides the following API for dumping coroutines:DebugProbes.dumpCoroutines
prints all active coroutines, including their state, creation and suspension stacktraces. Though the output of this method is similar tojstack
, it only presents a flat list of coroutines, that is rather difficult to analyse in case of a large number of coroutines.DebugProbes.dumpCoroutinesInfo
provides a list of objects with information about active coroutines.DebugProbes.printJob
/DebugProbes.printScope
prints isolated parts of coroutines hierarchy referenced by aJob
orCoroutineScope
instances. Though these methods can not provide suspension stacktrace.For now, there is no coroutine dump that would group coroutines in a meaningful way visualising relationships of structured concurrency and providing a stack trace. This proposal aims to come up with a more suitable dump format.
Proposed solution:
DebugProbes.dumpCoroutines
can provide coroutine dump in JSON format in addition to plain text:DebugProbes.dumpCoroutines(System.out, CoroutineDumpFormat.JSON)
Pros:
Cons:
Possible alternatives:
Plain text + indents
We can still print coroutine dump as a plain text, but coroutines may be sorted by their parent and indents will show their relationships.
Example output:
Pros: