Schema instantiation hangs

kubukoz commented 1 year ago

Couldn't reproduce it simpler, so here's a ZIO test:

//> using scala "2.13.10"
//> using lib "com.disneystreaming.smithy4s::smithy4s-json:0.16.6"
//> using lib "dev.zio::zio-test:1.0.17"
//> using lib "dev.zio::zio-test-sbt:1.0.17"

import smithy4s.schema.Schema
import smithy.api.JsonName
import smithy4s.api.Discriminated
import smithy4s.http.CodecAPI
import smithy4s.HintMask
import smithy4s.IntEnum
import smithy4s.http.json.JCodec

import zio.test._
import zio.test.Assertion._
import smithy4s.schema.Schema
import smithy.api.JsonName
import smithy4s.api.Discriminated
import zio.test._
import zio.test.Assertion._
import zio.duration._

object JsonSpec extends DefaultRunnableSpec {

  private val codecs = smithy4s.http.json.codecs(HintMask(IntEnum))

  private def encodeString[A](a: A)(schema: Schema[A]): String = {
    val c = codecs.compileCodec(schema)

    new String(codecs.writeToArray(c, a))
  }

  def spec: ZSpec[Environment, Failure] = suite("Command JSON codec")(
    test("struct schema serialization ignores @jsonName") {
      val schema = Schema
        .struct[String]
        .apply(
          Schema.string
            .addHints(JsonName("b"))
            .required[String]("a", identity)
        )(identity)

      assert(encodeString("test")(schema))(equalTo("""{"a":"test"}"""))
    },
    test("unions are always encoded as tagged") {

      val schema = {
        val i = Schema.int.oneOf[Either[Int, String]]("intKey", Left(_))
        val s = Schema.string.oneOf[Either[Int, String]]("stringKey", Right(_))

        Schema
          .union[Either[Int, String]](i, s)(_.fold(i(_), s(_)))
          .addHints(Discriminated("type"))
      }

      assert(encodeString[Either[Int, String]](Left(42))(schema))(
        equalTo("""{"intKey":42}""")
      )
    }
  ) @@ TestAspect.timeout(2.seconds)

}

Feel free to bump the timeout to 10 seconds or more, chances are you'll hit the timeout more often than not - the spec usually doesn't finish.

I bisected this down to 80a5d912ca101266795f5a425d855c3b1ea04f6a.

kubukoz commented 1 year ago

I'm aware that the above is calling the unary overload of compileCodec:

and it seems like that implementation creates a new cache for each compilation... so I'm not sure why any concurrent usage (confirmed that the tests run concurrently) would break this.

This suffers from the same problem:

private val cache = codecs.createCache()

private def encodeString[A](a: A)(schema: Schema[A]): String = {
  val c = codecs.compileCodec(schema, cache)

  new String(codecs.writeToArray(c, a))
}

Baccata commented 1 year ago

It's not even compilation that seems to hang, the following is sufficient to make it hang (on my machine anyway):

//> using scala "2.13.10"
//> using lib "com.disneystreaming.smithy4s::smithy4s-core:dev-SNAPSHOT"
//> using lib "dev.zio::zio-test:1.0.17"
//> using lib "dev.zio::zio-test-sbt:1.0.17"

import smithy4s.schema.Schema
import smithy.api.JsonName
import smithy4s.api.Discriminated
import zio.test._
import zio.test.Assertion._
import zio.duration._

object JsonSpec extends DefaultRunnableSpec {

  def spec: ZSpec[Environment, Failure] = suite("Command JSON codec")(
    test("struct schema serialization ignores @jsonName") {
      val schema = Schema
        .struct[String]
        .apply(
          Schema.string
            .addHints(JsonName("b"))
            .required[String]("a", identity)
        )(identity)

      assertTrue(true)
    },
    test("unions are always encoded as tagged") {
      val schema = {
        val i = Schema.int.oneOf[Either[Int, String]]("intKey", Left(_))
        val s = Schema.string.oneOf[Either[Int, String]]("stringKey", Right(_))

        Schema
          .union[Either[Int, String]](i, s)(_.fold(i(_), s(_)))
          .addHints(Discriminated("type"))
      }
      assertTrue(true)
    }
  ) @@ TestAspect.timeout(2.seconds)

}

Removing the addHints on either test stops the hanging. I think it may be a classloading issue : having the schemas instantiated within the tests (that run in parallel) hit the classloader concurrently.

Moving the schemas away from the tests fix it too.

kubukoz commented 1 year ago

adding hints causing something to break depending on order reminds me of https://github.com/disneystreaming/smithy4s/pull/385...

any clue why concurrent classloading would be an issue here? It's strange because it wasn't a problem in previous releases (0.16.4 and earlier)

Baccata commented 1 year ago

any clue why concurrent classloading would be an issue here

possibly due to how NewType are encoded : each newtype companion object has a couple internal classes. Anyhow, without profiling it'll be hard to guess.

kubukoz commented 1 year ago

btw. I'm not able to reproduce the hang with your last snippet (replaced the version with 0.16.6)

Baccata commented 1 year ago

Can you extract the schema values away from the tests and see whether it has an effect ? (using your snippet that is)

kubukoz commented 1 year ago

it does help :)

kubukoz commented 1 year ago

I'll pay good money for someone to fix this.

Baccata commented 1 year ago

How much ?

kubukoz commented 1 year ago

like, $200?

kubukoz commented 8 months ago

Guess what, this is not a problem in 0.18.

kubukoz commented 8 months ago

update: after hours of minimization of an internal reproduction, narrowed it down to:

namespace demo

@pattern("^[a-zA-Z0-9]{8,12}$")
string MyStr

enum MyEnum {
  VALUEA
}

package demo

object B extends weaver.FunSuite {

  val dummyEnum =
    MyEnum.schema.required[Unit]("foo", _ => ???)

}

object A extends weaver.FunSuite {

  val dummyStr =
    MyStr("123456789")

}

After this, in order to eliminate codegen, I merged everything into one file: this scala-cli now reproduces the hang.

Update: eliminated other dependencies.

//> using scala "3.3.1"
//> using lib "com.disneystreaming.smithy4s::smithy4s-core:0.18.3"
package demo

import smithy4s.Enumeration
import smithy4s.Hints
import smithy4s.Newtype
import smithy4s.Schema
import smithy4s.ShapeId
import smithy4s.ShapeTag
import smithy4s.schema.EnumTag
import smithy4s.schema.Schema.bijection
import smithy4s.schema.Schema.enumeration
import smithy4s.schema.Schema.string

import java.util.concurrent.CountDownLatch
import java.util.concurrent.TimeUnit

object MyStr extends Newtype[String] {
  val id: ShapeId = ShapeId("demo", "MyStr")
  val hints: Hints = Hints.empty

  val underlyingSchema: Schema[String] = string
    .withId(id)
    .addHints(hints)
    .validated(smithy.api.Pattern(s"^[a-zA-Z0-9]{8,12}$$"))

  // this was just Schema[MyStr] but the package object can't live in this file so I inlined it
  implicit val schema: Schema[MyStr.Type] = bijection(underlyingSchema, asBijection)
}

sealed abstract class MyEnum(_value: String, _name: String, _intValue: Int, _hints: Hints)
  extends Enumeration.Value {
  override type EnumType = MyEnum
  override val value: String = _value
  override val name: String = _name
  override val intValue: Int = _intValue
  override val hints: Hints = _hints
  override def enumeration: Enumeration[EnumType] = MyEnum
  @inline final def widen: MyEnum = this
}

object MyEnum extends Enumeration[MyEnum] with ShapeTag.Companion[MyEnum] {
  val id: ShapeId = ShapeId("demo", "MyEnum")

  val hints: Hints = Hints.empty

  case object VALUEA extends MyEnum("VALUEA", "VALUEA", 0, Hints())

  val values: List[MyEnum] = List(
    VALUEA
  )

  val tag: EnumTag[MyEnum] = EnumTag.ClosedStringEnum
  implicit val schema: Schema[MyEnum] = enumeration(tag, values).withId(id).addHints(hints)
}

object B {
  val dummyEnum = MyEnum.schema.required[Unit]("foo", _ => ???)
}

object A {
  val dummyStr = MyStr("123456789")
}

object demo {

  def main(args: Array[String]): Unit = {
    val cl = new CountDownLatch(2)

    def t(f: => Unit) = {
      val thread = new Thread(() => f)
      thread.setDaemon(true)
      thread.start()
    }

    t {
      A.toString()
      cl.countDown()
    }

    t {
      B.toString()
      cl.countDown()
    }

    require(cl.await(2, TimeUnit.SECONDS), "TIMED OUT")
  }

}

kubukoz commented 8 months ago

The snippet above doesn't fail (due to a presumed deadlock) on Scala 2.13 - worth noting Dotty has a new implementation of lazy vals: https://docs.scala-lang.org/scala3/reference/changed-features/lazy-vals-init.html

Baccata commented 8 months ago

update: after hours of minimization of an internal reproduction, narrowed it down to:

We already had a pretty good minimisation, and it's very possible your new one has just unveiled something specific to enumerations, which would be a different issue. Can you remove the Enumeration in favour of another type of schema that wouldn't involve lazy vals ?

I think the initial repro hangs because the Documentation type has a Documentation hint, so there's probably a cycle in the classloading that make it subject to deadlocks

kubukoz commented 8 months ago

The earlier minimization didn't work on the latest version, hence me trying to do it again - this one came from a real app.

has just unveiled something specific to enumerations, which would be a different issue

I minimized it further from the generated code:

//> using scala "3.3.1"
//> using lib "com.disneystreaming.smithy4s::smithy4s-core:0.18.3"
//> using option "-Wunused:imports"
package demo

import smithy4s.Schema

import java.util.concurrent.CountDownLatch
import java.util.concurrent.TimeUnit

object MyStr {
  val schema = Schema.int.addHints(smithy.api.Length())
}

object MyEnum {
  val schema = Schema.int.addHints(smithy.api.Required())
}

object B {
  val dummyEnum = MyEnum.schema.toString()
}

object A {
  val dummyStr = MyStr.toString()
}

object demo {

  def main(args: Array[String]): Unit = {
    val cl = new CountDownLatch(2)

    def t(f: => Unit) = {
      val thread = new Thread(() => f)
      thread.setDaemon(true)
      thread.start()
    }

    t {
      A.hashCode()
      cl.countDown()
    }

    t {
      B.hashCode()
      cl.countDown()
    }

    require(cl.await(2, TimeUnit.SECONDS), "TIMED OUT")
  }

}

and it does seem to be Documentation-related. I have yet to confirm if it's classloading or lazy val initialization (or both).

kubukoz commented 8 months ago

YourKit detected a potential deadlock:

Potential deadlock: frozen threads found

It seems that the following threads have not changed their stack for more than 10 seconds.
These threads are possibly (but not necessarily!) in a deadlock or hung.

+----------------------------------------------------------------------------------------------------------------------+
|                                                         Name                                                         |
+----------------------------------------------------------------------------------------------------------------------+
|  +---Thread-0 Frozen for at least 10s <Ignore a false positive>                                                      |
|  | |                                                                                                                 |
|  | +---smithy4s.schema.Field$.required(String, Schema, Function1) Field.scala:85                                     |
|  | |                                                                                                                 |
|  | +---smithy4s.schema.Schema$PartiallyAppliedRequired$.apply$extension(Schema, String, Function1) Schema.scala:271  |
|  | |                                                                                                                 |
|  | +---smithy.api.TraitDiffRule$.<clinit>() TraitDiffRule.scala:29                                                   |
|  | |                                                                                                                 |
|  | +---smithy.api.Trait$.<clinit>() Trait.scala:28                                                                   |
|  | |                                                                                                                 |
|  | +---smithy.api.Documentation$.<clinit>() Documentation.scala:16                                                   |
|  | |                                                                                                                 |
|  | +---smithy.api.Length$.<clinit>() Length.scala:23                                                                 |
|  | |                                                                                                                 |
|  | +---demo.MyStr$.<clinit>() main.scala:12                                                                          |
|  | |                                                                                                                 |
|  | +---demo.A$.<clinit>() main.scala:24                                                                              |
|  | |                                                                                                                 |
|  | +---demo.demo$.main$$anonfun$1(CountDownLatch) main.scala:39                                                      |
|  | |                                                                                                                 |
|  | +---demo.demo$.main$$anonfun$adapted$1(CountDownLatch) main.scala:41                                              |
|  | |                                                                                                                 |
|  | +---demo.demo$$$Lambda$1.0x0000007001001000.apply()                                                               |
|  | |                                                                                                                 |
|  | +---scala.Function0.apply$mcV$sp() Function0.scala:42                                                             |
|  | |                                                                                                                 |
|  | +---demo.demo$.$anonfun$1(Function0) main.scala:33                                                                |
|  | |                                                                                                                 |
|  | +---demo.demo$$$Lambda$2.0x00000070010012c8.run()                                                                 |
|  | |                                                                                                                 |
|  | +---java.lang.Thread.run() Thread.java:833                                                                        |
|  |                                                                                                                   |
|  +---Thread-1 Frozen for at least 10s <Ignore a false positive>                                                      |
|    |                                                                                                                 |
|    +---smithy.api.Required$.<clinit>() Required.scala:18                                                             |
|    |                                                                                                                 |
|    +---demo.MyEnum$.<clinit>() main.scala:16                                                                         |
|    |                                                                                                                 |
|    +---demo.B$.<clinit>() main.scala:20                                                                              |
|    |                                                                                                                 |
|    +---demo.demo$.main$$anonfun$2(CountDownLatch) main.scala:44                                                      |
|    |                                                                                                                 |
|    +---demo.demo$.main$$anonfun$adapted$2(CountDownLatch) main.scala:46                                              |
|    |                                                                                                                 |
|    +---demo.demo$$$Lambda$3.0x00000070010014f8.apply()                                                               |
|    |                                                                                                                 |
|    +---scala.Function0.apply$mcV$sp() Function0.scala:42                                                             |
|    |                                                                                                                 |
|    +---demo.demo$.$anonfun$1(Function0) main.scala:33                                                                |
|    |                                                                                                                 |
|    +---demo.demo$$$Lambda$2.0x00000070010012c8.run()                                                                 |
|    |                                                                                                                 |
|    +---java.lang.Thread.run() Thread.java:833                                                                        |
+----------------------------------------------------------------------------------------------------------------------+

and from the looks of it, it's Required on the surface (Documentation appears in the stack too though)

kubukoz commented 8 months ago

update! Native Image also suffers the same outcome. Wouldn't this exclude classloading?

kubukoz commented 8 months ago

Just had a chat with @Baccata about this.

We're going with the assumption that this is generally caused by the combination of:

lazy initialization of companion objects, and
recursion of traits in smithy.api#documentation, smithy.api#trait, everything that depends on those, and any other custom traits that may be recursive themselves.

Whether it's caused by the encoding of lazy vals alone (which I suspect) or classloading (which I wasn't able to confirm with a profiler), the solution will likely require us to get rid of the recursive initialization, rather than introduce more laziness anywhere - at this point it'd just risk bumping into other similar problems down the line, perhaps even past bootstrapping time (which would be terrible for applications, imagine randomly hanging while handling a request).

We'll proceed with an experiment of the following: generate DynamicBindings for hint values that include a recursive reference. I will do a POC of this by hardcoding this behavior to the relevant hints (Documentation and Trait for starters).

In short, this will apply a change like:

object Documentation extends Newtype[String] {
  val id: ShapeId = ShapeId("smithy.api", "documentation")
  val hints: Hints = Hints(
-    smithy.api.Documentation("Adds documentation to a shape or member using CommonMark syntax."),
+    ShapeId("smithy.api", "documentation") -> Document.fromString("Adds documentation to a shape or member using CommonMark syntax."),
    smithy.api.Trait(selector = None, structurallyExclusive = None, conflicts = None, breakingChanges = None),
  )
  val underlyingSchema: Schema[String] = string.withId(id).addHints(hints)
  implicit val schema: Schema[Documentation] = recursive(bijection(underlyingSchema, asBijection))
}

which goes into the DynamicBinding code path, currently used (primarily? exclusively?) for DynamicSchemaIndex.

kubukoz commented 8 months ago

The experiment seems to have been successful:

  val hints: Hints = Hints(
    ShapeId("smithy.api", "documentation") -> Document.nullDoc,
    ShapeId("smithy.api", "trait") -> Document.nullDoc
  )

with these, there's no longer a hang (ran the code 100 times). I applied the following simplifications:

[ ] only documentation and trait are special-cased
[x] Document.nullDoc is rendered every time, instead of an actual Document representing the trait. This one might require some more changes to how the recursive schemes are applied.
[ ] the hints are rendered this way everywhere, instead of - ideally - only in the object that would recursively refer to itself.

kubukoz commented 8 months ago

Here's where it actually stops (both threads in RUNNING state):

which shows how Documentation is involved: Thread 1 is trying to evaluate the Hints of Required (one of which is Documentation), and Thread 0 is in the middle of Documentation's <clinit>, getting stuck presumably before it can call Required.<clinit>.

Given <clinit> is the initializer of the class, I suppose it's technically correct that this is all due to lazy classloading.

kubukoz commented 6 months ago

OK @baccata, I think it's time to make a choice...

The options attempted are:

The "smart" and complex one: #1295

We detect recursive traits and make the appropriate hints use dynamic bindings.

Advantages: generated code still looks decent, safe for bincompat
Disadvantages: complex on the codegen logic side, overhead of decoding documents in the bindings on each read?

`Hints.lazily`: #1326

We wrap all Hints(...) with Hints.lazily(=>Hints).

Advantages: still looks decent on the generated code side, until you hit enum members (override val needed to avoid NPE); N hints == 1 `Lazy``, safe for bindcompat
Disadvantages: gets ugly with enum members

Lazy static bindings

Advantages: No change on the generated code side, until you hit enum members - similar to the above
Disadvantages: gets ugly with enum members, N hints == N extra Lazy instances, breaks bincompat (which is OK for 0.19)

I like the last option the most, but Hints.lazily also sounds good.

Baccata commented 6 months ago

So far I liked option 3 best, and I think it's fine to target 0.19. I think we ought to measure the impact on memory / compilation time though, before making a final decision, to ensure the overhead is not dramatic.

channingwalton commented 6 months ago

Just in case it helps, I believe I hit the same issue: running tests in parallel resulting in everything hanging. A thread dump revealed various combinations of the following:

"io-compute-6" #722 daemon prio=5 os_prio=31 cpu=50.44ms elapsed=1099.03s tid=0x000000013732fc00 nid=0x16e03 in Object.wait()  [0x00000002de0b8000]
   java.lang.Thread.State: RUNNABLE
        at smithy.api.Documentation$.<clinit>(Documentation.scala:16)
        - waiting on the Class initialization monitor for smithy.api.Trait$
…
"io-compute-2" #718 daemon prio=5 os_prio=31 cpu=35.78ms elapsed=1099.03s tid=0x00000001374fe200 nid=0x1ed03 in Object.wait()  [0x00000002dc087000]
   java.lang.Thread.State: RUNNABLE
        at smithy.api.Trait$.<clinit>(Trait.scala:27)
        - waiting on the Class initialization monitor for smithy.api.Documentation$

kubukoz commented 6 months ago

import cats.effect.IO
import cats.effect.IOApp
import cats.implicits._
import {<confidential> => SimpleRestJsonBuilder}
import org.http4s.ember.server.EmberServerBuilder
import smithy4s.kinds.Kind1
import smithy4s.kinds.Kind5
import smithy4s.kinds.PolyFunction5
import smithy4s.schema.Schema
import smithy4s.schema.Schema.RefinementSchema

import scala.io.StdIn

object MyMain extends IOApp.Simple {

  def measured[A](tag: String)(a: => A): A = {
    val start  = System.nanoTime()
    val result = a
    val end    = System.nanoTime()
    println(s"$tag - Time: ${(end - start) / 1000000}ms")
    result
  }

  measured("init") {
    println(ServiceTags.All.size)
  }

  val allRoutes =
    measured("allRoutes") {
      ServiceTags.All.map { service =>
        SimpleRestJsonBuilder(service)
          .routes(service.impl(new service.FunctorEndpointCompiler[IO] {
            override def apply[A0, A1, A2, A3, A4](
                fa: service.Endpoint[A0, A1, A2, A3, A4]
            ): Kind5[Kind1[IO]#toKind5]#handler[A0, A1, A2, A3, A4] = _ => IO.stub
          }))
          .make
          .toTry
          .get
      }
    }

  val totalRoute = allRoutes.reduce(_ <+> _)

  def run = EmberServerBuilder.default[IO].withHttpApp(totalRoute.orNotFound).build.useForever
}

Tried this with the following setup:

a list of 185 services
5287 generated Scala files in total (some could've been unused by these particular services)
Nothing else going on on the machine
JVM running forked so it was fresh every time, no warmup
M1 Max, so the outcome could be wildly different on another machine
Scala 3.3.1

Before:

init: 2938ms, 2894ms, 2951ms, 2815ms, 2946ms (average 2908.8, stddev 57.08)
allRoutes: 968ms, 959ms, 952ms, 958ms, 946ms (average 956.6, stddev 8.23)

Memory usage: heap stabilizes around 186MB, non-heap around 290MB

After #1327:

init: 3586ms, 3566ms, 3647ms, 3447ms, 3462ms (average 3541.6, stddev 85.09)
allRoutes: 987ms, 980ms, 1080ms, 1049ms, 973ms (average 1013.8, stddev 47.819)

Memory usage: heap stabilizes around 153MB (strange), non-heap around 333MB

To sum up, the initialization time of the objects (init) increased on average by 21.75%, while time to compile the serices increased by 5.98%.

Memory usage decreased on the heap side by 17.75%, increased on non-heap by 14.83%. I suppose the classes are bigger (and there's more of them) due to all the new bytecode supporting the laziness?

I think the ~21% increase in initialization time is a bit high, but also this is 185 services, so not something typical.

Baccata commented 6 months ago

@kubukoz any chance you could run the same experiment with solution 2 (Hints.lazily) as well ?

Even if the experiment is not representative of a nominal case (due to the sheer number of services), we can reasonably assume that the initialisation time increase is representative of the change. If solution 2 yields (much) better numbers, I'd roll with that.

kubukoz commented 6 months ago

Solution 2 (#1326):

init: 3087, 3096, 3082, 3195, 3125ms
allRoutes: 893, 968, 975, 1008, 1032ms

Memory: ...well, hard to talk about this because it went up to ~400MB and then down to ~120, when the first GC cycle hit. Don't know what to make of that... guess I should've ran the previous experiments longer or forced a GC run.

Compared to the status quo, init only gets worse by 7.16%, allRoutes by 1.96%. Clearly beats solution 3 😬

Baccata commented 6 months ago

Alright, let's roll with that one. We can make the diff less prominent by using a by-name extension (Hints.lazily(Hints(....)) => Hints(...).lazily)

kubukoz commented 6 months ago

That's neat, looks like it doesn't make the perf worse either: compared to the "before", in this test round init took only 4.87% longer on average (3050.4ms, stddev 41.03ms), and allRoutes actually got faster by 4.16% (916.8ms, stddev 16.90ms).

I thought these were strange so I re-run the original 0.18.5 setup, it was marginally faster than before (0.5% in init, 3.85% in allRoutes).

Let's stick with .lazily.

disneystreaming / smithy4s