softwaremill / tapir

Rapid development of self-documenting APIs
https://tapir.softwaremill.com
Apache License 2.0

[BUG] StackOverflowError when request body has collection with 2000 elements #3265

Closed: serhiizapalskyi closed this issue 1 year ago

serhiizapalskyi commented 1 year ago

Tapir version: 1.8.2

Scala version: 2.13.11

Json4s version: 4.0.6

Hello, could you please help? I'm getting a StackOverflowError when processing a request body containing a JSON payload with a large array of 2000+ Strings.

I've tried different tapir, sttp and json4s versions; the behavior is the same.

How to reproduce? Here is the code of a simple endpoint. The request and stack trace are attached.

endpoint
  .in(
    ApiSection / "path" / path[String]("any")
  )
  .in(jsonBody[ExploreQueryRequest].examples(documentation.queryDatasource.inExamples))
  .in(applicationJson)
  .post
  .out(stringBody)
  .serverLogicSuccess(_ => Future.successful("Hello world"))

When changing the line .in(jsonBody[Request].examples(documentation.queryDatasource.inExamples)) to .in(stringBody), the error doesn't occur.
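
For reference, a sketch of the variant that does not trigger the error: the same endpoint as above, with the JSON body input replaced by a plain string body, so the derived schema's validation is never invoked for the body.

endpoint
  .in(
    ApiSection / "path" / path[String]("any")
  )
  .in(stringBody) // instead of jsonBody[ExploreQueryRequest].examples(...)
  .in(applicationJson)
  .post
  .out(stringBody)
  .serverLogicSuccess(_ => Future.successful("Hello world"))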

13:42:54.352 [armeria-common-worker-kqueue-2-4] WARN i.n.u.c.AbstractEventExecutor - A task raised an exception. Task: com.linecorp.armeria.common.DefaultContextAwareRunnable@406e3b62
java.lang.StackOverflowError: null
  at scala.collection.IterableOnceOps.toArray$(IterableOnce.scala:1339)
  at scala.collection.AbstractIterable.toArray(Iterable.scala:933)
  at scala.collection.immutable.ArraySeq$.$anonfun$newBuilder$1(ArraySeq.scala:286)
  at scala.collection.mutable.Builder$$anon$1.result(Builder.scala:85)
  at scala.collection.StrictOptimizedIterableOps.filterImpl(StrictOptimizedIterableOps.scala:231)
  at scala.collection.StrictOptimizedIterableOps.filterImpl$(StrictOptimizedIterableOps.scala:222)
  at scala.collection.immutable.ArraySeq.filterImpl(ArraySeq.scala:35)
  at scala.collection.StrictOptimizedIterableOps.filter(StrictOptimizedIterableOps.scala:218)
  at scala.collection.StrictOptimizedIterableOps.filter$(StrictOptimizedIterableOps.scala:218)
  at scala.collection.immutable.ArraySeq.filter(ArraySeq.scala:35)
  at sttp.tapir.generic.auto.SchemaMagnoliaDerivation.mergeAnnotations(SchemaMagnoliaDerivation.scala:109)
  at sttp.tapir.generic.auto.SchemaMagnoliaDerivation.subtypeNameToSchemaName(SchemaMagnoliaDerivation.scala:54)
  at sttp.tapir.generic.auto.SchemaMagnoliaDerivation.$anonfun$split$4(SchemaMagnoliaDerivation.scala:86)
  at magnolia1.SealedTrait.rec$1(interface.scala:642)
  at magnolia1.SealedTrait.split(interface.scala:647)
  at sttp.tapir.generic.auto.SchemaMagnoliaDerivation.$anonfun$split$3(SchemaMagnoliaDerivation.scala:84)
  at sttp.tapir.Schema.applyValidation(Schema.scala:234)
  at sttp.tapir.Schema.$anonfun$applyValidation$4(Schema.scala:221)
  at scala.Option.map(Option.scala:242)
  at sttp.tapir.Schema.$anonfun$applyValidation$3(Schema.scala:221)
  at scala.collection.immutable.List.flatMap(List.scala:293)
  at sttp.tapir.Schema.applyFieldsValidation$1(Schema.scala:221)
  at sttp.tapir.Schema.applyValidation(Schema.scala:229)
  at sttp.tapir.Schema.$anonfun$applyValidation$11(Schema.scala:235)
  at scala.Option.map(Option.scala:242)
  at sttp.tapir.Schema.applyValidation(Schema.scala:235)
  at sttp.tapir.Schema.$anonfun$applyValidation$13(Schema.scala:237)
  at scala.Option.map(Option.scala:242)
  at sttp.tapir.Schema.applyValidation(Schema.scala:237)
  at sttp.tapir.Schema.$anonfun$applyValidation$4(Schema.scala:221)
  at scala.Option.map(Option.scala:242)
  at sttp.tapir.Schema.$anonfun$applyValidation$3(Schema.scala:221)
  at scala.collection.immutable.List.flatMap(List.scala:293)
  at sttp.tapir.Schema.applyFieldsValidation$1(Schema.scala:221)
  .....

13:43:14.295 [armeria-common-worker-kqueue-2-5] WARN c.l.a.s.DefaultUnhandledExceptionsReporter - Observed 1 exception(s) that didn't reach a LoggingService in the last 10000ms(10000000000ns). Please consider adding a LoggingService as the outermost decorator to get detailed error logs. One of the thrown exceptions:
com.linecorp.armeria.server.RequestTimeoutException: null
  at com.linecorp.armeria.server.RequestTimeoutException.get(RequestTimeoutException.java:36)
  at com.linecorp.armeria.internal.common.CancellationScheduler.invokeTask(CancellationScheduler.java:468)
  at com.linecorp.armeria.internal.common.CancellationScheduler.lambda$init0$2(CancellationScheduler.java:123)
  at com.linecorp.armeria.common.DefaultContextAwareRunnable.run(DefaultContextAwareRunnable.java:45)
  at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
  at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153)
  at io.netty.util.concurrent.AbstractEventExecutor.runTask$$$capture(AbstractEventExecutor.java:174)
  at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java)
  at io.netty.util.concurrent.AbstractEventExecutor.safeExecute$$$capture(AbstractEventExecutor.java:167)
  at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java)
  at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
  at io.netty.channel.kqueue.KQueueEventLoop.run(KQueueEventLoop.java:300)
  at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
  at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
  at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
  at java.base/java.lang.Thread.run(Thread.java:829)

request.json stacktrace.txt

Thank you in advance! Best regards, Serhii Zapalskyi

adamw commented 1 year ago

Can you share the ExploreQueryRequest data structure, and the way validations are applied to it? Or some simplified version which would reproduce the issue? I can't reproduce the StackOverflowError locally.

serhiizapalskyi commented 1 year ago

The structure is a bit complicated, so I've simplified it. I've updated the request and the code below to use this simplified structure.

 endpoint
      .in(extractFromRequest(identity))
      .in(
        ApiSection / "query" / path[String]("any")
      )
      .in(jsonBody[Request])
      .in(applicationJson)
      .post
      .out(stringBody)
      .serverLogicSuccess(req => Future.successful("Success"))

// Data structure

import org.json4s._

case class Request(filter: SimpleFilter)

sealed trait SimpleFilter {
  def name: Option[String]

  def typeName: String
}

sealed trait SimpleLeafFilter extends SimpleFilter {
  def fieldName: String
}

case class SimpleEqualsFilter(fieldName: String, value: SimpleLiteral, name: Option[String] = None)
  extends SimpleLeafFilter {

  override def typeName: String = "equals"
}

case class SimpleInFilter(fieldName: String, values: List[SimpleLiteral], name: Option[String] = None)
  extends SimpleLeafFilter {

  override def typeName: String = "in"

}

case class SimpleAndFilter(filters: List[SimpleFilter], name: Option[String] = None) extends SimpleBinaryFilter {
  override def typeName: String = "and"
}

sealed trait SimpleBinaryFilter extends SimpleFilter {
  val filters: List[SimpleFilter]
}

sealed trait SimpleLiteral 

sealed trait SimpleNumericLiteral extends SimpleLiteral 

case class Integer(value: Long) extends SimpleNumericLiteral 

case class Float(value: Double) extends SimpleNumericLiteral 

case class Text(value: String) extends SimpleLiteral

case class Bool(value: Boolean) extends SimpleLiteral 

case class GenericArray(values: Vector[SimpleLiteral]) extends SimpleLiteral

case object Null extends SimpleLiteral 

/** Implementation of parsing for in-house filter expression language.
 * This filter expression language is heavily influenced by https://druid.apache.org/docs/latest/querying/filters.html
 */
case object SimpleFilterParser {

  private def literalConverter(jVal: JValue): SimpleLiteral = {
    jVal match {
      case JString(s) => Text(s)
      case JInt(i)    => Integer(i.toLong)
      case JLong(l)   => Integer(l)
      case JBool(b)   => Bool(b)
      case JDouble(d) => Float(d)
      case _          => throw new IllegalStateException(s"Don't know how to parse $jVal")
    }
  }

  def parse(filterExpr: JValue): SimpleFilter = {
    val name = filterExpr \ "name" match {
      case JString(s)   => Some(s)
      case _            => None
    }

    val fieldOpt = (filterExpr \ "field") match {
      case JString(f) => Some(f)
      case _          => None
    }

    (filterExpr \ "type").asInstanceOf[JString].s match {
      case "in" =>
        val array  = (filterExpr \ "values1").asInstanceOf[JArray].arr
        val values = array.map(literalConverter)
        val field  = fieldOpt.get
        SimpleInFilter(field, values, name)

      case "equals" =>
        val field = fieldOpt.get
        val value = filterExpr \ "value1"
        SimpleEqualsFilter(field, literalConverter(value), name)

      case "and" =>
        val subFilters = (filterExpr \ "filters1").asInstanceOf[JArray]
        SimpleAndFilter(subFilters.arr.map(f => parse(f.asInstanceOf[JObject])), name)

      case x =>
        throw new IllegalStateException(s"Couldn't parse filter of type $x")
    }
  }
}

object SimpleFilterSerializer
    extends CustomSerializer[SimpleFilter](_ =>
      ( {
        case jv: JValue =>
          SimpleFilterParser.parse(jv)
      }, {
        case x: SimpleFilter =>
          ???
      }
      )
)

implicit val formats: Formats = DefaultFormats.preservingEmptyValues + SimpleFilterSerializer
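
For completeness, a small sketch exercising the parser/serializer pair directly with json4s, assuming the definitions above are in scope (json4s-jackson is used here, json4s-native works the same via its own JsonMethods; the object name ParseExample and the sample JSON values are illustrative, though the field names follow what SimpleFilterParser expects):

import org.json4s._
import org.json4s.jackson.JsonMethods.parse

object ParseExample extends App {
  implicit val formats: Formats = DefaultFormats.preservingEmptyValues + SimpleFilterSerializer

  // "type", "field" and "values1" match what SimpleFilterParser looks up
  val json =
    """{ "filter": { "type": "in", "field": "country", "values1": ["PL", "DE", "UA"] } }"""

  val request = parse(json).extract[Request]
  println(request) // a Request wrapping SimpleInFilter("country", List(Text("PL"), Text("DE"), Text("UA")), None)
}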

request.json

serhiizapalskyi commented 1 year ago

By the way, we've found a workaround.

If I add the code below (in addition to import sttp.tapir.generic.auto._), I don't get the StackOverflowError:

  implicit def filterSchema: Schema[SimpleFilter]   = Schema.derived[SimpleFilter]
  implicit def literalSchema: Schema[SimpleLiteral] = Schema.derived[SimpleLiteral]
  implicit def textSchema: Schema[Text]             = Schema.derived[Text]

adamw commented 1 year ago

Thanks for the work-around. This hints at something with Magnolia, but then the number of elements also matters. So far I've minimized this to:

object X extends App {
  case class SimpleInFilter(values: List[SimpleLiteral])

  sealed trait SimpleLiteral

  sealed trait SimpleNumericLiteral extends SimpleLiteral
  case class Text(value: String) extends SimpleLiteral
  case class GenericArray(values: Vector[SimpleLiteral]) extends SimpleLiteral

  //

  import sttp.tapir._
  import sttp.tapir.generic.auto._

  println(implicitly[Schema[SimpleInFilter]].applyValidation(SimpleInFilter((1 to 1000).map(i => Text(s"a$i")).toList)))
}

adamw commented 1 year ago

It does seem to be an issue with Magnolia's derivation, and with mixed types of collections. If there are only Vectors, or only Lists, the schemas are derived correctly and there's no stack overflow.
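
For illustration, a sketch of the homogeneous-collection variant of the snippet above (GenericArray now holds a List rather than a Vector; the object name XHomogeneous is just for this sketch). Per the observation above, this derives the schema and runs validation without overflowing:

object XHomogeneous extends App {
  case class SimpleInFilter(values: List[SimpleLiteral])

  sealed trait SimpleLiteral

  sealed trait SimpleNumericLiteral extends SimpleLiteral
  case class Text(value: String) extends SimpleLiteral
  // List instead of Vector: a single collection type across the whole hierarchy
  case class GenericArray(values: List[SimpleLiteral]) extends SimpleLiteral

  import sttp.tapir._
  import sttp.tapir.generic.auto._

  println(implicitly[Schema[SimpleInFilter]].applyValidation(SimpleInFilter((1 to 1000).map(i => Text(s"a$i")).toList)))
}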

adamw commented 1 year ago

So this turns out to be an issue with the combination of the following: automatic Magnolia-based schema derivation in Scala 2, a recursive sealed trait hierarchy, and the same element type appearing in different collection types (List and Vector) within that hierarchy.

This causes problems with caching the intermediate results so that they can be reused in the recursive calls. I'm not sure if a fix is possible, given Scala 2's macro+implicits interactions, and if so, it would be quite hard to arrive at.

So instead I'm going to suggest the work-arounds (mostly mentioned above): deriving the schemas for the recursive sealed traits explicitly with Schema.derived and bringing those implicit instances into scope, in addition to import sttp.tapir.generic.auto._.
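
A self-contained sketch for checking that the work-around takes effect for the model posted earlier (the object name WorkaroundCheck is just for this sketch; it reuses the applyValidation call from the minimized repro, and the 2000-element list mirrors the original payload):

object WorkaroundCheck extends App {
  import sttp.tapir._
  import sttp.tapir.generic.auto._

  // work-around from the earlier comment: explicit, semi-automatically derived
  // instances for the recursive parts of the model, in addition to the auto import
  implicit def filterSchema: Schema[SimpleFilter]   = Schema.derived[SimpleFilter]
  implicit def literalSchema: Schema[SimpleLiteral] = Schema.derived[SimpleLiteral]
  implicit def textSchema: Schema[Text]             = Schema.derived[Text]

  val req = Request(SimpleInFilter("field", (1 to 2000).map(i => Text(s"a$i")).toList))
  println(implicitly[Schema[Request]].applyValidation(req))
}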

dima-dermanskyi-wm commented 1 year ago

Thank you @adamw