gnieh / fs2-data

streaming data parsing and transformation library
https://fs2-data.gnieh.org
Apache License 2.0
153 stars 27 forks source link

Significant performance degradation with new json rendering Introduced in 1.11 #634

Open recons opened 2 months ago

recons commented 2 months ago

In our case, we are dealing with a large stream of big objects, which are serialized into json and written to a file. After upgrading to 1.11, we noticed higher execution time, an increase in cpu usage and more time spent in gc.

Below is a minimal code snippet to demonstrate the issue:

fs2.Stream.range(0, 10000000)
      .map(Json.fromInt)
      .covary[IO]
      .through(ast.tokenize)
      .through(wrap.asTopLevelArray)
      .through(render.compact)
      .compile
      .drain
      .timed
      .map(_._1.toMillis.toString())
      .flatMap(IO.println)

On my machine (MBP 16 2019, 2.6 GHz 6-Core Intel Core i7), the above code takes:

While this isn't a full-fledged analysis, we suspect that the regression is related to the new rendering in 1.11. We plan to stick with version 1.10 for now, but further insight would be appreciated. Perhaps I'm missing something? Could this performance hit be expected under certain conditions, or is it likely a regression?

satabin commented 2 months ago

Hi. We changed the rendering in 1.11, which is probably the cause. I am thinking of reverting the compact rendering to what it was in 1.10. The introduced change is too impatfull for this use case.

recons commented 2 months ago

Hi, thanks for your reply.

Unfortunately, it's not just the compact rendering that's been affected. In the same use case, when switching to render.prettyPrint, the process doesn't even complete after 5 minutes. It also produces numerous warnings, such as:

[WARNING] Your app's responsiveness to a new asynchronous event (such as a new connection, an upstream response, or a timer) was in excess of 100 milliseconds. Your CPU is probably starving. Consider increasing the granularity of your delays or adding more cedes. This may also be a sign that you are unintentionally running blocking I/O operations (such as File or InetAddress) without the blocking combinator.

On the other hand, using the deprecated render.pretty takes only 23 seconds and completes without any warnings at all.

satabin commented 2 months ago

The pretty printer has been entirely changed as well. It looks like some ceding is necessary in its implementation. I will investigate.

satabin commented 1 month ago

Hi @recons,

I ran some benchmarks and identified several sources for slow downs. I am now working on fixing them. I will open a PR as soon as I have something usable, so that you can try it with your use case.