dbuenzli / fmt

OCaml Format pretty-printer combinators
http://erratique.ch/software/fmt
ISC License
71 stars 26 forks source link

Fmt.styled gets slower over time #52

Closed talex5 closed 2 years ago

talex5 commented 3 years ago

I've been investigating a problem with my programs slowing down over time, but only when logging is configured. The problem seems to be with fmt (0.8.10)!

With this test program:

let n = 10000

let () =
  while true do
    let t0 = Unix.gettimeofday () in
    for _i = 1 to n do
      ignore @@ Fmt.str "Hello %a" Fmt.(styled `Blue string) "world"
    done;
    let t1 = Unix.gettimeofday () in
    Printf.printf "Formatted %.0f messages/second\n%!" (float n /. (t1 -. t0))
  done

I get:

$ dune exec -- ./test.exe
Formatted 522518 messages/second
Formatted 411017 messages/second
Formatted 298257 messages/second
Formatted 215499 messages/second
Formatted 155861 messages/second
Formatted 110564 messages/second
Formatted 77171 messages/second
Formatted 54181 messages/second
Formatted 37493 messages/second
Formatted 25499 messages/second
Formatted 17254 messages/second

Replacing Fmt.(styled `Blue string) with Fmt.string shows no slow-down.

dbuenzli commented 3 years ago

Can't dive into this right now but I suspect a problem with the strategy to make styles composable. This was introduced by a series of patch that ends here https://github.com/dbuenzli/fmt/commit/594a1d76f6e2e51dd9ee37f67533916fc26b6e9a

dbuenzli commented 3 years ago

Just tried to have a quick look at that.

First my experience is a little bit different that yours. Performance actually goes up and down [1], likely with gc occurring.

The culprit is likely this function. Each Fmt.str uses Fmt.asprintf which allocates a new ppf and that doesn't seem to match the expectations expressed in this commit (see the comment).

Maybe we should switch back to the horrible hack we had before to store arbitrary metadata in the formatter itself by hijacking the tags mecanism.

[1]: Formatted 250658 messages/second Formatted 223419 messages/second Formatted 153010 messages/second Formatted 121745 messages/second Formatted 87265 messages/second Formatted 240923 messages/second Formatted 170360 messages/second Formatted 137997 messages/second Formatted 102244 messages/second Formatted 100184 messages/second Formatted 240310 messages/second Formatted 170856 messages/second Formatted 128041 messages/second Formatted 92355 messages/second Formatted 65714 messages/second Formatted 77845 messages/second Formatted 229642 messages/second

talex5 commented 3 years ago

Hmm, it seems to depend on the version of OCaml: 4.12 goes up and down, but 4.13 only goes down!

dbuenzli commented 3 years ago

That's strange does this uncover a bug ephemerons ?

Note that you can mitigate by creating your own formatter. @pqwy suggests reusing the formatter in Fmt.str but I'm not sure I want to do that with multicore at the corner. Are there plans to add domain local storages API both on stock OCaml and multicore ?

I'm rather tempted to go back to subvert the tags mecanism, which is uncompositional anyways, than using sophisticated gc mecanisms.

talex5 commented 3 years ago

I was able to work around it by configuring my log reporter to use Logs.pp_header instead of Logs_fmt.pp_header. I admit I was a bit surprised to see ephemerons in the memtrace output while debugging this!

dbuenzli commented 3 years ago

The problem is that the Format API does not allow us to associate arbitrary data to formatters (in Fmt's case to indicate whether we want UTF-8 output and ANSI styling). Using ephemerons is one way to side step the issue.

I was able to work around it by configuring my log reporter to use Logs.pp_header instead of Logs_fmt.pp_header.

Well yes Logs.pp_header does not use Fmt.

dbuenzli commented 3 years ago

I was able to work around it by configuring my log reporter to use Logs.pp_header instead of Logs_fmt.pp_header.

Just to make things clear your log reporter eventually outputs using Fmt.str (or Format.asprintf) ?

talex5 commented 3 years ago

My reporter uses Fmt.kstr and outputs with output_string stderr and/or to an in-memory ring-buffer:

https://github.com/talex5/wayland-proxy-virtwl/blob/22d4bca6170068f4a95a93114e48958d8cea54ae/trace.ml#L172

The use of Logs_fmt was left over from an older version (without a ring-buffer), and wasn't doing anything anyway.

dbuenzli commented 3 years ago

My reporter uses Fmt.kstr

Yes that also allocates one formatter on each call.

The use of Logs_fmt was left over from an older version (without a ring-buffer), and wasn't doing anything anyway.

Since you can't access the formatter, these functions use the default style renderer which is None. So yes they wouldn't do anything in your case.

dbuenzli commented 3 years ago

Hmm, it seems to depend on the version of OCaml: 4.12 goes up and down, but 4.13 only goes down!

@talex5 It's a bit odd, I tested with 4.13 on macOS 11.6 with ocamlopt.

First I get an order of magnitude more messages per second (!) and the rate only halves (rather than thirds) before going up again.

dbuenzli commented 3 years ago

@talex5, the plan is to revert to d200ffa4fae40168a5c68d4 (i.e. get rid of ephemerons) but I can't repro:

but 4.13 only goes down!

if it really does it seems a bit worrying and worth pinging upstream.

dbuenzli commented 3 years ago

@talex5 I have pushed a solution for the problem. Could you please confirm that solves the problem for you.

talex5 commented 3 years ago

Thanks - that's a lot faster!

$ dune exec -- ./test.exe
Formatted 4302733 messages/second
Formatted 5373180 messages/second
Formatted 5181351 messages/second
Formatted 5361503 messages/second
Formatted 5730706 messages/second
Formatted 5868622 messages/second
Formatted 5848165 messages/second
Formatted 5350560 messages/second
Formatted 5602116 messages/second
Formatted 5754293 messages/second
dbuenzli commented 2 years ago

Thanks the release is on its way.