Kakadu / zanuda

OCaml linter
GNU Lesser General Public License v3.0
63 stars 8 forks source link

string-concat linter shouldn't recommend Printf #12

Closed edwintorok closed 1 year ago

edwintorok commented 1 year ago

https://kakadu.github.io/zanuda/lints/index.html#string_concat recommends String.concat instead of ^ (good), but it also recommends Format.printf. 2 problems with that:

Here is a very quickly written microbenchmark:

let conc3 a b c d = a ^ b ^ c ^ d
let concp a b c d = Printf.sprintf "%s%s%s%s" a b c d
let concf a b c d = Format.sprintf "%s%s%s%s" a b c d
let concs a b c d = String.concat "" [a;b;c;d]

let s = String.init 100000 (fun _ -> 'x')

open Core
open Core_bench

let () =
    Command_unix.run @@ Bench.make_command
    [ Bench.Test.create ~name:"^" (fun () -> conc3 s s s s)
    ; Bench.Test.create ~name:"Printf.sprintf" (fun () -> concp s s s s)
    ; Bench.Test.create ~name:"format.sprintf" (fun () -> concf s s s s)
    ; Bench.Test.create ~name:"String.concat" (fun () -> concs s s s s)
]

Results with OCaml 5.0 on AMD Ryzen 9 7950X:

┌────────────────┬──────────┬─────────┬──────────┬──────────┬────────────┐
│ Name           │ Time/Run │ mWd/Run │ mjWd/Run │ Prom/Run │ Percentage │
├────────────────┼──────────┼─────────┼──────────┼──────────┼────────────┤
│ ^              │  75.87us │         │ 112.51kw │          │     60.35% │
│ Printf.sprintf │ 120.61us │  74.01w │ 164.70kw │    3.60w │     95.94% │
│ format.sprintf │ 125.72us │ 309.00w │ 164.72kw │   24.70w │    100.00% │
│ String.concat  │  39.38us │  12.00w │  50.00kw │    0.50w │     31.32% │
└────────────────┴──────────┴─────────┴──────────┴──────────┴────────────┘

I think this is because Printf/Format would internally use a buffer that grows (similar problem to the repeated concatenation with ^), whereas String.concat allocates the correct size to begin by summing up the string sizes.

Kakadu commented 1 year ago

Thank you for your report. It looks like the standard ^ is much more performant than I though.

I did some benchmarks, https://github.com/Kakadu/fastware/tree/master/concat_str and I started thinking that this lint is not worth mentioning at all?.. What do you think?

edwintorok commented 1 year ago

I think it is still useful to avoid ^ when used in non-constant contexts, e.g. List.fold_left ( ^ ) is probably a bad idea (where the list could be thousands of elements long), using String.concat or a Buffer would be better. If you just have a small constant number of ^ then it probably doesn't matter much what you use.

Can the linter distinguish between the two usages?