janestreet / base

Standard library for OCaml
MIT License
848 stars 124 forks source link

List functions requiring same length arguments have runtime dependent on longest argument #123

Closed mkhan45 closed 2 years ago

mkhan45 commented 2 years ago

I ran into this when writing an interpreter which often attempts to zip lists of very short length with lists of very long length.

Here's an example script:

open Base
open Stdio

let () =
    let len = Array.get (Sys.get_argv ()) 1 |> Int.of_string in
    match List.zip (List.range 0 1) (List.range 0 len) with
    | Ok _ -> printf "Ok\n"
    | _ -> printf "Not Ok :(\n"
~/test Δ time dune exec ./test.exe 10_000_000
Not Ok :(

________________________________________________________
Executed in    1.10 secs      fish           external
   usr time  977.44 millis  312.00 micros  977.13 millis
   sys time  125.85 millis  263.00 micros  125.59 millis

~/test Δ time dune exec ./test.exe 100
Not Ok :(

________________________________________________________
Executed in   38.21 millis    fish           external
   usr time   24.20 millis  292.00 micros   23.91 millis
   sys time   14.00 millis  246.00 micros   13.76 millis

From perf, I saw that check_length2 is the cause of the problem, but it's not immediately obvious to me why after reading through the source.

This simple function fixes it for my use case:

let rec list_equal_len lhs rhs = match lhs, rhs with
    | [], [] -> true
    | [], _ | _, [] -> false
    | _::xs, _::ys -> list_equal_len xs ys
bcc32 commented 2 years ago

Ah, I was confused too for a while, but it turns out this performance bug has been fixed already on GitHub but not yet in opam. If you look at the code for v0.14, which is the latest opam release, you will see the performance issue you identified in the source code:

https://github.com/janestreet/base/blob/e9dc05f5ca7a5a995e95d8617a12b0ecb4f5408b/src/list.ml#L168-L172

mkhan45 commented 2 years ago

That's good to know. Is there a timeline for when the next version will be released on opam?

I guess this issue should be closed

bcc32 commented 2 years ago

Not sure, probably @aalekseyev or @cwong-ocaml would know.

aalekseyev commented 2 years ago

We're aiming for doing it this year, but I'm afraid I don't have a more specific timeline.