CrepeGoat commented 11 months ago

system

Roc version github:roc-lang/roc/dc37b7a31d4fa2014ec159eda1d51c27832ddca6
run on macOS 13.4.1 arm64 (M1 macbook air)

setup

main.roc

# references
# - https://github.com/roc-lang/examples/tree/main/examples/CommandLineArgs
app "main"
    packages {
        pf: "https://github.com/roc-lang/basic-cli/releases/download/0.7.0/bkGby8jb0tmZYsy2hg1E_B2QrCgcSTxdUlHtETwm5m4.tar.br",
    }
    imports [pf.Stdout, pf.Task]
    provides [main] to pf

main : Task.Task {} I32
main =
    Stdout.line "w/e"

expect
    n = 86896
    list = List.range { start: At n, end: Before 0 }
    List.sortWith list Num.compare == List.range { start: At 1, end: At n }

problem

Running roc test --optimize terminates after ~20 seconds with the following correct test output:

$ roc test --optimize

0 failed and 1 passed in 20094 ms.

expected behavior

The given code should run faster relative to other languages, as described on the Roc website.

Specifically, I consider the above code to be roughly equivalent to the following Python code:

n = 10000000 # a much larger value!
l = list(reversed(range(n)))
l.sort()
l == list(range(n))

Timing this code via the *nix time command produces the following output:

$ time python3 -c "n = 10000000; l = list(reversed(range(n))); l.sort(); l == list(range(n))"
python3 -c   0.36s user 0.10s system 96% cpu 0.475 total

Note that the above results are for a data input ~115x larger than in the Roc code (10,000,000 vs. 86,896 elements). (At the moment the Roc code won't even run for data that large; see #6293.)
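A sketch of how the Python side could be timed from inside the program, so that interpreter startup and list construction are kept out of the number. The helper name `time_sort` and the smaller `n` are assumptions made for illustration. It also times a shuffled input, because CPython's `list.sort` detects descending runs and reverses them, which makes a fully reversed list close to a best case.

```python
import random
import time


def time_sort(values):
    """Return (seconds, sorted_copy) for sorting a copy of `values`."""
    data = list(values)  # copy, so the caller's list is left untouched
    start = time.perf_counter()
    data.sort()
    elapsed = time.perf_counter() - start
    return elapsed, data


if __name__ == "__main__":
    # Assumption: n is smaller than the 10_000_000 used above to keep the run short.
    n = 1_000_000
    reversed_input = list(range(n - 1, -1, -1))
    shuffled_input = reversed_input[:]
    random.Random(0).shuffle(shuffled_input)  # fixed seed for repeatability

    for label, values in [("reversed", reversed_input), ("shuffled", shuffled_input)]:
        seconds, result = time_sort(values)
        assert result == list(range(n))  # sanity check: output is really sorted
        print(f"{label}: {seconds:.4f} s")
```

If the shuffled case is noticeably slower than the reversed one, the reversed-input benchmark is probably flattering Python somewhat, though likely not by enough to explain the gap reported here.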

Anton-4 commented 11 months ago

Thanks for this nice issue report @CrepeGoat! One thing that comes to mind is that running with the test command (roc test) will include the compilation time. So for a more accurate comparison I would do roc build myfile.roc --optimize and then time ./myfile. Note that this would require removing the expect and switching to Stdout.line.

Anton-4 commented 11 months ago

Using sortAsc instead of sortWith would also make the comparison more equal. Timing inside Roc, like we did here, may also be useful for gaining additional insight.

CrepeGoat commented 11 months ago

Thanks for the kind words @Anton-4 :)

You're right, roc test is probably not the best place to do performance benchmarks; I will keep this in mind for timing comparisons moving forwards. I personally just used roc test because 1) it felt easier at the time, and 2) I didn't expect it to make a ~50x difference (i.e., the 0.36s Python run vs the 20s Roc run). Specifically, from my rough observation the Roc compilation likely took ~1-2 seconds - I'm not sure where the other 18 seconds went.
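For reference, a quick check of the ratios implied by the figures quoted in this thread, written as a small Python sketch. The variable names are illustrative; the numbers are the ones reported above (the Roc time includes compilation, and the Python time is the `user` figure from `time`). The per-element ratio is a crude normalization that ignores the log factor of a comparison sort.

```python
# Figures quoted earlier in this thread.
roc_seconds = 20.094     # `roc test --optimize`, includes compilation
roc_n = 86_896           # elements sorted in the Roc test
py_user_seconds = 0.36   # user time reported by `time python3 -c ...`
py_n = 10_000_000        # elements sorted in the Python one-liner

time_ratio = roc_seconds / py_user_seconds
size_ratio = py_n / roc_n
# Crude normalization: seconds per element, Roc relative to Python.
per_element_ratio = (roc_seconds / roc_n) / (py_user_seconds / py_n)

print(f"time ratio: ~{time_ratio:.0f}x")
print(f"input size ratio: ~{size_ratio:.0f}x")
print(f"time per element ratio: ~{per_element_ratio:.0f}x")
```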

I'll try using sortAsc and direct internal timing, thanks for the tips! As an aside, does |> List.sortAsc not compile to the same code as |> List.sortWith Num.compare?

Anton-4 commented 11 months ago

> does |> List.sortAsc not compile to the same code as |> List.sortWith Num.compare?

It appears they do :) I did not do a diligent check before.

> I'm not sure where the other 18 seconds went.

Uhu, I do expect we have a real performance problem here, but it's good to narrow things down.

CrepeGoat commented 11 months ago

> It appears they do :) I did not do a diligent check before.

well I did no check before 😂 so thanks for putting in that work to confirm 👍

> it's good to narrow things down.

Agreed! I think we're on the same page here 🙂

CrepeGoat commented 10 months ago

Rewrote the example (mostly) as suggested:

main.roc

app "test"
    packages {
        pf: "https://github.com/roc-lang/basic-cli/releases/download/0.7.0/bkGby8jb0tmZYsy2hg1E_B2QrCgcSTxdUlHtETwm5m4.tar.br",
    }
    imports [
        pf.Stdout,
        pf.Task,
    ]
    provides [main] to pf

main : Task.Task {} I32
main =
    n = 86896
    list = List.range { start: At n, end: Before 0 }
    listSorted = List.sortAsc list
    Stdout.line "hello"

Building and running this code now generates the following timing information:

$ roc build --optimize
...(build stuff)

$ time ./main
hello
./main  16.04s user 0.03s system 99% cpu 16.102 total

Running objdump -d main generates ~600k lines of instructions, so I'm electing not to look into the exact instructions myself 😅
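One way to narrow this down without reading the disassembly, sketched in Python to match the comparison code earlier in the thread: rebuild the Roc program with a few different values of n, time each run, and estimate how the run time grows with n. The helper `growth_exponent` is hypothetical, and the timings in the usage section are synthetic values generated from a formula, not measurements from this issue.

```python
import math


def growth_exponent(measurements):
    """Least-squares slope of log(seconds) against log(n).

    `measurements` is a list of (n, seconds) pairs with at least two
    distinct values of n and strictly positive timings.
    """
    xs = [math.log(n) for n, _ in measurements]
    ys = [math.log(t) for _, t in measurements]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    # Synthetic timings for illustration only, NOT measured from the Roc binary:
    # generated from t = c * n**2 to show what quadratic growth looks like.
    synthetic = [(n, 2e-9 * n**2) for n in (10_000, 20_000, 40_000, 80_000)]
    print(f"estimated exponent: {growth_exponent(synthetic):.2f}")
```

An exponent only slightly above 1 would be consistent with the n log n behavior expected of a comparison sort; an exponent near 2 would suggest something quadratic in the sort or in the surrounding list handling.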