Open PallHaraldsson opened 1 year ago
Note that the script is already using
JULIA_INTERPRET_FLAGS = ['--compile=min'] # See: https://github.com/JuliaLang/julia/issues/41360#issuecomment-872075102
I reran with Julia master and got:
Lang-uage | Temp-lated | Check Time [us/fn] | Compile Time [us/fn] | Build Time [us/fn] | Run Time [us/fn] | Check RSS [kB/fn] | Build RSS [kB/fn] | Exec Version | Exec Path |
---|---|---|---|---|---|---|---|---|---|
D | No | 7.6 (3.7x) | 17.1 (10.7x) | 21.2 (12.0x) | 78 (4.0x) | 4.4 (9.8x) | 13.4 (30.1x) | v2.103.0-rc.1-87-g7e84fb3333-dirty | dmd |
D | No | 5.0 (2.4x) | 91.6 (57.1x) | 92.4 (52.4x) | 325 (16.6x) | 4.8 (10.7x) | 19.8 (44.3x) | 1.30.0 | ldmd2 |
D | No | 7.3 (3.5x) | 232.8 (145.2x) | 231.3 (131.0x) | 64 (3.2x) | 4.6 (10.3x) | 19.2 (43.1x) | 11.3.0 | gdc |
D | Yes | 19.9 (9.6x) | 32.2 (20.1x) | 36.0 (20.4x) | 49 (2.5x) | 12.6 (27.8x) | 22.0 (49.4x) | v2.103.0-rc.1-87-g7e84fb3333-dirty | dmd |
D | Yes | 10.5 (5.1x) | 97.7 (61.0x) | 100.0 (56.7x) | 272 (13.9x) | 12.9 (28.6x) | 28.9 (64.8x) | 1.30.0 | ldmd2 |
D | Yes | 13.3 (6.5x) | 244.7 (152.7x) | 241.7 (136.9x) | 62 (3.1x) | 13.4 (29.6x) | 28.8 (64.6x) | 11.3.0 | gdc |
C | No | 2.1 (best) | 1.6 (best) | 1.8 (best) | 20 (best) | 0.5 (best) | 0.4 (best) | 0.9.27 | tcc |
C | No | 9.4 (4.6x) | 293.4 (183.1x) | 303.0 (171.6x) | 36 (1.9x) | 2.7 (6.0x) | 13.6 (30.6x) | 12.1.0 | gcc |
C | No | 5.9 (2.9x) | 207.8 (129.7x) | 203.7 (115.4x) | 60 (3.1x) | 2.7 (6.1x) | 14.2 (31.7x) | 9.5.0 | gcc-9 |
C | No | 6.1 (3.0x) | 217.8 (135.9x) | 219.4 (124.3x) | 37 (1.9x) | 2.7 (6.1x) | 14.2 (31.8x) | 10.4.0 | gcc-10 |
C | No | 6.7 (3.3x) | 228.2 (142.4x) | 221.2 (125.3x) | 38 (1.9x) | 2.6 (5.9x) | 14.1 (31.7x) | 11.3.0 | gcc-11 |
C | No | 10.1 (4.9x) | 298.7 (186.4x) | 299.2 (169.5x) | 23 (1.1x) | 2.8 (6.2x) | 13.6 (30.6x) | 12.1.0 | gcc-12 |
C | No | 18.1 (8.8x) | 119.7 (74.7x) | 120.6 (68.3x) | 612 (31.2x) | 2.1 (4.6x) | sampling error | 14.0.0-1 | clang |
C | No | 18.1 (8.8x) | 115.6 (72.1x) | 118.6 (67.2x) | 545 (27.8x) | 2.1 (4.6x) | 9.4 (21.1x) | 14.0.0-1 | clang-14 |
C++ | No | 14.3 (7.0x) | 233.5 (145.7x) | 233.9 (132.5x) | 38 (1.9x) | 4.4 (9.7x) | 14.0 (31.5x) | 11.3.0 | g++ |
C++ | No | 14.3 (6.9x) | 229.4 (143.1x) | 232.4 (131.7x) | 34 (1.7x) | 4.4 (9.7x) | 14.1 (31.5x) | 10.4.0 | g++-10 |
C++ | No | 14.1 (6.8x) | 228.5 (142.6x) | 236.8 (134.1x) | 37 (1.9x) | 4.4 (9.7x) | 14.0 (31.5x) | 11.3.0 | g++-11 |
C++ | No | 23.1 (11.2x) | 315.3 (196.8x) | 318.3 (180.3x) | 65 (3.3x) | sampling error | 16.4 (36.8x) | 12.1.0 | g++-12 |
C++ | No | 26.0 (12.6x) | 128.9 (80.4x) | 127.9 (72.5x) | 541 (27.6x) | 2.2 (4.8x) | 9.4 (21.1x) | 14.0.0-1 | clang |
C++ | No | 25.2 (12.2x) | 129.4 (80.7x) | 132.7 (75.2x) | 541 (27.6x) | 2.2 (4.8x) | 9.4 (21.1x) | 14.0.0-1 | clang-14 |
C++ | Yes | 30.5 (14.8x) | 278.3 (173.6x) | 277.9 (157.5x) | 28 (1.4x) | 8.0 (17.7x) | 20.5 (46.0x) | 11.3.0 | g++ |
C++ | Yes | 30.9 (15.0x) | 278.2 (173.6x) | 279.6 (158.4x) | 27 (1.4x) | 8.0 (17.6x) | 21.8 (48.9x) | 10.4.0 | g++-10 |
C++ | Yes | 29.1 (14.1x) | 281.9 (175.9x) | 280.7 (159.0x) | 27 (1.4x) | 8.0 (17.7x) | 20.6 (46.1x) | 11.3.0 | g++-11 |
C++ | Yes | 41.7 (20.3x) | 371.1 (231.6x) | 366.9 (207.8x) | 26 (1.3x) | 8.0 (17.7x) | 20.6 (46.1x) | 12.1.0 | g++-12 |
C++ | Yes | 40.0 (19.4x) | 129.5 (80.8x) | 134.5 (76.2x) | 381 (19.5x) | 4.0 (8.8x) | 12.6 (28.3x) | 14.0.0-1 | clang |
C++ | Yes | 39.1 (19.0x) | 132.9 (82.9x) | 136.4 (77.3x) | 622 (31.7x) | 4.0 (8.8x) | 12.6 (28.3x) | 14.0.0-1 | clang-14 |
Ada | No | N/A | N/A | 943.7 (534.7x) | 68 (3.5x) | N/A | 31.3 (70.2x) | 12.1.0 | gnat |
Ada | No | N/A | N/A | 950.3 (538.4x) | 69 (3.5x) | N/A | 31.4 (70.3x) | 12.1.0 | gnat-12 |
Go | No | 16.0 (7.8x) | N/A | N/A | N/A | 4.0 (8.9x) | N/A | 1.18.3 | gotype |
N/A | N/A | N/A | N/A | N/A | N/A | 6.5 (14.5x) | 24.3 (54.4x) | N/A | N/A |
N/A | N/A | N/A | N/A | N/A | N/A | 11.2 (24.8x) | 23.5 (52.7x) | N/A | N/A |
Go | No | N/A | N/A | 166.0 (94.0x) | 132 (6.7x) | N/A | 28.3 (63.4x) | 1.18.3 | go |
N/A | N/A | N/A | N/A | N/A | N/A | N/A | 18.4 (41.1x) | N/A | N/A |
N/A | N/A | N/A | N/A | N/A | N/A | N/A | 50.3 (112.8x) | N/A | N/A |
Zig | No | 22.5 (10.9x) | N/A | 531.6 (301.2x) | 1150 (58.7x) | 5.6 (12.5x) | 34.8 (78.1x) | 0.11.0-dev.2545+311d50f9d | zig |
Zig | Yes | 27.2 (13.2x) | N/A | 547.6 (310.2x) | 1123 (57.3x) | 5.6 (12.5x) | 35.9 (80.5x) | 0.11.0-dev.2545+311d50f9d | zig |
Rust | No | 73.5 (35.7x) | N/A | 230.6 (130.6x) | 1474 (75.2x) | 13.6 (30.1x) | 29.7 (66.6x) | 1.70.0-nightly | rustc |
Rust | Yes | 84.9 (41.2x) | N/A | 148.9 (84.4x) | 1442 (73.6x) | 15.7 (34.8x) | 18.6 (41.6x) | 1.70.0-nightly | rustc |
Nim | No | 36.7 (17.8x) | N/A | 80.5 (45.6x) | 66 (3.3x) | 4.2 (9.3x) | 8.0 (18.0x) | 1.4.6 | nim |
C# | No | N/A | N/A | 21.6 (12.2x) | 384 (19.6x) | N/A | 4.4 (9.8x) | 6.12.0.182 | mcs |
N/A | N/A | N/A | N/A | N/A | N/A | N/A | 13.2 (29.6x) | N/A | N/A |
OCaml | No | N/A | N/A | 445.5 (252.4x) | 637 (32.5x) | N/A | 34.6 (77.5x) | 4.13.1 | ocamlopt |
OCaml | No | N/A | N/A | 87.6 (49.6x) | 907 (46.3x) | N/A | 17.7 (39.6x) | 4.13.1 | ocamlc |
Julia | No | N/A | N/A | 410.5 (232.6x) | N/A | N/A | 25.6 (57.4x) | 1.10.0-DEV | julia |
Julia | Yes | N/A | N/A | 335.6 (190.1x) | N/A | N/A | 25.4 (56.8x) | 1.10.0-DEV | julia |
Since the script uses --compile=min, you could alternatively drop it to see whether the default is better, or try e.g. -O0.
Anyway, it's at least going in the right direction. And 1.8.0-DEV is of course very outdated, and I expect 1.9.0 to be released in a week or so, so it's time for 1.10.0-DEV.
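For anyone wanting to repeat the flag comparison, here is a minimal sketch of timing one Julia file under the flag sets mentioned in this thread. It is not the benchmark script's actual code: the names `FLAG_SETS`, `julia_command`, `time_run` and `sweep` are made up for illustration, and adding `--startup-file=no` is my own assumption to keep runs comparable. The measured wall time includes Julia startup.

```python
import subprocess
import time

# Flag sets taken from the discussion; all are standard julia CLI options.
FLAG_SETS = {
    "default (-O2)": [],
    "--compile=min": ["--compile=min"],
    "-O0": ["-O0"],
    "-O1": ["-O1"],
    "--inline=no": ["--inline=no"],
}


def julia_command(flags, script, exe="julia"):
    """Build the argv list for one run (hypothetical helper)."""
    return [exe, "--startup-file=no", *flags, script]


def time_run(argv):
    """Return wall-clock seconds for one process run, startup included."""
    start = time.perf_counter()
    subprocess.run(
        argv,
        check=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start


def sweep(script, exe="julia", repeats=3):
    """Best-of-N wall time in seconds for each flag set."""
    return {
        name: min(
            time_run(julia_command(flags, script, exe)) for _ in range(repeats)
        )
        for name, flags in FLAG_SETS.items()
    }
```

Calling `sweep("generated.jl")` (the file name is a placeholder) returns a dict from flag set to seconds, which can be sorted to see which setting is fastest on a given Julia build.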
Using -O0 is slower than --compile=min. I checked.
Closing this.
Good to know about -O0 (also slower with the default -O2, or -O1?). You can get 25% faster parsing with JuliaSyntax.jl, but since it likely wasn't the bottleneck (your call to check, or decide to use that non-default option), I guess you can ignore it.
They did fix constprop to be faster, but there's no way to drop that optimization completely. Doing away with it, or with all optimization, doesn't seem to be a priority, because you don't compile code that often. In Julia 1.9, packages are fully precompiled to assembly. [It would be an option to change your code to a package/module, but I don't think a module alone will do it, and I think you want to test the actual compilation time, not ways to get around it.]
You could at least update the actual readme with the latest numbers, as in the table above. I might look into this extra 25% speedup; I understand if it's not a priority for you, and I'm not sure it is for me (i.e. for this benchmark).
FYI: I can confirm that with JuliaSyntax.jl I get 21% faster. It's easy to use, but for the benchmark as is it needs to be compiled into the sysimage.
Possibly you should also try compiling the code for the other languages with optimizations on, i.e. -O2 (or -O3?), for a fair comparison with Julia on its defaults? It might at least be useful to see two tables, i.e. add another one for that.
FYI "Add native UTF-8 Validation using fast shift based DFA #47880" was just merged and it seems 20x faster.
I'm not actually sure if the parser uses it, but instead of looking into it, we can see if the parser gets faster in the next nightly. So you may want to wait with publishing new results. [I only see the new parser calling isvalid for individual Chars, not Strings; what do you think Dlang does?]
Hi,
I think you have a long benchmark (or so I recall, maybe only after inlining). I think this might be relevant (to test on when merged to master):
Can you perform the benchmark yourself?
I can, and did (now that that PR was merged).
I do get a 12% improvement over 1.9.2, though that is not the great improvement I was hoping for, nor did the PR help: I get similar numbers on the beta, where I believe it's not included.
$ juliaup default dev
..
| Lang-uage | Temp-lated | Check Time [us/fn] | Compile Time [us/fn] | Build Time [us/fn] | Run Time [us/fn] | Check RSS [kB/fn] | Build RSS [kB/fn] | Exec Version | Exec Path |
| :-------: | ---------- | :----------------: | :------------------: | :----------------: | :--------------: | :---------------: | :---------------: | :----------: | :-------: |
| Julia | No | N/A | N/A | 585.9 (1.2x) | N/A | N/A | 31.9 (1.1x) | 1.11.0-DEV | julia |
| Julia | Yes | N/A | N/A | 489.9 (best) | N/A | N/A | 28.8 (best) | 1.11.0-DEV | julia |
vs. 554.7 on 1.9.2. I also tried all settings for JULIA_INTERPRET_FLAGS and JULIA_COMPILE_FLAGS. I.e. defaults are still much slower, though maybe some improvement there too.
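As a quick check of the numbers above, here is the arithmetic behind the 12% figure and the "(1.2x)" factor in the table. It assumes the 554.7 measured on 1.9.2 is compared against the templated 1.11.0-DEV row (489.9), which is the reading that matches the stated 12%. The helper names are made up for illustration.

```python
def relative_improvement(old, new):
    """Fraction by which `new` is lower than `old`."""
    return (old - new) / old


def slowdown_factor(value, best):
    """How many times larger `value` is than the best entry."""
    return value / best


build_1_9_2 = 554.7        # us/fn on 1.9.2, from the comment above
build_dev_templated = 489.9    # us/fn on 1.11.0-DEV, templated row
build_dev_untemplated = 585.9  # us/fn on 1.11.0-DEV, non-templated row

print(f"{relative_improvement(build_1_9_2, build_dev_templated):.1%}")      # 11.7%
print(f"{slowdown_factor(build_dev_untemplated, build_dev_templated):.1f}x")  # 1.2x
```

So the "12%" is 11.7% rounded, and the non-templated build is about 1.2x the templated one, as the table reports.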
First, you might want to benchmark Julia on master as is (or possibly next nightly, I just noticed yet one more improvement merged just now "Remove alloca from codegen").
I don't know if the issue with your very unusual benchmark is fixed. But Julia does use -O2 by default, so you might also want to try running with -O0 (or --inline=no, which I think is at least implied by the lowest level) or -O1, since there is no Julia debug/development-build mode and that's the closest I can think of; or even with --compile=min.
At least if you see an improvement, there's also a further 25% improvement available (but you have to opt into this new Julia parser; it will be merged into Julia, but will at first be off by default):
https://github.com/JuliaLang/JuliaSyntax.jl/pull/228
I also wanted to point that out to you for D (or other languages).