Optimize poor benchmark result

brson commented 12 years ago

http://blog.cdleary.com/2012/06/simple-selfish-and-unscientific-shootout/

We had a particularly poor showing in this comparison.

brson commented 12 years ago

Thanks @marijnh for pointing it out

brson commented 12 years ago

Here's what perf says with an optimized runtime and corelib:

 10.64%  test  libcore-d27e4777a53c3e50-0.2.so  [.] uint::to_str::_f7f4236dcee4291a::_02
  9.34%  test  libc-2.15.so                     [.] _int_free
  7.09%  test  libc-2.15.so                     [.] __memset_x86_64
  6.90%  test  librustrt.so                     [.] upcall_s_vec_grow
  6.01%  test  librustrt.so                     [.] upcall_vec_grow
  5.07%  test  librustrt.so                     [.] exchange_malloc
  4.92%  test  librustrt.so                     [.] upcall_exchange_free
  4.72%  test  librustrt.so                     [.] upcall_exchange_malloc_dyn
  4.43%  test  librustrt.so                     [.] check_stack_canary(stk_seg*)
  4.19%  test  libc-2.15.so                     [.] malloc
  4.11%  test  libc-2.15.so                     [.] _int_malloc
  3.07%  test  librustrt.so                     [.] upcall_s_exchange_free
  3.03%  test  librustrt.so                     [.] check_stack_alignment
  3.00%  test  librustrt.so                     [.] upcall_s_exchange_malloc_dyn
  2.18%  test  librustrt.so                     [.] memory_region::malloc(unsigned long, char const*, bool)
  2.15%  test  libc-2.15.so                     [.] realloc
  1.97%  test  [kernel.kallsyms]                [k] 0xffffffff8103d0ca
  1.91%  test  librustrt.so                     [.] get_sp_limit
  1.82%  test  librustrt.so                     [.] memory_region::free(void*)
  1.53%  test  libc-2.15.so                     [.] _int_realloc
  1.33%  test  librustrt.so                     [.] __morestack
  1.31%  test  librustrt.so                     [.] get_sp
  1.19%  test  librustrt.so                     [.] upcall_call_shim_on_c_stack
  0.95%  test  libc-2.15.so                     [.] free
  0.89%  test  librustrt.so                     [.] memory_region::add_alloc()
  0.68%  test  librustrt.so                     [.] upcall_str_new_uniq

brson commented 12 years ago

It looks like there's a lot to be gained just in optimizing uint::to_str. It is very inefficient.

kud1ing commented 12 years ago

Often in such benchmarks the random number generator is the bottleneck. Can someone point out where this shows up in the analysis above?

eholk commented 12 years ago

I recently added the xorshift random number generator. It's not on by default, but it should be significantly faster than the default ISAAC generator.

https://github.com/mozilla/rust/commit/ad292a8c73a0cceddfa9618a4d6eea577897bae8
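
For readers unfamiliar with xorshift, here is a minimal sketch of the general technique in present-day Rust. It is not the code from the commit above: the type name XorShift32, the zero-seed fallback constant, and the choice of Marsaglia's 32-bit 13/17/5 variant are all assumptions made for illustration.

```rust
/// Minimal xorshift32 generator using Marsaglia's 13/17/5 shift triple.
/// Illustrative only; the name and the variant are assumptions, not the
/// generator from the linked commit.
pub struct XorShift32 {
    state: u32,
}

impl XorShift32 {
    /// The state must never be zero (zero maps to zero forever), so a zero
    /// seed is replaced with an arbitrary nonzero constant.
    pub fn new(seed: u32) -> Self {
        XorShift32 {
            state: if seed == 0 { 0x9E37_79B9 } else { seed },
        }
    }

    /// Advances the state with three shift-and-xor steps and returns it.
    pub fn next_u32(&mut self) -> u32 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        self.state = x;
        x
    }
}
```

Each step is only a handful of shifts and xors on a single word, which is why this family tends to be cheaper per number than ISAAC. It is not cryptographically secure.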

pcwalton commented 12 years ago

The biggest win is going to be to write a uint::write or something that directly writes to a file instead of allocating.
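
A rough sketch of that idea in present-day Rust, assuming a hypothetical write_uint helper (no such function is named in this thread beyond the suggestion): the digits are formatted into a fixed stack buffer and handed to the writer in one call, so no heap allocation is involved.

```rust
use std::io::{self, Write};

/// Hypothetical helper: writes `n` in decimal to `out` without touching
/// the heap. The name and signature are assumptions for illustration.
pub fn write_uint<W: Write>(out: &mut W, mut n: u64) -> io::Result<()> {
    // u64::MAX has 20 decimal digits, so 20 bytes always suffice.
    let mut buf = [0u8; 20];
    let mut pos = buf.len();
    // Fill the buffer from the back, least significant digit first.
    loop {
        pos -= 1;
        buf[pos] = b'0' + (n % 10) as u8;
        n /= 10;
        if n == 0 {
            break;
        }
    }
    // Hand the used tail of the buffer to the writer in a single call.
    out.write_all(&buf[pos..])
}
```

Wrapping the destination in a std::io::BufWriter would additionally avoid one system call per number when the writer is a file or stdout.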

brson commented 12 years ago

Commit 6e0085210c54150f794d20791b2e9c1fda6049fc makes uint::to_str allocate far less, though it still performs one extra allocation when it creates the initial empty vector.
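
To illustrate what a conversion with exactly one allocation can look like, here is a sketch in present-day Rust. It is not the code from that commit; the function name uint_to_string is hypothetical, and it handles base 10 only. The digit count is computed first, so the buffer is allocated once at its final size and never grows.

```rust
/// Hypothetical stand-in for a decimal uint-to-string conversion that
/// allocates exactly once. Not the implementation from the commit.
pub fn uint_to_string(n: u64) -> String {
    // Count the decimal digits so the buffer can be sized exactly.
    let mut len = 1;
    let mut t = n;
    while t >= 10 {
        t /= 10;
        len += 1;
    }

    // The only heap allocation: a buffer of the final length.
    let mut bytes = vec![b'0'; len];

    // Fill from the back, least significant digit first.
    let mut m = n;
    for slot in bytes.iter_mut().rev() {
        *slot = b'0' + (m % 10) as u8;
        m /= 10;
    }

    // String::from_utf8 takes ownership of the Vec's buffer instead of copying.
    String::from_utf8(bytes).expect("ASCII digits are valid UTF-8")
}
```

Sizing the buffer up front is what avoids the repeated grow and realloc calls of the kind that dominate the profile above.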

brson commented 12 years ago

With that commit, the run time reported by time ./test 10000000 drops from 32s to 6s.

brson commented 12 years ago

Graydon made another commit to improve it further. I believe we are competitive with the other languages now.

kud1ing commented 12 years ago

Somewhat related is #2105

rust-lang / rust

Optimize poor benchmark result #2501