boostorg / container

STL-like containers from Boost
http://www.boost.org/libs/container/
Boost Software License 1.0

small_vector::push_back without reallocation is 3 times slower than the libstdc++ std::vector::push_back #131

Closed apolukhin closed 3 years ago

apolukhin commented 4 years ago

Compilers inline the priv_forward_range_insert_no_capacity function, and that function touches a lot of local variables. To free up registers for it, the compilers save callee-saved registers on the stack at function entry, so the cost is paid even when no reallocation happens.

As a result, this is what small_vector::push_back looks like:

test_boost(boost::container::small_vector<int, 8ul, void, void>&):
  push r15
  push r14
  push r13
  push r12
  push rbp
  push rbx
  mov rbx, rdi
  sub rsp, 8
  mov rax, QWORD PTR [rdi+8]
  mov rdx, QWORD PTR [rdi]
  mov rcx, QWORD PTR [rdi+16]
  lea rbp, [rdx+rax*4]
  cmp rax, rcx
  jnb .L7
  add rax, 1
  mov DWORD PTR [rbp+0], 42
  mov QWORD PTR [rdi+8], rax
.L6:
  add rsp, 8
  pop rbx
  pop rbp
  pop r12
  pop r13
  pop r14
  pop r15
  ret

By contrast, std::vector::push_back looks like the following:

test_std(std::vector<int, std::allocator<int> >&):
  sub rsp, 24
  mov rsi, QWORD PTR [rdi+8]
  mov DWORD PTR [rsp+12], 42
  cmp rsi, QWORD PTR [rdi+16]
  je .L20
  mov DWORD PTR [rsi], 42
  add rsi, 4
  mov QWORD PTR [rdi+8], rsi
  add rsp, 24
  ret

Godbolt playground: https://godbolt.org/z/CvRqMY
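
For readers who cannot open the link, the two listings are consistent with test functions of roughly the following shape. This is a sketch reconstructed from the symbol names and the constant 42 visible in the assembly; the actual source in the playground may differ. The SKETCH_HAVE_BOOST_SMALL_VECTOR macro is a hypothetical name used only so the snippet still compiles without Boost installed.

```cpp
#include <vector>

// Pull in Boost.Container only if it is available on this machine.
#if defined(__has_include)
#  if __has_include(<boost/container/small_vector.hpp>)
#    include <boost/container/small_vector.hpp>
#    define SKETCH_HAVE_BOOST_SMALL_VECTOR 1  // hypothetical macro name
#  endif
#endif

// Assumed shape of the std::vector test: a single push_back of 42.
void test_std(std::vector<int>& v) {
    v.push_back(42);
}

#ifdef SKETCH_HAVE_BOOST_SMALL_VECTOR
// Assumed shape of the small_vector test, matching the
// small_vector<int, 8> instantiation named in the listing.
void test_boost(boost::container::small_vector<int, 8>& v) {
    v.push_back(42);
}
#endif
```

Compiling each function with optimizations enabled and comparing the prologue and epilogue (the push/pop pairs) is enough to see whether a given compiler version shows the difference.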

Note that adding BOOST_NOINLINE to priv_forward_range_insert_no_capacity does not help.
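
For context, the fast-path/slow-path split that a noinline attribute is meant to enforce looks roughly like the sketch below. It uses a hypothetical tiny_vec type and a hypothetical SKETCH_NOINLINE macro; it is not Boost's implementation and it only handles int. Whether a particular compiler keeps the fast path free of register saves has to be checked in the generated assembly.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for a noinline macro such as BOOST_NOINLINE.
#if defined(__GNUC__) || defined(__clang__)
#  define SKETCH_NOINLINE __attribute__((noinline))
#elif defined(_MSC_VER)
#  define SKETCH_NOINLINE __declspec(noinline)
#else
#  define SKETCH_NOINLINE
#endif

// Hypothetical int-only vector, for illustration only.
struct tiny_vec {
    int* data = nullptr;
    std::size_t size = 0;
    std::size_t capacity = 0;

    tiny_vec() = default;
    tiny_vec(const tiny_vec&) = delete;
    tiny_vec& operator=(const tiny_vec&) = delete;
    ~tiny_vec() { std::free(data); }

    // Fast path: one compare, one store, one increment.
    void push_back(int value) {
        if (size < capacity) {
            data[size++] = value;
            return;
        }
        grow_and_push(value);  // rare case, handled out of line
    }

    // Slow path: doubles the capacity (starting at 8), then appends.
    SKETCH_NOINLINE void grow_and_push(int value) {
        std::size_t new_cap = capacity ? capacity * 2 : 8;
        int* p = static_cast<int*>(std::realloc(data, new_cap * sizeof(int)));
        if (!p) std::abort();  // allocation failure: give up in this sketch
        data = p;
        capacity = new_cap;
        data[size++] = value;
    }
};
```

The intent of the split is that the slow path's register pressure stays inside grow_and_push. The report above indicates that, for small_vector, marking the slow path noinline did not produce that result.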

igaztanaga commented 4 years ago

I can't see how this could be solved; the inliner makes the wrong decision. Which compiler and version?

apolukhin commented 4 years ago

Reported the issue to GCC here https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91981 and to LLVM here https://bugs.llvm.org/show_bug.cgi?id=43562

apolukhin commented 4 years ago

GCC and Clang of almost any version

igaztanaga commented 3 years ago

Since it looks like an inliner issue, I'm closing this. Thanks for the report.