rust-lang / rust

Suboptimal code generation for thread_local! #104033

Open stepancheg opened 1 year ago

stepancheg commented 1 year ago

Code:

use std::cell::*;

thread_local! {
    static X: Cell<Vec<u32>> = const { Cell::new(Vec::new()) };
}

pub fn thread_local() {
    X.with(|x| {
        let mut xx = x.take();
        xx.pop();
        x.set(xx);
    })
}

Emits:

example::thread_local:
  push rbx
  sub rsp, 16
  lea rdi, [rip + example::X::__getit::STATE.0@TLSLD]
  call __tls_get_addr@PLT
  mov rbx, rax
  movzx eax, byte ptr [rax + example::X::__getit::STATE.0@DTPOFF]
  cmp eax, 1
  je .LBB1_3
  test eax, eax
  jne .LBB1_4
  lea rdi, [rbx + example::X::__getit::VAL@DTPOFF]
  lea rsi, [rip + example::X::__getit::destroy]
  call qword ptr [rip + std::sys::unix::thread_local_dtor::register_dtor@GOTPCREL]
  mov rax, rbx
  mov byte ptr [rbx + example::X::__getit::STATE.0@DTPOFF], 1
.LBB1_3:
  mov rcx, qword ptr [rbx + example::X::__getit::VAL@DTPOFF+16]
  xor edx, edx
  sub rcx, 1
  cmovae rdx, rcx
  mov qword ptr [rbx + example::X::__getit::VAL@DTPOFF+16], rdx
  add rsp, 16
  pop rbx
  ret
.LBB1_4:
  lea rdi, [rip + .L__unnamed_1]
  lea rcx, [rip + .L__unnamed_2]
  lea r8, [rip + .L__unnamed_3]
  lea rdx, [rsp + 8]
  mov esi, 70
  call qword ptr [rip + core::result::unwrap_failed@GOTPCREL]
  ud2

(Compiler explorer)

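Here the default path (when the thread-local is already initialized) is reached only by taking the jump to .LBB1_3, while the rare first-access path that registers the destructor is laid out as the fall-through.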

The issue seems to be a missing #[cold] annotation on the register_dtor function, or a missing likely(STATE == 1) hint:

https://github.com/rust-lang/rust/blob/1286ee23e4e2dec8c1696d3d76c6b26d97bbcf82/library/std/src/thread/local.rs#L237-L254
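As a rough illustration of the #[cold] suggestion, here is a minimal sketch of the same three-state check with the first-use path moved into a function marked #[cold]. The names Slot, State, register and destroy are hypothetical stand-ins, not the actual std implementation, and whether this changes block layout depends on the optimizer; calls to #[cold] functions are generally treated as unlikely.

```rust
use std::cell::Cell;

// Hypothetical stand-in for the per-thread state byte in the issue's assembly.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum State {
    Initial,   // destructor not yet registered
    Alive,     // registered; the common case on every later access
    Destroyed, // destructor already ran; access must fail
}

// Hypothetical slot type; it only models the control flow, not real TLS.
pub struct Slot {
    state: Cell<State>,
    registrations: Cell<u32>,
    value: Cell<u64>,
}

impl Slot {
    pub const fn new(value: u64) -> Self {
        Slot {
            state: Cell::new(State::Initial),
            registrations: Cell::new(0),
            value: Cell::new(value),
        }
    }

    // Rare first-access path, kept out of line and marked cold so the
    // branch leading here is hinted as unlikely.
    #[cold]
    #[inline(never)]
    fn register(&self) {
        self.registrations.set(self.registrations.get() + 1);
        self.state.set(State::Alive);
    }

    // Hot accessor: the Alive arm is the expected path.
    pub fn get(&self) -> Option<&Cell<u64>> {
        match self.state.get() {
            State::Alive => Some(&self.value),
            State::Initial => {
                self.register();
                Some(&self.value)
            }
            State::Destroyed => None,
        }
    }

    // Simulates the destructor having run.
    pub fn destroy(&self) {
        self.state.set(State::Destroyed);
    }

    // Number of times the cold registration path was taken.
    pub fn registrations(&self) -> u32 {
        self.registrations.get()
    }
}
```

Only the first call to get takes the cold path; later calls hit the Alive arm directly, which is the case the issue argues should be the fall-through.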

BGR360 commented 1 year ago

@rustbot label +I-slow +T-compiler