rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
94.74k stars 12.21k forks source link

Suboptimal machine code mapping results or options #117551

Open davidzeng0 opened 8 months ago

davidzeng0 commented 8 months ago

https://godbolt.org/z/GjrE4GjhK

for all 3 maybe_with_data functions, the move of certain values on the stack is unnecessary.

This seems like a fundamental issue and probably won't be fixed soon, but it should make all existing code run slightly faster

saethlin commented 8 months ago

The behavior here is specific to array lengths that are not a power of 2. I think this is a consequence of https://github.com/rust-lang/rust/pull/115236, the code looks quite well-optimized if the array length is 4 or 8.

davidzeng0 commented 8 months ago

I'd say that's still a problem, because any moves to stack shouldn't happen in the first place

I have the following structs

pub struct Packet {
    buffer: BufferRef, /* 3 usizes */
    data: *mut u8,
    size: usize,

    pub time_base: Rational, /* u32, u32 */
    pub duration: u64,
    pub timestamp: i64,
    pub track_index: u32,
    pub flags: BitFlags<PacketFlag>,
    pub zero_padding: usize
}

pub struct Element {
    pub id: ElementId, /* u64 */
    pub size: u64,
    pub offset: u64,
    pub end: u64
}

and multiple non-inline and inline calls to read(&mut self) -> Result<Option<Packet / Element>>

I mainly use maybe_with_data's match statement rather than as_mut().map() (which doesn't work either) to reduce the 1 additional indentation in my code

Even though the function call is inlined, I see multiple mov reg, [rsp + 0x300] and mov [rsp + 0x110], reg instructions copying the data between different places on the stack, at my match statement succeeding

Especially when my code does around 100 nanos per packet, those extra vectorized and nonvectorized moves add significant overhead that I shouldn't get

I'm only covering Option and Result in this issue, but it should also apply to enums and any function that invites rustc to move a value, even when it's unnecessary. It seems to be related to https://github.com/rust-lang/rust/issues/99504

davidzeng0 commented 8 months ago

Here is clang, with similar code to an Option in C++ https://godbolt.org/z/aPv9nPbn6. There is no mention of rsp at all, until you uncomment the inline never attribute in which case all optimizations go out the window. Gcc produces similar code to clang

saethlin commented 8 months ago

Yes it's still a problem, but I would say it is a problem with arrays, not with map.