Missed optimization on combining function prologue

I found an interesting result when compiling:

extern(C) bool test(bool b1, bool b2)
{
    return b1 == b2;
}

and comparing the generated assembly with:

extern(C) bool test(void* b1, void* b2)
{
    return b1 == b2;
}

On x86_64, where the calling convention passes arguments in registers or word-aligned memory, these two functions should be semantically equivalent. However, the compiler does not seem able to optimize the function prologue. I assume this is to avoid interfering with the architecture's calling convention, but in some cases the compiler could still optimize the code or combine instructions without violating it. See this comparison with GCC's x86_64 codegen: https://godbolt.org/z/sxnvnv93P .
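
To make the equivalence claim concrete, here is a minimal sketch that checks both variants return the same result for every valid bool input. The names testBool, testPtr and agreeOnBoolInputs are hypothetical and not part of the original report. It assumes a bool argument only ever holds 0 or 1, and it says nothing about the instructions either compiler emits.

```d
// Hypothetical names: the two variants of `test` from the report, renamed
// so that both can live in one module.
extern(C) bool testBool(bool b1, bool b2)
{
    return b1 == b2;
}

extern(C) bool testPtr(void* b1, void* b2)
{
    return b1 == b2;
}

// Returns true if both variants agree on all four combinations of bool inputs.
// Assumption: a bool is passed as the value 0 or 1, widened here to pointer size.
bool agreeOnBoolInputs()
{
    foreach (a; [false, true])
    {
        foreach (b; [false, true])
        {
            void* pa = cast(void*) cast(size_t) a;
            void* pb = cast(void*) cast(size_t) b;
            if (testBool(a, b) != testPtr(pa, pb))
                return false;
        }
    }
    return true;
}
```

Calling agreeOnBoolInputs() from a small main, or from a unittest block, confirms only that the results match; the difference described above is in the generated assembly, which the Compiler Explorer link shows.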

llvm/llvm-project issue #58932