rui314 / chibicc

A small C compiler
MIT License
9.57k stars 869 forks source link

Assembler error on larger than 32-bit bitfields #65

Open GabrielRavier opened 2 years ago

GabrielRavier commented 2 years ago
struct S { unsigned long long int b:32; } s;

void f()
{
    s.b = 0;
}

Trying to compile this results in this error:

/tmp/chibicc-0G8AuI: Assembler messages:
/tmp/chibicc-0G8AuI:60: Error: operand type mismatch for `and'
tavianator commented 2 years ago

codegen.c is attempting to and with an immediate that is too large when writing out the bitfield. This patch fixes it for me:

diff --git a/codegen.c b/codegen.c
index da11fd7..c3164aa 100644
--- a/codegen.c
+++ b/codegen.c
@@ -785,7 +785,12 @@ static void gen_expr(Node *node) {
       // from memory and merge it with a new value.
       Member *mem = node->lhs->member;
       println("  mov %%rax, %%rdi");
-      println("  and $%ld, %%rdi", (1L << mem->bit_width) - 1);
+      if (mem->bit_width >= 32) {
+        println("  mov $%ld, %%rax", (1L << mem->bit_width) - 1);
+        println("  and %%rax, %%rdi");
+      } else {
+        println("  and $%ld, %%rdi", (1L << mem->bit_width) - 1);
+      }
       println("  shl $%d, %%rdi", mem->bit_offset);

       println("  mov (%%rsp), %%rax");
rui314 commented 2 years ago

@tavianator Yes, it looks like a bug in codegen. Your code should work, but there's another way. You can shift to the left to clear higher bits and then shift to the right to align bits at a proper bit position. That's I believe what gcc and clang do too: https://godbolt.org/z/G3678rrKz

tavianator commented 2 years ago

@rui314 This is for assignments though, not reads. I think you already do the left+right shift for reads. You could do it to mask the old value but gcc and clang seem to just mask it explicitly: https://godbolt.org/z/6EseE3vo6

rui314 commented 2 years ago

@tavianator Ah, interesting! So gcc and clang use an extra register instead of shifting to the left and to the right. I'm not very familiar with instruction selection, but I guess there must be a reason to choose that instruction sequence. Thank you for letting me know!

tavianator commented 2 years ago

@rui314 You're welcome! Actually though I misread that asm, the mask is applied to the value to be stored, not to the old value from the read/modify/write. Other cases don't use masking at all: https://godbolt.org/z/PxMvMqa4h

rui314 commented 2 years ago

@tavianator It looks like it is an optimization for a field that is at the leftmost bit position in a word. For other cases, and is still being used: https://godbolt.org/z/hThfor7o3