Closed by snickerbockers 5 years ago
This bug was introduced by commit 619a44474616b09eb63a1bb3489fdcbf72efadc7, which is the one that added a JIT implementation for EXTU.B.
Prior to 0f02afbd6a8bfc8a0061df49697c1f36d7c1e40f, it hangs. From that commit onwards, it crashes due to a bad memory address.
The JIT_OP_SIGN_EXTEND_8 implementation in the x86_64 backend is at least partially responsible. The patch below works around the bug by emitting the sign-extension as an explicit test/branch/mask sequence instead of a single MOVSX. Since that makes the problem go away, x86asm_movsx_reg8_reg32 isn't doing what I think it does.
I'm still not sure if the change in behavior from 0f02afbd6a8bfc8a0061df49697c1f36d7c1e40f is another facet of this bug or a separate issue.
diff --git a/src/libwashdc/jit/x86_64/code_block_x86_64.c b/src/libwashdc/jit/x86_64/code_block_x86_64.c
index 4c34ef84..a4863a3a 100644
--- a/src/libwashdc/jit/x86_64/code_block_x86_64.c
+++ b/src/libwashdc/jit/x86_64/code_block_x86_64.c
@@ -859,14 +859,33 @@ static void emit_read_16_constaddr(struct code_block_x86_64 *blk,
static void emit_sign_extend_8(struct code_block_x86_64 *blk,
struct il_code_block const *il_blk,
void *cpu, struct jit_inst const *inst) {
+ struct x86asm_lbl8 lbl_out, lbl_set;
+ x86asm_lbl8_init(&lbl_out);
+ x86asm_lbl8_init(&lbl_set);
+
unsigned slot_no = inst->immed.sign_extend_8.slot_no;
grab_slot(blk, il_blk, inst, slot_no);
unsigned reg_no = slots[slot_no].reg_no;
- x86asm_movsx_reg8_reg32(reg_no, reg_no);
+ x86asm_testl_imm32_reg32(0x80, reg_no);
+
+ x86asm_jnz_lbl8(&lbl_set);
+
+ x86asm_andl_imm32_reg32(0xff, reg_no);
+ x86asm_jmp_lbl8(&lbl_out);
+
+ x86asm_lbl8_define(&lbl_set);
+ x86asm_orl_imm32_reg32(0xffffff00, reg_no);
+
+ x86asm_lbl8_define(&lbl_out);
+
+ /* x86asm_movsx_reg8_reg32(reg_no, reg_no); */
ungrab_slot(slot_no);
+
+ x86asm_lbl8_cleanup(&lbl_out);
+ x86asm_lbl8_cleanup(&lbl_set);
}
// JIT_OP_SIGN_EXTEND_16 implementation
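For reference, the value that the test/jnz/and/or sequence in the patch leaves in the register can be written in plain C. This is only a sketch for checking the logic by hand; `sign_extend_8_ref` is a hypothetical name and not something that exists in the tree.

```c
#include <stdint.h>

/*
 * Hypothetical reference helper (not part of WashingtonDC): computes the
 * same result as the instruction sequence emitted by the patch above.
 */
static uint32_t sign_extend_8_ref(uint32_t val) {
    if (val & 0x80)
        return val | 0xffffff00u;  /* bit 7 set: fill bits 8-31 with ones */
    return val & 0xffu;            /* bit 7 clear: zero bits 8-31 */
}
```

Only the low 8 bits of the input matter, so whatever was in the upper 24 bits of the slot beforehand is discarded either way.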
The root cause of the problem is that the REX prefix was not being emitted for certain instruction encodings. On x86_64, 8-bit register encodings 4 through 7 select SPL, BPL, SIL, and DIL only when a REX prefix is present. Without one, the same encodings select AH, CH, DH, and BH instead.
This also means that AH, BH, CH, and DH are inaccessible in any instruction that carries a REX prefix, but I was never going to use them anyways.
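To illustrate the encoding rule, here is a minimal sketch of how a register-to-register `MOVSX r32, r8` (opcode `0F BE /r`) could be encoded with the REX prefix forced whenever the 8-bit source is encoding 4 through 7. The function name `encode_movsx_reg8_reg32` and its (source, destination, output buffer) signature are assumptions made for this example; this is not the code from the actual fix commit.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical encoder, not the real x86asm implementation.
 * src and dst are hardware register numbers (0-15).
 * Writes at most 4 bytes to out and returns the number written.
 */
static size_t encode_movsx_reg8_reg32(unsigned src, unsigned dst,
                                      uint8_t *out) {
    size_t len = 0;
    uint8_t rex = 0x40;          /* bare REX prefix with no bits set */

    if (dst >= 8)
        rex |= 0x04;             /* REX.R extends the ModRM reg field */
    if (src >= 8)
        rex |= 0x01;             /* REX.B extends the ModRM rm field */

    /*
     * A bare 0x40 is still required when src is 4-7; otherwise the CPU
     * decodes the operand as AH, CH, DH, or BH rather than SPL, BPL,
     * SIL, or DIL.
     */
    if (rex != 0x40 || src >= 4)
        out[len++] = rex;

    out[len++] = 0x0f;           /* two-byte opcode escape */
    out[len++] = 0xbe;           /* MOVSX r32, r/m8 */
    /* mod=11 (register direct), reg=destination, rm=source */
    out[len++] = (uint8_t)(0xc0 | ((dst & 7) << 3) | (src & 7));
    return len;
}
```

With register 7 as both source and destination, this emits `40 0F BE FF` (movsx edi, dil). Dropping the leading `40` leaves `0F BE FF`, which decodes as movsx edi, bh, which is the kind of silent wrong-register read described above.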
Fixed by commit f0a9e0e863d6f69ac34af3b6d2f8addb77de9a0f.
Not sure when this was introduced. It still works with -p (interpreter mode) and -j (JIT IL interpreter), but on the native x86_64 backend it crashes before the attract mode.