Terraspace / UASM

UASM - Macro Assembler
http://www.terraspace.co.uk/uasm.html

OR rax,64bit_imm assembles as OR rax,0 and should generate a warning. #187

Open john-terraspace opened 1 year ago

mazegen commented 11 months ago

This is an important issue. It seems like the high dword is silently discarded (it's not only OR but also CMP and possibly others). This should throw an error because it causes serious bugs.

john-terraspace commented 1 month ago

This appears to be fixed in 2.56 already, but double checking for 2.57

mazegen commented 1 month ago

What I meant might be a different but related bug:

This code:

.code
start:
or  rax, 99999999h
cmp rax, 99999999h
add rax, 99999999h
END

Assembles to:

UASM v2.56, Oct 27 2022, Masm-compatible assembler.

bug_imm64.asm
                                .code
00000000                        start:
00000000  480D99999999          or  rax, 99999999h
00000006  483D99999999          cmp rax, 99999999h
0000000C  480599999999          add rax, 99999999h
                                END

This is not correct because the imm32 is sign-extended to FFFFFFFF99999999h.
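
The effect can be checked with a few lines of Python. This is only a sketch and not part of UASM; the helper name `sign_extend_imm32` is made up for illustration. It takes the four immediate bytes from the listing above and shows the 64-bit operand the CPU derives from them.

```python
MASK64 = (1 << 64) - 1

def sign_extend_imm32(imm_bytes: bytes) -> int:
    """Return the 64-bit operand the CPU derives from a 4-byte immediate."""
    if len(imm_bytes) != 4:
        raise ValueError("expected exactly 4 immediate bytes")
    # x86 immediates are little-endian; signed=True performs the sign extension.
    value = int.from_bytes(imm_bytes, byteorder="little", signed=True)
    # Re-express the (possibly negative) value as an unsigned 64-bit pattern.
    return value & MASK64

# Immediate bytes of `48 0D 99 99 99 99` (or rax, imm32) from the listing.
operand = sign_extend_imm32(bytes.fromhex("99999999"))
print(f"{operand:016X}h")  # FFFFFFFF99999999h
```

So the emitted bytes perform `or rax, 0FFFFFFFF99999999h`, not the `or rax, 99999999h` that was written in the source.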

A similar problem occurs with this code:

.code
start:
or  rax, 0FFFFFFFF11111111h
cmp rax, 0FFFFFFFF11111111h
add rax, 0FFFFFFFF11111111h
END

Assembles incorrectly to:

UASM v2.56, Oct 27 2022, Masm-compatible assembler.

bug_imm64_2.asm
                                .code
00000000                        start:
00000000  480D11111111          or  rax, 0FFFFFFFF11111111h
00000006  483D11111111          cmp rax, 0FFFFFFFF11111111h
0000000C  480511111111          add rax, 0FFFFFFFF11111111h
                                END

john-terraspace commented 1 month ago

Interestingly, I can reproduce this issue - but I can't get any of these instructions to assemble in other tools either - getting errors.

mazegen commented 1 month ago

Yes, none of them are encodable in x64 machine code.

john-terraspace commented 1 month ago

Ok - will add some more rules to prevent these assembling then - they should generate errors.

john-terraspace commented 1 month ago

New results in 2.57:

Test Piece:

    .model flat
    .code

start:
or  rax, 0FFFFFFFF11111111h   ;ok => Error A2237: Constant value too large
and rax, -5                   ;ok => valid
and eax, -1                   ;ok => valid
mov eax, 0xFFFFFFFF           ;ok => valid

or  rax, 0x7fffffff           ;ok => valid
or  rax, 0x80000000           ;ok => valid (Defuse won't assemble this - but it should!)

or  rax, 99999999h            ;ok => Error A2237: Constant value too large
cmp rax, 99999999h            ;ok => Error A2237: Constant value too large
add rax, 99999999h            ;ok => Error A2237: Constant value too large

or  rax, 0FFFFFFFF11111111h   ;ok => Error A2237: Constant value too large
cmp rax, 0FFFFFFFF11111111h   ;ok => Error A2237: Constant value too large
add rax, 0FFFFFFFF11111111h   ;ok => Error A2237: Constant value too large

mov  rax, 0x8000000000000007  ;ok => valid
cmp  rax, 0x8000000000000007  ;ok => Error A2237: Constant value too large
or   rax, 0xffffffff0         ;ok => Error A2237: Constant value too large
or   rax, 8000000000000000h   ;ok => Error A2237: Constant value too large
mov  rax, 8000000000000000h   ;ok => valid as 48 B8 00 00 00 00 00 00 00 80    movabs rax, 0x8000000000000000
and  rax, 8000000000000000h   ;ok => Error A2237: Constant value too large
test rax, 800000000000000h    ;ok => Error A2237: Constant value too large
test rax, 8000000000000000h   ;ok => Error A2237: Constant value too large
test rax, 80000000000000000h  ;ok => Error A2237: Constant value too large
mov  rax, 80000000000000000h  ;ok => Error A2237: Constant value too large

end start

Outputs:

or rax, 0FFFFFFFF11111111h
test.asm(6) : Error A2237: Constant value too large: FFFFFFFF11111111h
or rax, 99999999h
test.asm(14) : Error A2237: Constant value too large: 99999999h
cmp rax, 99999999h
test.asm(15) : Error A2237: Constant value too large: 99999999h
add rax, 99999999h
test.asm(16) : Error A2237: Constant value too large: 99999999h
or rax, 0FFFFFFFF11111111h
test.asm(18) : Error A2237: Constant value too large: FFFFFFFF11111111h
cmp rax, 0FFFFFFFF11111111h
test.asm(19) : Error A2237: Constant value too large: FFFFFFFF11111111h
add rax, 0FFFFFFFF11111111h
test.asm(20) : Error A2237: Constant value too large: FFFFFFFF11111111h
cmp rax, 0x8000000000000007
test.asm(23) : Error A2237: Constant value too large: 8000000000000007h
or rax, 0xffffffff0
test.asm(24) : Error A2237: Constant value too large: FFFFFFFF0h
or rax, 8000000000000000h
test.asm(25) : Error A2237: Constant value too large: 8000000000000000h
and rax, 8000000000000000h
test.asm(27) : Error A2237: Constant value too large: 8000000000000000h
test rax, 800000000000000h
test.asm(28) : Error A2237: Constant value too large: 800000000000000h
test rax, 8000000000000000h
test.asm(29) : Error A2237: Constant value too large: 8000000000000000h
test rax, 80000000000000000h
test.asm(30) : Error A2275: Constant value too large: 80000000000000000h
mov rax, 80000000000000000h
test.asm(31) : Error A2275: Constant value too large: 80000000000000000h
test.asm: 33 lines, 1 passes, 11 ms, 0 warnings, 15 errors

This corresponds to the errors I get when testing the same instructions on Defuse, apart from:

or  rax, 0x80000000           ;ok => valid (Defuse won't assemble this - but it should!)

which, as far as I can tell, is perfectly valid and sign-extendable to 64-bit.

mazegen commented 1 month ago

The following:

or rax, 0x80000000

This can't be assembled because the 32-bit immediate value 0x80000000 is sign-extended by the CPU to 0xFFFFFFFF80000000. Therefore only the following can be assembled:

or rax, 0x7FFFFFFF   ; all unsigned values from 0 to 0x7FFFFFFF are valid
or rax, 0xFFFFFFFF80000000   ; all unsigned values from 0xFFFFFFFF80000000 to 0xFFFFFFFFFFFFFFFF are valid

john-terraspace commented 1 month ago

I have to say this is very confusing... 0x80000000 is a valid 32-bit signed immediate, and its extension to 64-bit as 0xffffffff80000000 seems sensible to me; the value conforms and represents the same negative integer?

image

and

image

or RAX, imm32 ; imm32 is treated as a signed 32-bit integer and extended to 64-bit according to its definition

Otherwise, how would you be able to do OR RAX,-1 with valid sign extension? Other assemblers give the following:

OR RAX,-1
48 83 c8 ff

OR RAX,0xffffffff ; error!

These are the same thing (granted the first variant is treating -1 as a signed 8bit imm).

image

It seems VS has no problem understanding the encoding with 0xffffffff and it works - performing the same operation as or rax,-1

mazegen commented 1 month ago

If you do this:

mov rax, 1
mov rbx, 0x80000000
add rax, rbx

and

mov rax, 1
mov rbx, 0xFFFFFFFF80000000
add rax, rbx

The results will be different. It's the same as the following (if it were encodable):

mov rax, 1
add rax, 0x80000000

and

mov rax, 1
add rax, 0xFFFFFFFF80000000

So 0x80000000 is not the same as 0xFFFFFFFF80000000.
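
The difference is easy to confirm by emulating 64-bit register arithmetic. The snippet below is a rough sketch; `add64` is a hypothetical helper that simply wraps the sum modulo 2**64, which is how a 64-bit `add` behaves apart from the flags.

```python
MASK64 = (1 << 64) - 1

def add64(a: int, b: int) -> int:
    """Add two values the way a 64-bit register does (wraps modulo 2**64)."""
    return (a + b) & MASK64

# rax = 1, rbx = 0x80000000
print(hex(add64(1, 0x80000000)))          # 0x80000001
# rax = 1, rbx = 0xFFFFFFFF80000000
print(hex(add64(1, 0xFFFFFFFF80000000)))  # 0xffffffff80000001
```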

As for the screenshots from Calculator: in the first case 0x80000000 is treated as a DWORD, but in the second case 0xFFFFFFFF80000000 is treated as a QWORD, so both show -2147483648 decimal. If you treat 0x80000000 as a QWORD, it is 2147483648 decimal.

And you correctly flag add rax, 99999999h as invalid. It's the same case as add rax, 80000000h.

mazegen commented 1 month ago

As for this one:

OR RAX,-1
48 83 c8 ff

OR RAX,0xffffffff ; error!

OR RAX, -1 is really OR RAX, 0xFFFFFFFFFFFFFFFF, which is different from OR RAX,0xffffffff.
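
As an illustration, the last byte of `48 83 c8 ff` is an 8-bit immediate that the CPU sign-extends to 64 bits. The sketch below uses a made-up helper, `sign_extend`, to show why that operand is not the same number as the literal 0xffffffff.

```python
def sign_extend(value: int, from_bits: int, to_bits: int = 64) -> int:
    """Sign-extend the low `from_bits` bits of `value` to a `to_bits` pattern."""
    sign_bit = 1 << (from_bits - 1)
    value &= (1 << from_bits) - 1
    # Standard xor/subtract trick: yields a negative int if the sign bit is set.
    signed = (value ^ sign_bit) - sign_bit
    return signed & ((1 << to_bits) - 1)

imm8_operand = sign_extend(0xFF, 8)  # immediate byte of 48 83 c8 ff
print(f"{imm8_operand:016X}h")       # FFFFFFFFFFFFFFFFh
print(imm8_operand == 0xFFFFFFFF)    # False
```

Written as a 64-bit literal, 0xffffffff means 00000000FFFFFFFFh, and no sign-extended 8-bit or 32-bit immediate produces that value.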

mazegen commented 1 month ago

It's quite simple: for an instruction that takes a sign-extended 32-bit immediate, the assembler can't accept values greater than or equal to 0x80000000 and less than or equal to 0xFFFFFFFF7FFFFFFF. Such values are possible only with MOV, where 64-bit immediates are encodable.
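
The rule can be sketched as a range check. This is an illustration only, not UASM's actual implementation; the function name `fits_sign_extended_imm32` is hypothetical, and the check assumes a 64-bit destination operand such as rax.

```python
MASK64 = (1 << 64) - 1

def fits_sign_extended_imm32(value: int) -> bool:
    """True if `value` survives truncation to imm32 plus sign extension to 64 bits."""
    if value < 0:
        # Negative literals such as -5 are taken as two's-complement 64-bit patterns.
        if value < -(1 << 63):
            return False
        value &= MASK64
    if value > MASK64:
        return False  # does not even fit in 64 bits
    return value <= 0x7FFFFFFF or value >= 0xFFFFFFFF80000000

# Immediates taken from the test cases in this thread.
for imm in (0x7FFFFFFF, 0x80000000, 0x99999999, 0xFFFFFFFF11111111, -5):
    verdict = "valid" if fits_sign_extended_imm32(imm) else "too large"
    print(f"{imm & MASK64:016X}h => {verdict}")
```

With this check, 0x7FFFFFFF and -5 are accepted, while 0x80000000, 99999999h and 0FFFFFFFF11111111h are rejected.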

john-terraspace commented 1 month ago

Updated the limits. This is the result now:

or  rax, 0FFFFFFFF11111111h   ;ok => Error A2237: Constant value too large
and rax, -5                   ;ok => valid
and eax, -1                   ;ok => valid
mov eax, 0xFFFFFFFF           ;ok => valid

or  rax, 0x7fffffff           ;ok => valid
or  rax, 0x80000000           ;ok => Error A2237: Constant value too large

or  rax, 99999999h            ;ok => Error A2237: Constant value too large
cmp rax, 99999999h            ;ok => Error A2237: Constant value too large
add rax, 99999999h            ;ok => Error A2237: Constant value too large

or  rax, 0FFFFFFFF11111111h   ;ok => Error A2237: Constant value too large
cmp rax, 0FFFFFFFF11111111h   ;ok => Error A2237: Constant value too large
add rax, 0FFFFFFFF11111111h   ;ok => Error A2237: Constant value too large

mov  rax, 0x8000000000000007  ;ok => valid
cmp  rax, 0x8000000000000007  ;ok => Error A2237: Constant value too large
or   rax, 0xffffffff0         ;ok => Error A2237: Constant value too large
or   rax, 8000000000000000h   ;ok => Error A2237: Constant value too large
mov  rax, 8000000000000000h   ;ok => valid as 48 B8 00 00 00 00 00 00 00 80    movabs rax, 0x8000000000000000
and  rax, 8000000000000000h   ;ok => Error A2237: Constant value too large
test rax, 800000000000000h    ;ok => Error A2237: Constant value too large
test rax, 8000000000000000h   ;ok => Error A2237: Constant value too large
test rax, 80000000000000000h  ;ok => Error A2237: Constant value too large
mov  rax, 80000000000000000h  ;ok => Error A2237: Constant value too large

john-terraspace commented 1 month ago

uasm64.zip

Here is a WIN64 build of UASM to try out - if you want to verify the imm 64 handling.

mazegen commented 1 month ago

Yes, after a quick test, it looks correct now :)