Closed mazegen closed 1 year ago
It would be great if we could generate pure RIP relocation using macro:
RIPREL MACRO lbl:REQ
EXITM <(type lbl) ptr [rip + (lbl - end_of_current_instruction)]>
ENDM
mov bl, [RIPREL(a_label)]
The problem is the end_of_current_instruction symbol. As far as I know, there is no symbol available that represents it. We only have $, the current location. It would be very useful to have some kind of RIP-relative addressing available in the preprocessor through another symbol.
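To make the request concrete: the value the macro would need is the distance from the end of the instruction (where RIP points) to the label, not the distance from $. A minimal sketch in Python of that arithmetic follows. It is not UASM code; the function name and the sample offsets are made up for illustration.

```python
import struct

def rip_rel_disp32(label_ofs: int, insn_ofs: int, insn_len: int) -> bytes:
    """Return the little-endian disp32 for [rip + disp] addressing.

    RIP points at the byte after the current instruction, so the
    displacement is label - (insn_ofs + insn_len), not label - $.
    """
    disp = label_ofs - (insn_ofs + insn_len)
    return struct.pack("<i", disp)  # raises struct.error if it does not fit in 32 bits

# Hypothetical layout: a 6-byte instruction at offset 0x10, label at offset 0x40.
print(rip_rel_disp32(0x40, 0x10, 6).hex())
```

The catch described above is visible in the signature: insn_len is only known once the instruction has been encoded, which is why a preprocessor symbol for it is hard to provide.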
Hi johnsa, is there any chance this can be implemented? It should be easy for labels that are in the same section as the instruction, right? ;)
Hey, yep this was #1 on my list for 2.56 :)
Done. Changes in 2.56 branch. In the parser we no longer generate a fixup entry for a symbol that has RELOC32 in a 64-bit section when the current and target sections are the same.
This turns out to be not so trivial. Preventing the fixup is easy enough, but there are two problems in the way the assembler works: by default the fixup and an address of 0 are written out, and the fixups are used for back-patching across passes. At the point the fixup is generated we don't have enough info to calculate the proper displacement, as the codegen hasn't fully run. That isn't the main issue, however. The main issue is that COFF seems to depend on the fixup data for other things, including identifying symbols in disassembly and supporting symbolic debugging information. If you generate the COFF without the fixup, you can't debug properly.
Ok, I think I have it. Updated 2.56 branch. I've removed the COFF relocations, amended the debug data where it's needed and added custom back-patching to update the RIP displacement before writing the data out. So far so good on my tests.
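The idea of patching the displacement at output time and dropping the relocation can be sketched roughly as below. This is a Python illustration under assumptions, not the actual UASM implementation: the Fixup fields and the resolve_same_section function are hypothetical names, and the trailing field models the bytes that follow the disp32 inside the same instruction (1 for an imm8, which is the case COFF expresses as REL32_1).

```python
import struct
from dataclasses import dataclass

@dataclass
class Fixup:
    loc_ofs: int       # offset of the disp32 field inside the section
    target_sec: str    # section that contains the target symbol
    target_ofs: int    # offset of the target symbol inside that section
    trailing: int = 0  # bytes after the disp32 field (1 for an imm8, as in REL32_1)

def resolve_same_section(sec_name, code, fixups):
    """Patch same-section RIP-relative fixups in place.

    Returns the fixups that still need a real relocation record.
    """
    remaining = []
    for fx in fixups:
        if fx.target_sec != sec_name:
            remaining.append(fx)  # cross-section: the linker has to resolve it
            continue
        next_insn = fx.loc_ofs + 4 + fx.trailing  # where RIP points at run time
        struct.pack_into("<i", code, fx.loc_ofs, fx.target_ofs - next_insn)
    return remaining

# lea rax,[lbl] / nop / lbl: nop, with the disp32 still zero (lbl is at offset 8)
text = bytearray.fromhex("488D0500000000" "90" "90")
left = resolve_same_section(".text", text, [Fixup(3, ".text", 8)])
print(text.hex(), len(left))
```

In this example the displacement becomes 1 and no relocation is left over. A fixup whose target lives in another section is returned unchanged, since its final distance is only known to the linker.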
Thanks a lot! There's still something to fix. I've tried current 2.56 branch with our project, and there is a crash in backptch.c:202. The "fixup2->sym" being dereferenced is 0. This line:
DebugMsg(("for sym=%s fixup loc %" I32_SPEC "X changed to %" I32_SPEC "X\n", fixup2->sym->name, fixup2->locofs - size, fixup2->locofs ));
Will double check, thanks! - If you just remove the DebugMsg line does it work?
Yes, so far it seems to work. I'll let you know if anything.
It was a silly debug message, I routinely go through and remove them to be honest. They're of little or no value, and in this case dereferencing a null pointer. silly.
Hi john, many thanks for implementing it. However, the following code still generates REL32 COFF relocation. Do you think this can be solved too?
.code
lea rax, [lbl] ; compiles to 488D0500000000 as expected
nop
lbl:
nop
end
dumpbin.exe says:
dumpbin.exe /relocations rip_rel.obj
Microsoft (R) COFF/PE Dumper Version 14.00.24210.0
Copyright (C) Microsoft Corporation. All rights reserved.
Dump of file rip_rel.obj
File Type: COFF OBJECT
RELOCATIONS #1
Symbol Symbol
Offset Type Applied To Index Name
-------- ---------------- ----------------- -------- ------
00000003 REL32 00000000 6 lbl
Sorry, I was too fast, the following code still generates REL32_1, I need to check my uasm build.
.code
cmp dword ptr [data_start], 1
nop
data_start DD 11223344h
end
John: Turns out I was mistakenly testing with master, instead of v2.56 branch. I have also supplied mazegen with this incorrect version, so please disregard his last messages as well. Sorry, I'll test with proper 2.56 and let you know soon.
Results of testing with simple example.
rip_rel.asm contains:
.code
nop
lbl1:
nop
lea rax, [lbl1]
lea rax, [lbl2]
nop
lbl2:
nop
end
Building it with 2.56:
c:\dev\_tools\uasm-2.56\UASM\bin>uasm64 -win64 /Fl rip_rel.asm
UASM v2.56, Oct 11 2022, Masm-compatible assembler.
Portions Copyright (c) 1992-2002 Sybase, Inc. All Rights Reserved.
Source code is available under the Sybase Open Watcom Public License.
size shrank from 13 to 12 in pass 2
rip_rel.asm: 10 lines, 3 passes, 5 ms, 0 warnings, 0 errors
126 items in symbol table, expected 126
max items in a line=1, lines with 0/1/<=5/<=10 items=8066/126/0/0,
2174 items in resw table, max items/line=6 [0=619 1=672 397 156 44 8 4 0]
invokation CATSTR=0 SUBSTR=0 SIZESTR=0 INSTR=0 EQU(text)=0
memory used: 402 kB
The resulting .obj seems to have correct displacement and no relocations:
c:\dev\_tools\uasm-2.56\UASM\bin>dumpbin /nologo /disasm /relocations rip_rel.obj
Dump of file rip_rel.obj
File Type: COFF OBJECT
0000000000000000: 90 nop
0000000000000001: 90 nop
0000000000000002: 48 8D 05 F8 FF FF lea rax,[0000000000000001h]
FF
0000000000000009: 48 8D 05 01 00 00 lea rax,[0000000000000011h]
00
0000000000000010: 90 nop
0000000000000011: 90 nop
Summary
0 .data
12 .text
The only problem seems to be the listing file, which reports displacement 0:
UASM v2.56, Oct 11 2022, Masm-compatible assembler.
rip_rel.asm
.code
00000000 90 nop
00000001 lbl1:
00000001 90 nop
00000002 488D0500000000 lea rax, [lbl1]
00000009 488D0500000000 lea rax, [lbl2]
00000010 90 nop
00000011 lbl2:
00000011 90 nop
end
The listing still needs some fixing.
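As a cross-check of the dumpbin disassembly above, the displacement bytes can be decoded by hand. The short Python sketch below (the helper name is made up, and it assumes no immediate follows the disp32) recomputes the effective offsets from the raw bytes.

```python
import struct

def rip_target(insn_ofs: int, insn: bytes, disp_ofs: int) -> int:
    """Effective address of a RIP-relative operand, as a section offset."""
    (disp,) = struct.unpack_from("<i", insn, disp_ofs)
    return insn_ofs + len(insn) + disp

# Bytes copied from the dumpbin output; the disp32 follows REX, opcode and ModRM.
first = bytes.fromhex("488D05F8FFFFFF")
second = bytes.fromhex("488D0501000000")
print(hex(rip_target(0x02, first, 3)))   # 0x1, the offset of lbl1
print(hex(rip_target(0x09, second, 3)))  # 0x11, the offset of lbl2
```

Both results match the label offsets in the listing, so the object code is right and only the listing output shows the unpatched zero displacement.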
Another weird problem. The current 2.56 version reports "symbol redefinition" errors, when there is no symbol redefinition. This seems to be somehow triggered by using PROC.
win.asm:
option casemap:none ;needed for windows.inc
include windows.inc
.code
xx PROC
RET
xx ENDP
end
Trying to build it with 2.56:
c:\dev\_tools\uasm-2.56\UASM\bin>uasm64 -win64 /Fl /Ic:\dev\_tools\uasm\WinInc\Include win.asm
UASM v2.56, Oct 11 2022, Masm-compatible assembler.
Portions Copyright (c) 1992-2002 Sybase, Inc. All Rights Reserved.
Source code is available under the Sybase Open Watcom Public License.
THREAD_PRIORITY_BELOW_NORMAL EQU ( THREAD_PRIORITY_LOWEST + 1 )
c:\dev\_tools\uasm\WinInc\Include\winbase.inc(508) : Error A2143: Symbol redefinition: THREAD_PRIORITY_BELOW_NORMAL
c:\dev\_tools\uasm\WinInc\Include\winbase.inc(508): Included by
c:\dev\_tools\uasm\WinInc\Include\windows.inc(112): Included by
win.asm(2): Main line code
THREAD_PRIORITY_ABOVE_NORMAL EQU ( THREAD_PRIORITY_HIGHEST - 1 )
c:\dev\_tools\uasm\WinInc\Include\winbase.inc(511) : Error A2143: Symbol redefinition: THREAD_PRIORITY_ABOVE_NORMAL
c:\dev\_tools\uasm\WinInc\Include\winbase.inc(511): Included by
c:\dev\_tools\uasm\WinInc\Include\windows.inc(112): Included by
win.asm(2): Main line code
win.asm: 10 lines, 2 passes, 670 ms, 0 warnings, 2 errors
36416 items in symbol table, expected 36416
max items in a line=17, lines with 0/1/<=5/<=10 items=94/429/5304/2316,
2174 items in resw table, max items/line=6 [0=619 1=672 397 156 44 8 4 0]
invokation CATSTR=0 SUBSTR=0 SIZESTR=0 INSTR=0 EQU(text)=13994
memory used: 591016 kB
If you comment out the RET, the error disappears. The same example builds fine with UASM 2.52.
I get many more of those false "symbol redefinition" errors with our full codebase. This is the simplest case I've been able to isolate the problem to, without diving into WinInc internals.
The listing wouldn’t have any knowledge about the fixup, there are still fixups, it’s just at the final COFF output stage that they’re excluded and the RIP fix applied.
It might be possible to change this behaviour, but it would be a very big ask.
Thanks. I had noticed this one as well, I didn’t have a very simple reproducible case for it though.
I believe the DUPLICATE SYMBOL issue is now resolved, it was a result of the change to improve the listing outputs in another issue. Please try again and let me know.
No more "duplicate symbol" errors with our codebase. This problem seems resolved.
Tomorrow I'll try to switch to the RIP-relative addressing project-wide, and we'll see if any new errors pop out. Fingers crossed.
Could this be some recent regression?
win.asm:
end
Causes:
c:\dev\_tools\uasm-2.56\UASM\bin>uasm64 -win64 win.asm -Fl win.lst
UASM v2.56, Oct 11 2022, Masm-compatible assembler.
Portions Copyright (c) 1992-2002 Sybase, Inc. All Rights Reserved.
Source code is available under the Sybase Open Watcom Public License.
win.asm: 1 lines, 2 passes, 1 ms, 0 warnings, 0 errors
124 items in symbol table, expected 124
max items in a line=1, lines with 0/1/<=5/<=10 items=8068/124/0/0,
2174 items in resw table, max items/line=6 [0=619 1=672 397 156 44 8 4 0]
invokation CATSTR=0 SUBSTR=0 SIZESTR=0 INSTR=0 EQU(text)=0
memory used: 401 kB
STDFUNC MACRO method:REQ, retType:REQ, protoDef:VARARG
win.lst(0) : Error A2099: END directive required at end of file <----------------
win.lst: 0 lines, 1 passes, 3 ms, 0 warnings, 1 errors
124 items in symbol table, expected 124
max items in a line=1, lines with 0/1/<=5/<=10 items=8068/124/0/0,
2174 items in resw table, max items/line=6 [0=619 1=672 397 156 44 8 4 0]
invokation CATSTR=0 SUBSTR=0 SIZESTR=0 INSTR=0 EQU(text)=0
memory used: 401 kB
Same thing happens with listing for basically any file I've tried. Latest commit was "update sysv abi invoke".
Also, it would make sense to disallow addressing like [rip + label + 2*rax]. At the moment, this is the same as [label + 2*rax], i.e. standard base+scale*index+displacement addressing, with a COFF relocation on the displacement and nothing RIP-relative there.
I think it's clearly a bug because lea rax, [rip+rax] assembles to lea rax, [rax+00000000] with a SIB byte (48 8D 04 05 00000000).
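The difference between the two encodings is in the ModRM byte: in 64-bit mode, mod=00 with rm=101 means RIP+disp32, while rm=100 means a SIB byte follows, and a SIB byte has no way to name RIP. A small Python sketch that tells the two apart follows. The function name is made up, it only handles the mod=00 forms relevant here, and it ignores REX extension bits.

```python
def classify_modrm(encoding: bytes, modrm_ofs: int) -> str:
    """Classify the memory operand form of a 64-bit-mode instruction.

    Only the mod == 00 cases relevant to this thread are handled.
    """
    modrm = encoding[modrm_ofs]
    mod, rm = modrm >> 6, modrm & 0b111
    if mod != 0b00:
        return "other"
    if rm == 0b101:
        return "rip+disp32"
    if rm == 0b100:
        sib = encoding[modrm_ofs + 1]
        scale, index, base = 1 << (sib >> 6), (sib >> 3) & 0b111, sib & 0b111
        if base == 0b101:
            # No base register; index 100 (without REX.X) means "no index".
            return "disp32" if index == 0b100 else f"index*{scale}+disp32"
        return "sib"
    return "reg-indirect"

print(classify_modrm(bytes.fromhex("488D0500000000"), 2))    # rip+disp32
print(classify_modrm(bytes.fromhex("488D040500000000"), 2))  # index*1+disp32
```

The second byte sequence is the one quoted above: ModRM 04 selects a SIB byte, and SIB 05 encodes index rax, scale 1, no base, plus a disp32. So nothing RIP-relative is left in the encoding.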
Are you sure this is the latest from the 2.56 branch? I can't recreate the listing problem, and I've tried on a number of sources now. Oddly, however, I can't use the assembler with the command-line options as you specify them; that doesn't work at all. It needs to be: uasm -win64 -Fl=out.lst win.asm
Let me know how that goes. I'll investigate preventing that EA mode. I believe the only valid option is [RIP + disp], so that would include a label. Adding an index or scale shouldn't be allowed?
Now I understand what was happening. I must have had win.lst left over from a previous (correct) command, and then I wrongly assumed /Fl takes its operand in a getopt-y way (with a space instead of '='). So, with my command line, UASM first assembled win.asm, then it somehow ignored -Fl without a value, and tried to assemble win.lst. That failed because it couldn't find an 'end' directive there. Sorry about another false alarm. I've been out of touch with these tools for some time, and now I make stupid mistakes like this.
Don't worry.. me too, I have so little time these days when I find a gap during the year I do a sudden burst of asm related work. Things are further compounded because I'm also doing stuff on 68k (Amiga) assembler and then I start trying to put operands the wrong way around :)
I've updated the branch again, it should now prevent [ RIP+REG ] from being encoded in any way. There are a few combinations which are technically valid, although not very useful, where you can do RIP+DISP, like lea rax,[rip+lbl], but that requires NOLARGEADDRESSAWARE and an ADDR32 reloc.
Hi johnsa, thanks for this feature, it works well :) and in many cases it actually makes the .obj much smaller because there are far fewer relocations now.
Hi,
the following code generates RIP-relative addressing with REL32_1 COFF relocation:
The relocation must be generated because data_start lies in another section.
The following code generates RIP-relative addressing and also REL32_1 relocation:
However, in this case the COFF relocation is unnecessary because data_start is in the same section. This code is effectively the same as:
This code works right and generates no relocations.
Can you get rid of the COFF relocation in this case?