Terraspace / UASM

UASM - Macro Assembler
http://www.terraspace.co.uk/uasm.html

Structure at the beginning of PROC is generated before prologue #116

Closed vid512 closed 2 years ago

vid512 commented 5 years ago

In the following example, the structure CODEMARK (which encodes the "ud1" instruction) is emitted before the PROC prologue. I would expect it to come after the prologue:

option stackbase:rsp
option frame:auto
.code

CODEMARK struct byte
 member1 dw 0B90Fh   ;0F B9 = opcode of "ud1" instruction
CODEMARK ends

test1 PROC USES rcx rdx
  CODEMARK <>
  jmp r
r: RET
test1 ENDP

end

The resulting object file contains:

.text:0000000000000000 test1           proc near
.text:0000000000000000                 ud1
.text:0000000000000002                 push    rcx
.text:0000000000000003                 push    rdx
.text:0000000000000004                 sub     rsp, 8
.text:0000000000000008                 jmp     short $+2
.text:000000000000000A loc_A:                                  ; CODE XREF: test1+8↑j
.text:000000000000000A                 add     rsp, 8
.text:000000000000000E                 pop     rdx
.text:000000000000000F                 pop     rcx
.text:0000000000000010                 retn
.text:0000000000000010 test1           endp

john-terraspace commented 5 years ago

I’m not sure if I’d really expect any defined behaviour from trying to do that. I can see if we can change that behaviour but I guess the question is why would you do that using a structure? 😊

mazegen commented 5 years ago

Is this related?

04/17/2013, v2.10:

Bugfixes:

  • a struct definition just after PROC or LOCAL may have triggered prologue generation, and thus error "statement not allowed inside struct definition" did appear.

vid512 commented 5 years ago

We are parsing and manipulating the code with a custom program. For that utility, we insert various marks and hints into the code. They tell the program how to handle certain parts of the code. Those marks/hints are implemented as a structure that is instantiated inside the code.

john-terraspace commented 4 years ago

Instead of a struct, can you try creating a MACRO which writes the mark out as db's (raw data)? When inserted into the proc, the macro should be evaluated first after the prologue.

mazegen commented 4 years ago

Hi john, I work with vid512 on the same project.

Both DB and instantiating a STRUCT generate byte(s). I wonder why only DB triggers prologue generation. Also, according to the changelog, it seems the expected behavior changed with v2.10. I didn't check the source code, but was it really a good decision to change it back then?

(Anyway, I really appreciate all your hard work, UASM is a great tool.)

john-terraspace commented 4 years ago

2.10 was before I took over (Japheth days), so I’m not sure what the reasoning would be. I assume there was a good reason, as he was quite pedantic about everything!

I think the reason is that a macro which expands to a db is evaluated after the prologue is complete. I think prologue generation is triggered by the first expandable item encountered after the PROC; for example, LOCALs need to be evaluated before the prologue so that it knows what to do.

You could possibly put the struct declaration into a macro and that would also work, as long as it’s not a raw/direct data declaration it should come post-prologue.

vid512 commented 4 years ago

It turns out this is not just a problem with structures. Any data defined with DB right after the LOCALs is moved before the prologue code in the same way.

option stackbase:rsp
option frame:auto
.code
testDB PROC args:VARARG
LOCAL loc1:qword
  db 0CCh
  RET
testDB ENDP
end

becomes:

.text:0000000000000000 testDB          proc near
.text:0000000000000000                 int     3
.text:0000000000000001                 sub     rsp, 8
.text:0000000000000005                 add     rsp, 8
.text:0000000000000009                 retn
.text:0000000000000009 testDB          endp

It could make some sense to allow things like a structure declaration (STRUCT xyz ... ENDS) in the middle of LOCALs by moving the prologue after the structure declaration. But I can't see any reason why this should be done with a data definition. If defining instructions with DB is legal, then this is definitely a bug.

It seems to me that the logic for determining when the prologue code should be emitted simply doesn't check for data definitions the way it does for instructions, and skips over them, forgetting that instructions can be defined with data directives (or structure instances) as well.

vid512 commented 4 years ago

Maybe this behavior was meant as a (bad) solution to allow code like this:

xyz PROC
LOCAL loc1:byte
.data
well_placed_string db "Here I am!",0
.code
LOCAL loc2:byte

john-terraspace commented 4 years ago

Possibly, but that looks really dumb and no one should do stuff like that!

Did you check whether the data item works when embedded in a macro (out of curiosity)? Either way, I agree that data items should be emitted post-prologue.

vid512 commented 4 years ago

The macro doesn't help. I got the same result with the following code:

option stackbase:rsp
option frame:auto
.code
int3 MACRO
  db 0CCh
ENDM
testDB PROC args:VARARG
LOCAL loc1:qword
  int3
  RET
testDB ENDP
end

The .data in the middle of LOCALs could theoretically be a by-product of some more general macro, which assumes it can define data wherever it is used. I can't think of a 100% realistic example, nor do I think this has to be supported. But at the very least, it should be well-defined what is and what is not allowed in the middle of LOCAL definitions.

john-terraspace commented 4 years ago

Agreed. Honestly, anything apart from LOCAL should trigger prologue generation; that should suit everyone, I think.

john-terraspace commented 3 years ago

This should now be available in 2.51 (in the branch and in the binaries on the site). An anonymous data declaration (i.e. db 0xff, dd, etc.) should now be generated after the prologue. Give it a test and let me know.

vid512 commented 3 years ago

Unfortunately, it's not fixed yet.

It seems this was only fixed for data defined directly with 'db', but not for data defined by a structure instance. The example from the first post still produces incorrect code.

Another example:

option casemap:none    ;needed for windows.inc
option fieldalign:8
option stackbase:rsp
option win64:2
OPTION literals:on

.code
code_start:

hint STRUCT
 hint_opcd DB ?
hint ENDS

func1 PROC
  LOCAL a:DWORD
  hint <90h>
  RET
func1 ENDP

end

Result:

.text:0000000000000000 func1           proc near               ; DATA XREF: .pdata:0000000000000020↓o
.text:0000000000000000                 nop
.text:0000000000000001                 sub     rsp, 8
.text:0000000000000005                 add     rsp, 8
.text:0000000000000009                 retn
.text:0000000000000009 func1           endp

john-terraspace commented 3 years ago

Correct, as I mentioned, I only implemented direct definitions with 'db', 'dd', etc. These might (I didn’t check) also work when embedded in a macro. Would that not solve your problem (at least in the interim, without structs)?

vid512 commented 3 years ago

OK, I took it that "hint <90h>" counts as an "anonymous data declaration". Data in a macro works fine. I'll see if we can use this as a temporary workaround for the original problem. Thanks.

john-terraspace commented 3 years ago

Yeah, the parser has some very ugly hard-coded rules about the order in which certain elements must be processed, and I didn’t want to cause too many regressions in one go. Struct instances are handled in a different part of the parser, so that will need some more thought as to how it interacts with prologue generation.

vid512 commented 3 years ago

Just FYI, the original problem with structure has been worked around using the macro. Thanks.
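
For readers who land here looking for the workaround: the macro that was actually used is not shown in the thread, so the following is only a minimal sketch of what it might look like. It assumes UASM 2.51 or later, where data emitted with 'db' (directly or from a macro) is reported above to land after the prologue. The macro names EMIT_CODEMARK and EMIT_HINT are hypothetical; the byte values are taken from the examples in this issue.

```asm
; Sketch only, not the posters' actual code.
; Assumes UASM 2.51+; EMIT_CODEMARK and EMIT_HINT are made-up names.

option stackbase:rsp
option frame:auto
.code

; Emits the same two bytes as "CODEMARK <>" in the first post (0F B9).
EMIT_CODEMARK MACRO
  db 0Fh, 0B9h
ENDM

; Emits a one-byte hint, mirroring "hint <90h>" from the later example.
EMIT_HINT MACRO opcd:REQ
  db opcd
ENDM

test1 PROC USES rcx rdx
  EMIT_CODEMARK          ; intended to follow the generated prologue
  jmp r
r: RET
test1 ENDP

func1 PROC
  LOCAL a:DWORD
  EMIT_HINT 90h          ; intended to follow the generated prologue
  RET
func1 ENDP

end
```

Since struct instances were still emitted before the prologue at the time of the last comments, it is worth disassembling the object file to confirm the byte order on whichever UASM version is in use.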