Closed elicn closed 2 years ago
Thanks. I will have a look ASAP.
From: Eli @.> Sent: Thursday, September 2, 2021 12:07:05 AM To: unicorn-engine/unicorn @.> Cc: Subscribed @.***> Subject: [unicorn-engine/unicorn] Some of the x86 registers Unicorn enumerates don't really exist (#1440)
Link to #1449
I removed the nonexistent registers you listed, but the FP[x] ones are x87 registers.
x87 doesn't have registers named FP. It does have ST0..ST7 and R0..R7; perhaps you are referring to the latter? If so, I believe that naming those registers FP0..FP7 would be weird, as there is no visible correlation; perhaps FP_R0..FP_R7 would be clearer.
Consider using a UC_X87_REG prefix for x87 registers, where these ones may be named UC_X87_REG_R0..UC_X87_REG_R7.
References:
Thanks for the reference. Re-open to fix later.
I double-checked that UC_X86_REG_FP[x] refers to the x87 ST registers. Would you like a rename as proposed? @aquynh
There are already registers named UC_X86_REG_ST0..UC_X86_REG_ST7.
Does it mean both sets of registers refer to the same arch registers?
I’m afraid that’s the case. I will have a look later.
I think those FP registers can be removed.
For the record, referencing cr1 will also throw #UD. See Intel SDM, Vol. 2B, 4-40: MOV—Move to/from Control Registers. Sneaky one, eh?
This is indeed a tricky one. Even though CR1 cannot be accessed by software, it does exist [see Intel SDM vol. 3 chapter 2.2] and may be accessed by microcode. Unicorn may choose to hold its state (even though I am not sure it is documented anywhere visible outside Intel) and make it available for users to directly read or write in their scripts.
Bottom line: since CR1 is technically implemented but inaccessible through emulated assembly instructions, I guess it is up to Unicorn to decide whether to keep it or not. Both approaches make sense.
For that matter, cr0-cr15 are all addressable. From what I can tell, cr1 only exists insofar as it's labeled reserved.
It's true that CR0-CR15 are all addressable [i.e. may be encoded as an instruction operand], but that is not what I was referring to.
From what I recall, CR1 does exist, but it is not accessible by software (that is, by macro instructions), as opposed to CR7 and CR9-CR15, which are inaccessible because they do not exist.
From a software perspective I agree that there is not much of a difference there, but there is a thin line between "inaccessible" and "does not exist". Since Unicorn is an emulator [and not a simulator] I don't see many reasons for Unicorn to keep it, but my issue was originally about non-existent registers. Since CR1 technically exists, I didn't list it.
All that being said, your comment does show that there should be a difference between Capstone and Unicorn on this one. While Unicorn doesn't have a reason to maintain those registers (or even define them, for that matter), Capstone still needs to be able to decode instructions that have them as their operands.
> From what I recall CR1 does exist
Well, it's reserved, sure. Its existence is a matter of speculation. It's not merely inaccessible: there is a profound lack of evidence supporting the hypothesis that it takes up any more space on silicon than the rest that are specified to throw #UD.
> Capstone still needs to be able to decode instructions that have them as their operands.
I can agree. But entertain this: what exactly will Capstone disassemble it into? Architecturally speaking, referring to cr1 is ill-formed. It wouldn't exactly be x86. 😛
If these registers are addressable, they are encodable, and Capstone should be able to disassemble instructions with such funny operands. For example, mov rax, cr9 assembles to 44 0F 20 C8 even though we both agree that CR9 is hollow. Since these instructions don't make much sense, they will generate a #UD once executed. In fact, the SDM is full of weird instruction / operand / mode combos that result in #UD for being architectural nonsense.
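To make the encoding point concrete, here is a minimal sketch in Python of how `mov r64, crN` is assembled in 64-bit mode: opcode `0F 20 /r`, with the control register in ModRM.reg (extended by REX.R) and the general-purpose register in ModRM.rm (extended by REX.B). The function name `encode_mov_r64_from_cr` is made up for this illustration and is not part of Capstone or Unicorn.

```python
def encode_mov_r64_from_cr(gpr: int, cr: int) -> bytes:
    """Encode `mov r64, crN` for 64-bit mode (hypothetical helper).

    gpr: general-purpose register index (0 = rax ... 15 = r15)
    cr:  control register index (0..15); every index is encodable,
         whether or not the register is architecturally implemented.
    """
    if not (0 <= gpr <= 15 and 0 <= cr <= 15):
        raise ValueError("register index out of range")

    rex = 0x40
    if cr >= 8:
        rex |= 0x04  # REX.R extends ModRM.reg (the control register)
    if gpr >= 8:
        rex |= 0x01  # REX.B extends ModRM.rm (the general-purpose register)

    # mod is fixed to 11b; reg selects the CR, rm selects the GPR.
    modrm = 0xC0 | ((cr & 7) << 3) | (gpr & 7)

    # Emit the REX prefix only when one of its extension bits is needed.
    prefix = bytes([rex]) if rex != 0x40 else b""
    return prefix + bytes([0x0F, 0x20, modrm])


# mov rax, cr9 -> matches the bytes quoted above
print(encode_mov_r64_from_cr(0, 9).hex(" ").upper())  # 44 0F 20 C8
```

The encoder accepts every index from 0 to 15, which is the sense in which these registers are "addressable": the byte sequence is well-formed even when executing it would raise #UD.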
CR0-15 and DR0-15 are encodable operands, regardless of whether some of them (e.g. CR1, CR5-7, CR9-15, ...) actually exist or not. Instead of "removing" these from the lib, I would encourage an agnostic approach: do not make assumptions about the existence or lack thereof of these registers. We know for a fact that cr0, cr2, cr3, cr4, and cr8 exist; and we know that dr0-3 and dr7 exist (and dr6 depending on conditions). I'd strongly discourage removing these (currently) non-existent registers. They may exist at some point in the future. Just throw a #UD on them when we encounter them.
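As a rough illustration of that agnostic approach, the sketch below keeps all sixteen control register indices valid as identifiers and only faults on access to the ones that are not implemented. It is not Unicorn's code or API: the class `ControlRegisters`, the exception `UndefinedOpcode`, and the set `IMPLEMENTED_CRS` are hypothetical names, and the implemented set is simply the one listed in the comment above.

```python
class UndefinedOpcode(Exception):
    """Stands in for the x86 #UD exception in this sketch."""


# Control registers the comment above lists as known to exist (assumption
# taken from the thread, not from any library).
IMPLEMENTED_CRS = frozenset({0, 2, 3, 4, 8})


class ControlRegisters:
    """Hypothetical register file: all of cr0..cr15 are addressable,
    but only the implemented ones hold state."""

    def __init__(self, implemented=IMPLEMENTED_CRS):
        self._values = {n: 0 for n in implemented}

    def _check(self, n: int) -> None:
        if not 0 <= n <= 15:
            raise ValueError(f"cr{n} is not encodable")
        if n not in self._values:
            # Encodable but not implemented: fault at access time.
            raise UndefinedOpcode(f"#UD: access to cr{n}")

    def read(self, n: int) -> int:
        self._check(n)
        return self._values[n]

    def write(self, n: int, value: int) -> None:
        self._check(n)
        self._values[n] = value
```

If a register such as cr9 became real in a future architecture revision, only the implemented set would need to change; the identifiers and the access path would stay the same.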
Closing since the PR has been merged.
@ethindp Sorry, I missed your comment. Currently, we skip these registers to stay compatible with earlier versions. If some of them become available later, we can easily add them back.
Available control registers should be CR0-CR4, CR8. The following registers do not exist:
UC_X86_REG_CR5
UC_X86_REG_CR6
UC_X86_REG_CR7
UC_X86_REG_CR9
UC_X86_REG_CR10
UC_X86_REG_CR11
UC_X86_REG_CR12
UC_X86_REG_CR13
UC_X86_REG_CR14
UC_X86_REG_CR15
See:
Available debug registers should be DR0-DR7 (though DR4 and DR5 are not available, they are defined by the arch). The following registers do not exist:
UC_X86_REG_DR8
UC_X86_REG_DR9
UC_X86_REG_DR10
UC_X86_REG_DR11
UC_X86_REG_DR12
UC_X86_REG_DR13
UC_X86_REG_DR14
UC_X86_REG_DR15
See:
The following registers do not seem to exist:
UC_X86_REG_EIZ
UC_X86_REG_RIZ
UC_X86_REG_FP0
UC_X86_REG_FP1
UC_X86_REG_FP2
UC_X86_REG_FP3
UC_X86_REG_FP4
UC_X86_REG_FP5
UC_X86_REG_FP6
UC_X86_REG_FP7