Closed tkchia closed 6 years ago
Hello @tkchia, nice to see you are continuing to work on our favorite compiler!
I know my comment is not consistent with your current design for register allocation, but I have thought from the beginning that ES and DS are specialized registers and should not be used as general-purpose ones like AX...DX and SI...DI.
Especially if you want to implement a memory model where SS != DS. If DS does not hold the data segment, you have to save that value somewhere, and here we enter a gray zone.
In other words, if you don't preserve the segment registers, you tie the compiler design to the code execution context and the target OS, and IMHO that should be avoided for a compiler that targets many 16-bit OSes.
Hello @mfld-fr ,
Thanks for your feedback! Well, I am thinking of supporting protected mode precisely to allow ia16-elf-gcc to be usable in more situations, and for more OSes. ☺
I think it is safe to assume --- whatever the target OS --- that an IA-16 routine will either be running in real mode or protected mode. If GCC can also output code for protected mode, I think it can potentially output code for pretty much any environment (the rest is a matter of adapting to the particular OS ABI).
As things now stand, the output code does preserve %es by pushing and popping it on the stack. However, in between a pushw %es and a popw %es, it feels free to store arbitrary temporary values into %es. E.g.:
create_reloc_table:
pushw %si
pushw %di
pushw %es
pushw %bp
movw %sp, %bp
...
pushw %ax
call malloc
movw %ax, %es
addw $2, %sp
andw %ax, %ax
jne .L21
movw $.LC16, %ax
pushw %ax
call perror
pushw %es
call exit
.L21:
...
pushw %ss
popw %ds
movw %bp, %sp
popw %bp
popw %es
popw %di
popw %si
ret
This does not work when the CPU is in protected mode, since the CPU will #GP if something other than a valid selector is loaded into %ds or %es.
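To see why an arbitrary value fails, it may help to recall how protected mode interprets a 16-bit value loaded into a segment register: as a descriptor table index plus two small fields, not as a paragraph address. The following is a minimal sketch in C of that decoding; the struct and function names are made up for illustration.

```c
#include <stdint.h>

/* Fields of a protected-mode segment selector (names are illustrative). */
struct selector_fields {
    uint16_t index; /* bits 15..3: entry number in the descriptor table */
    uint8_t  ti;    /* bit 2: 0 = GDT, 1 = LDT */
    uint8_t  rpl;   /* bits 1..0: requested privilege level */
};

/* Split a raw 16-bit value the way the CPU does when the value is
   loaded into a segment register in protected mode. */
struct selector_fields decode_selector(uint16_t raw)
{
    struct selector_fields f;
    f.index = (uint16_t)(raw >> 3);
    f.ti    = (uint8_t)((raw >> 2) & 1u);
    f.rpl   = (uint8_t)(raw & 3u);
    return f;
}
```

For instance, if malloc happened to return 0x1234 and that value were moved into %es, the CPU would treat it as a request for entry 0x246 of the LDT, which most likely does not describe a usable data segment.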
Hello @mfld-fr ,
I have come up with a patch to allow protected mode output (basically, I created a separate "partial 16-bit integer" operand type (PHImode) in the back-end, which is restricted to holding values meant for segment registers).
The patch is close to functioning, but it still makes GCC crash on some existing test cases, so it will need a bit more work.
Thank you!
Mmm... in the example you gave above, I agree that you preserve %es and %ds, but you assume that %ds = %ss in order to restore %ds:
pushw %ss
popw %ds
Not only does this fail in protected mode before this step, it also fails if the code is intended for a memory model where %ss != %ds.
This memory model would be nice to have, because today we are constrained to have the stack in the same segment as the data, which limits what the OS can do for powerful memory and task management.
This is why I suggest not using %ds or %es to store a non-segment value, even in real mode. That would be simpler overall.
Hello @mfld-fr ,
> This memory model would be nice to have, because today we are constrained to have the stack in the same segment as the data
Yes, this is another problem I am still racking my brain over (as witnessed by the discussion at #19). Unfortunately, it seems that the issue is about more than just pinning down the value of %ds: the GCC middle-end currently assumes that automatic (stack) variables live in the same address space as static storage variables, heap variables, etc.
Thank you!
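The consequence of that assumption can be illustrated with real-mode address arithmetic. A near pointer is only a 16-bit offset, so the byte it names depends on which segment register is used to dereference it. The sketch below is a plain C model, not compiler output, and the helper names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Real-mode address formation: segment * 16 + offset.
   (The 8086's wrap-around at 1 MiB is not modeled here.) */
uint32_t linear_address(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

/* True if a near pointer to a stack variable (an offset relative to %ss)
   still names the same byte when it is dereferenced through %ds. */
bool near_ptr_valid_via_ds(uint16_t ss, uint16_t ds, uint16_t off)
{
    return linear_address(ss, off) == linear_address(ds, off);
}
```

With %ss = %ds = 0x2000 and offset 0xFFF0, both views reach linear address 0x2FFF0. With %ss = 0x3000 the stack variable sits at 0x3FFF0, while a %ds-relative access with the same offset still reads 0x2FFF0.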
I've been looking at this from the standpoint of protected mode this whole time. I'm pretty sure there's nothing that explicitly prevents DS and ES from being used as general-purpose registers (and in particular DS)—it's just not very efficient for many use-cases.
Hello @zfigura ,
> I'm pretty sure there's nothing that explicitly prevents DS and ES from being used as general-purpose registers
To confirm that I had not been remembering things wrongly, I decided to double-check against the Intel documentation on this issue. Page 4-38 (on the mov instruction) suggests that the CPU does check whatever is loaded into %ds or %es, and moreover the check occurs at the time of this load (not later). A mov instruction will fail with an exception in protected mode:
> If segment selector index is outside descriptor table limits. ...
> If the DS, ES, FS, or GS register is being loaded and the segment pointed to is not a data or readable code segment.
> If the DS, ES, FS, or GS register is being loaded and the segment pointed to is a data or nonconforming code segment, but both the RPL and the CPL are greater than the DPL. ...
> If the DS, ES, FS, or GS register is being loaded and the segment pointed to is marked not present.
To really confirm that this is how things work, I took a GitHub Gist of mine, which creates a 16-bit LDT descriptor under (64-bit) Linux and runs stuff from it, and modified it to load invalid values into %ds and %es:
"movw %%cs, %%ax; "
"xorw %%bx, %%bx; " /* <- */
"movw %%bx, %%ds; " /* <- */
"movw %%ax, %%ds; "
"movw %%sp, %%es; " /* <- */
"movw %%ax, %%es; "
Thank you!
(One thing I admittedly did not know, before reading the Intel documentation, is that it is perfectly fine to load a "null" selector value (0x0000 to 0x0003) into a data segment register such as %ds or %es.)
Thank you!
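The load-time checks quoted from the Intel documentation, together with the null-selector case, can be summarized in a toy model. This is a simplified sketch in C under stated assumptions: it uses a single descriptor table (the TI bit is ignored), it omits the privilege (RPL/CPL/DPL) check, and all type and function names are invented for illustration.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy descriptor: only the properties relevant to the checks below. */
struct toy_descriptor {
    bool is_data;          /* data segment */
    bool is_readable_code; /* code segment with the readable bit set */
    bool present;          /* "present" bit */
};

enum load_result { LOAD_OK, LOAD_GP_FAULT, LOAD_NOT_PRESENT };

/* Model of loading `sel` into %ds or %es in protected mode.
   Simplifications: one table only (TI ignored), no privilege check. */
enum load_result load_data_segment(uint16_t sel,
                                   const struct toy_descriptor *table,
                                   size_t entries)
{
    if ((sel & 0xFFFCu) == 0)     /* null selector, 0x0000..0x0003 */
        return LOAD_OK;           /* the load itself succeeds */

    size_t index = (size_t)(sel >> 3);
    if (index >= entries)         /* outside descriptor table limits */
        return LOAD_GP_FAULT;

    const struct toy_descriptor *d = &table[index];
    if (!d->is_data && !d->is_readable_code)
        return LOAD_GP_FAULT;     /* not a data or readable code segment */
    if (!d->present)
        return LOAD_NOT_PRESENT;  /* segment marked not present */
    return LOAD_OK;
}
```

In this model a null selector loads without complaint; on real hardware it is a later memory access through that register that faults.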
@tkchia : so please consider not changing DS and ES as a preliminary step for your future hacking of the GCC middle-end. :wink:
As for the null selector value, I did not know that either, but it looks logical for easy NULL pointer management.
Hello @mfld-fr ,
> so please consider not changing DS and ES as a preliminary step for your future hacking of the GCC middle-end. :wink:
Well, now that there is some actual code making use of this GCC port, I guess I do need to pay a lot more attention to backward compatibility issues. ☺
Thank you!
Currently, gcc-ia16 outputs code which may not be suitable for running in 80286 protected mode, because the generated code may use %es and %ds to hold arbitrary values, and protected mode does not allow this.
I have been trying to implement a -mprotected-mode option, which, if it works, should make the output code use %es and %ds only when dereferencing pointers (and perhaps when explicitly specified as an __asm operand). The resulting code should thus be able to work even in protected mode.
Actually implementing this in GCC seems far from straightforward, however.