tkchia / gcc-ia16

Fork of Lambertsen & Jenner (& al.)'s IA-16 (Intel 16-bit x86) port of GNU compilers ― added far pointers & more • use https://github.com/tkchia/build-ia16 to build • Ubuntu binaries at https://launchpad.net/%7Etkchia/+archive/ubuntu/build-ia16/ • DJGPP/MS-DOS binaries at https://gitlab.com/tkchia/build-ia16/-/releases • mirror of https://gitlab.com/tkchia/gcc-ia16
GNU General Public License v2.0

How's structure and pointer to structure passed to a function in gcc-ia16? #65

Closed ladmanj closed 3 years ago

ladmanj commented 3 years ago

I am porting a small piece of foreign code. It builds a fairly large struct (15 × 16-bit words) by pushing data onto the stack in asm, and then calls a C function that expects a pointer to the struct as its one and only argument.

The original code was, presumably, written for ordinary 32-bit gcc.

I have doubts whether this can work with gcc-ia16 as well.

How's structure and pointer to structure passed to a function in gcc-ia16?

tkchia commented 3 years ago

Hello @ladmanj,

How's structure and pointer to structure passed to a function in gcc-ia16?

Not much differently from x86-32 GCC, actually. By default (cdecl calling convention), both structures and pointers to structures are passed on the stack, and the caller needs to clean up the stack afterwards. gcc-ia16 also supports other calling conventions, such as the stdcall and regparmcall attributes used for f4 and f5 below.

You can try compiling the program below with ia16-elf-gcc -S -O3 to see what output the compiler generates.

typedef struct
  {
    int a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p;
  }
some_struct_t;

int f1 (some_struct_t s);
int f2 (some_struct_t *ps);
int f3 (some_struct_t __far *ps);
int f4 (some_struct_t *ps) __attribute__ ((stdcall));
int f5 (some_struct_t *ps) __attribute__ ((regparmcall));

int f6 (int a)
{
  some_struct_t s
    = { a, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, a + 1};
  return f1 (s);
}

int f7 (int a)
{
  some_struct_t s
    = { a, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, a + 1};
  return f2 (&s);
}

int f8 (int a)
{
  some_struct_t s
    = { a, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, a + 1};
  return f3 (&s);
}

int f9 (int a)
{
  some_struct_t s
    = { a, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, a + 1};
  return f4 (&s);
}

int f10 (int a)
{
  some_struct_t s
    = { a, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, a + 1};
  return f5 (&s);
}
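
As a rough, host-compilable illustration of the difference between f1 and f2 above (a sketch, not gcc-ia16 output): the struct below uses int16_t to mimic the 16-bit int of gcc-ia16, and the names some_struct16_t, by_value and by_pointer are made up for this example. Passing by value gives the callee its own 32-byte copy, while passing a pointer lets the callee modify the caller's object.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for some_struct_t with explicitly 16-bit members, so that the
   size matches what a 16-bit int gives (16 x 2 = 32 bytes).  */
typedef struct
  {
    int16_t a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p;
  }
some_struct16_t;

/* By value: the callee works on a private copy of all 32 bytes.  */
int16_t by_value (some_struct16_t s)
{
  s.a = 99;          /* changes only the copy */
  return s.a;
}

/* By pointer: only an address is passed; the callee sees the original.  */
int16_t by_pointer (some_struct16_t *ps)
{
  ps->a = 99;        /* changes the caller's object */
  return ps->a;
}
```

Hand-written asm that pushes the words itself and then pushes the stack pointer, as described in the question, corresponds to the by_pointer case.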

Thank you!

ladmanj commented 3 years ago

Thank you for the summary. Now I was able to find the bug in my asm, and the output code listing now makes sense.

Unfortunately there is a new problem: the ia16-gcc compiled program is run by an interrupt handler. The handler saves the state of the whole processor and passes control to the C function, but the function is compiled in such a way that it assumes a particular value of the DS register to find its data (code is in ROM, data both in ROM and RAM).

I am an x86 asm newbie, so maybe I'm wrong, but my .data segment begins at 0000:0400 and the .text is at f000:a700. If I understand the code correctly, it assumes DS=0xa700 as well:

fb0b9:       c7 06 00 5d 05 00       mov    WORD PTR ds:0x5d00,0x5
fb0bf:       bb 00 5d                mov    bx,0x5d00
fb0c2:       8b 44 14                mov    ax,WORD PTR [si+0x14]
fb0c5:       89 47 02                mov    WORD PTR [bx+0x2],ax
fb0c8:       89 57 04                mov    WORD PTR [bx+0x4],dx

If 0x5d00 is a signed displacement, then ds:0x5d00 points to 0x400 if and only if DS has the value I wrote before, 0xa700.

I can't find any piece of code which sets this (or any) value of DS (permanently, because some functions are saving the DS on stack and then restoring it).

I can of course set the value in the interrupt handler by myself, but is it correct? What is the usual way the C compiled code gets the right values in the segment registers?

Thank you very much

tkchia commented 3 years ago

Hello @ladmanj,

Can you tell me more details about the code you are trying to port? If you tell me more, I might be able to give more useful suggestions.

Meanwhile, writing ROM code with gcc-ia16 is probably best considered an advanced topic, if you ask me. You will need to do much more legwork than writing just a normal MS-DOS application --- most probably you need to write your own linker script as well.

My humble advice is, if you would like to gain a better idea of how x86-16 addressing works, perhaps you can first try writing up something that is easier to debug, then work from there. 🙂

I can of course set the value in the interrupt handler by myself, but is it correct? What is the usual way the C compiled code gets the right values in the segment registers?

(This is for real mode; protected mode works a bit differently, though not that much differently.)

In general, an interrupt service routine (ISR) can be called from almost anywhere, e.g. the BIOS, or MS-DOS (or some other OS), or some Terminate-and-Stay-Resident program, etc. These can reside anywhere in the 1 MiB real mode address space, which means the ISR can in general be called with any value of ds, es, ss, etc.

(The interrupted code's cs, ip, and flags register are pushed onto the stack (pointed to by ss:sp; sp is decremented), then the processor disables interrupts (sets IF in flags to 0) and switches cs:ip to point to the ISR, so it can start running.)

So yes, the normal procedure is that the ISR will set ds to point to its own data segment, each time it is triggered. It also needs to preserve the interrupted code's ds --- probably on the (unknown) stack --- so that, when it is finished, it can correctly resume whatever interrupted program was running. The ISR can either decide to switch to a different stack (by changing ss:sp), or continue using the unknown stack (but any compiled code needs to be specifically written to handle this).

Note: when addressing a data (or code) item through a segment:offset pair,

So for example, if we want to refer to a data structure at 0xf000:0xefc7, we might do so by setting es:bx = 0xf000:0xefc7, which means es = 0xf000 and bx = 0xefc7.
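
To make the segment:offset arithmetic concrete, here is a minimal C sketch (not from the thread; the helper name real_mode_phys is made up for illustration) of how an 8086/8088 in real mode forms a 20-bit physical address: the segment value times 16, plus the offset, wrapped to 20 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only: the 20-bit physical address
   that an 8086/8088 in real mode puts on the bus for seg:off.  */
uint32_t real_mode_phys (uint16_t seg, uint16_t off)
{
  uint32_t addr = ((uint32_t) seg << 4) + off;
  return addr & 0xfffffu;   /* only 20 address lines, so the sum wraps */
}
```

For the example above, real_mode_phys (0xf000, 0xefc7) gives 0xfefc7. The same byte is also reachable through other pairs, for instance 0xfefc:0x0007.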

Thank you!

ghaerr commented 3 years ago

Hello @ladmanj,

What is the usual way the C compiled code gets the right values in the segment registers?

As @tkchia points out, the segmented architecture of the 8086 can make things complicated when writing interrupt routines. To directly answer your question, though, the normal way compiled C code gets the right values in the segment registers is via two ways: 1) the program's startup code, usually written in assembler and placed in a file like crt0.S, sets the segment registers once, and then, other than ES, they're typically not changed (there are exceptions to this which we'll leave out for now). The other way 2) is via "interrupt entry" code, written in assembly, that actually gains control directly from the CPU interrupt processing, and then sets the segment registers as they need to be for the C function(s) to execute.

More details: almost all 8086 C compilers require that segment register SS == DS. This is a bit complicated to fully explain, so we'll leave that alone for now. Also, there are exceptions to that and compiler options to allow SS != DS, but I don't recommend that to start, as one needs to be very familiar with the code generated and other functions called. For gcc-ia16, the ES register doesn't require presetting; the compiler will emit code to save it before using it.

In option 1) above, the crt0.S startup code gets the value for DS/SS either directly from the OS, where it is usually preset, or, in some cases, the startup code can allocate its own segment for DS/SS to point to. For option 2), it is more complicated: one can either "run" on the interrupted program's stack, which has its own issues, because one doesn't know how "deep" it will allow pushing, and also, on many systems it cannot be automatically assumed that SS == DS at interrupt time. If it can be guaranteed by the OS that DS == SS (as in the case of ELKS OS interrupt handler routines), then this option simplifies matters considerably. Otherwise, for interrupt routines, one must write assembly code to switch stacks before calling the C code with SS == DS. The CS register can be used to retrieve a saved DS/SS from the currently executing code segment.

The business of writing code to switch stacks is a bit harder than it might seem. This is because not only do all the CPU registers need to be saved (somewhere, usually not on the interrupted stack), and not only SP needs to be "switched", but also SS; and then SS and DS must be set to the same value. After the interrupt routine, the whole process is reversed, and all the CPU registers are popped, along with SS and DS being set back to their original values.

The ELKS kernel has a stack switching routine that is very well written, although complicated. I wrote an article about how that stack switching works on the ELKS Wiki if you want more information.

Thank you!

ladmanj commented 3 years ago

Hi, I have found a control board from 1988. It is not PC-compatible, but it uses an 8088 CPU (originally there was a NEC V20, but it was dead). RAM is at 0-0x1fff and EPROM at 0x8000-0xffff on the physical address bus. In any case, there is no mode other than real mode.

I have partly reverse engineered the original code from the EPROM with Ghidra. The data segment is set to 0000:0000 and the code segment to f000:8000 (it uses the last mirror in the physically addressable space). I found a sufficiently large free chunk of EPROM, and that is where I place my code compiled by ia16-gcc. I know where the stack is and where the data are stored. I have prepared the linker script accordingly. I have patched the ROM-based interrupt table (it is copied at run time to the start of RAM) and the piece of code responsible for decoding special characters received from one serial port (an 82C51, not PC standard), so that it performs INT 3 when a special character is received, CTRL-Z or CTRL-X in particular.

Now it's possible to interrupt the original code, call the C compiled code and write to and receive from second serial port, then return the control to the original code (after I have added the DS=0xa70=base_address/16 to the interrupt handler).

I'm trying to port the gdb-stub, originally written for i386. gdb is somehow able to debug real mode programs, so maybe it will be able to work with that. I'm filling only the lower portions of the appropriate registers and leaving the non-existent registers untouched, but the data exchanged with gdb is unchanged.

Maybe it's completely ridiculous. In any case, my goal isn't to completely crack the original use of the device, it's only toy for me. I want to discover how the rest of hardware is used and then replace the whole code with my own one.

Maybe it is clear now: I have some experience with asm, linker scripts and so on, but until now I have had no experience with 8086 code. My domain is mostly microcontrollers.

Here is my interrupt code:

dbg_int_handler_3:
; no error code at all
; no interrupt vector, there's no way to signal it, let's compute it back from the address later
call     dbg_int_handler_common

dbg_int_handler_common:

PUSH    AX
PUSH    CX
PUSH    DX
PUSH    BX
PUSH    SP ; on the 8086/8088 this stores the already-decremented SP (286 and later store the initial value)
PUSH    BP
PUSH    SI
PUSH    DI
push    ds
push    es
push    ss

; Stack:
; - FLAGS
; - CS
; - IP
; - IP of vector
; - AX
; - CX
; - DX
; - BX
; - SP
; - BP
; - SI
; - DI
; - DS
; - ES
; - SS

mov bp,sp
push    bp  ; this creates what C function sees as pointer to structure with the register data
mov ax,(ENTRY & 0xffff)/16
mov ds,ax   ; C compiled code expect same base of data as the .text section address
call    dbg_int_handler

mov     sp, bp
pop     ss
pop     es
pop     ds
POP DI
POP SI
POP BP
POP AX ; no POP SP here, all it does is ADD SP, 2 (since AX will be overwritten later)
POP BX
POP DX
POP CX
POP AX
add sp,2    ; throw the vector IP
iret        ; return where it was interrupted
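
For reference, the frame built by the stub above can be written down as a C struct. This is only a sketch under stated assumptions: the struct name ia16_int_frame and its field names are made up here (the struct dbg_interrupt_state of the original stub is not shown in this thread), and the member order simply follows the push sequence, with the last-pushed word at the lowest address, which is where bp points.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8086 register frame; lowest address (last pushed) first.  */
struct ia16_int_frame
{
  uint16_t ss, es, ds;                       /* pushed last by the stub */
  uint16_t di, si, bp, sp, bx, dx, cx, ax;   /* general registers */
  uint16_t vector_ip;                        /* return address of the stub's "call" */
  uint16_t ip, cs, flags;                    /* pushed by the CPU on interrupt entry */
};

/* 15 words of 16 bits each, with no padding in between.  */
_Static_assert (sizeof (struct ia16_int_frame) == 15 * 2,
                "frame must be exactly 15 words");

/* Byte offset of the interrupted code's IP within the frame.  */
size_t frame_return_ip_offset (void)
{
  return offsetof (struct ia16_int_frame, ip);
}
```

The 30-byte size matches the "15x16bit word" structure mentioned at the start of this thread.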

Thanks

ghaerr commented 3 years ago

Hello @ladmanj,

Does your code above work?

As I mentioned, when you reset DS to your required value, it is very likely DS != SS, which will likely cause problems in your C interrupt handler, depending on the code produced for it. You can use ia16-elf-objdump -D -r -Mi8086 to see that code, and check whether (%bp) is used to reference any variables - indexing off the BP register uses the SS segment by default, while indexing using SI, DI or BX uses the DS segment. Any code that references your data segment variables via the stack will fail to do what is intended.
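
The default-segment rule can be modelled in a few lines of C. This is a simplified sketch, not a full addressing-mode decoder: the names base_reg, seg_regs and effective_phys are made up, and segment override prefixes are ignored. It only shows that the same 16-bit offset lands at different physical addresses when DS != SS.

```c
#include <assert.h>
#include <stdint.h>

enum base_reg { BASE_BX, BASE_SI, BASE_DI, BASE_BP };

struct seg_regs { uint16_t ds, ss; };

/* Physical address reached by [base + ...] with no segment override:
   BP-based operands default to SS, BX/SI/DI-based operands to DS.  */
uint32_t effective_phys (struct seg_regs r, enum base_reg base, uint16_t off)
{
  uint16_t seg = (base == BASE_BP) ? r.ss : r.ds;
  return (((uint32_t) seg << 4) + off) & 0xfffffu;
}
```

With ds = 0x0a70 and ss = 0xff00, offset 0x1700 resolves to 0x00700 through BP but to 0x0be00 through BX, so a pointer to a stack variable that the compiled code dereferences through BX would read the wrong memory.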

Also, the following:

pop ss

performs a stack switch, while the code continues to pop from (I'm guessing you are thinking) your original stack. The only reason this code could ever work is that the SS value popped is identical to SS. Bottom line, the pop ss should be removed, as your code always runs on the interrupted stack, and sets DS != SS for the C routine, which is problematic.

Thank you!

ladmanj commented 3 years ago

Hello @ghaerr,

Yes, this part of code works, at least it seems to. I am continuously watching output of objdump and trying to make sense of all i see there.

If the earlier piece of code calls this testing function:

void dbg_int_handler(struct dbg_interrupt_state *istate)
{
    uint16_t i;
    dbg_io_write_8(SERIAL_PORT + 1, 0x15); /* TXEN, RXEN, Clear ERR */
    for(i=0;i<256;i++)
    {
        dbg_serial_putchar(i & 0xff);
    }
    /*dbg_interrupt(istate);*/
    dbg_io_write_8(SERIAL_PORT + 1, 0x0);
}

I got the right bytes on my serial port (I am checking the hw step by step also, because it was in really bad condition when I started reviving it) and then it successfully returns to original ROM code. I can run it repeatedly by pressing CTRL-Z again and again.

If I let it run through the rest of the gdb-stub code (the commented-out function), there is still some problem: the packet it should send to the serial port is somehow damaged. My minicom terminal receives $, which is right, but then one or more unprintable characters. Maybe it is incorrect handling of the serial port again, or there is a problem related to this discussion.

Many thanks

tkchia commented 3 years ago

Hello @ladmanj,

Most probably, as @ghaerr pointed out, you need to switch stacks. Check out my implementation of the abort () function for the tiny memory model in my fork of newlib-ia16.

(This is in AT&T assembly language syntax, by the way: the destination operand of each mov instruction is on the right side; immediate operands are marked with $.)

The abort () code tries to work properly even if %ss:%sp happens to be wrong, and to achieve this, it switches to a special pre-allocated "abort stack".

Thank you!

ladmanj commented 3 years ago

Hello @tkchia,

Unfortunately I don't understand the abort code mentioned. I don't know if it's used when going to new code or returning to the old code.

I don't know why I should switch the stack, if I know where the stack is, that the SP is OK, and that there is most probably a good amount of room for my C program.
The whole memory size of the system is less than 65535 bytes, so segment switching makes no real sense.

But because I am attempting to port the gdb-stub, i want to preserve the original settings as far as possible.

Thanks for additional comments

ghaerr commented 3 years ago

I don't know why i should switch the stack

Because the compiler emits code that uses both BX and BP in indirect addressing modes, which use different segment registers. SS must equal DS; please re-read my discussion above. Read up on addressing modes in sections 1.2.2, 1.2.3, and 1.2.4 in https://www.ic.unicamp.br/~celio/mc404s2-03/addr_modes/intel_addr.html. As we have mentioned, 8086 addressing using DS and SS is more complicated than it seems.

ladmanj commented 3 years ago

Hello @ghaerr ,

But there is a problem: I can't place the stack where I want; there is only one place in the small RAM where it makes sense. The C compiler and the linker script are set up in such a way that the code expects DS=0xa70 and then uses the long offset 0x5d00 to reach the RAM at 0x400, but I can't place the stack in the same segment, because there is ROM there.

That's why I am asking how to instruct the compiler or linker to use more appropriate segments for my hardware. :-( Thanks

ladmanj commented 3 years ago

Hello @tkchia and @ghaerr,

Maybe I have a solution, if there isn't an additional catch waiting around the corner. The BASE_ADDRESS is 0xfa700; the code is physically located at address 0xa700, that is, 0x2700 bytes from the EPROM start. The free RAM for the C global and static variables is at physical address 0x400. The original DS and SS values are 0xff00, and at program start the SP is set to 0x1800, so the stack top is 0xff000+0x1800=0x100800, which wraps around to 0x800. If I set DS=SS=0xfa70 and add 0x4900 to SP (giving SP=0x6100), the stack top will be 0xfa700+0x6100=0x100800, thus 0x800 again. The RAM variables are accessed in the C routines by offset 0x5d00, and 0xfa700+0x5d00=0x100400, thus 0x400. I think I ate a cake and still have it, or as we Czechs say, the wolf is full and the goat remains whole. Thank you!
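
The arithmetic in this comment can be checked mechanically. The sketch below is an illustration only (phys20 and same_location are made-up names). It relies on the 8088 having 20 address lines, so sums above 0xfffff wrap, and it takes the new SP to be 0x1800 + 0x4900 = 0x6100.

```c
#include <assert.h>
#include <stdint.h>

/* 20-bit physical address for seg:off, with wrap-around at 1 MiB.  */
uint32_t phys20 (uint16_t seg, uint16_t off)
{
  return (((uint32_t) seg << 4) + off) & 0xfffffu;
}

/* Nonzero if the two segment:offset pairs name the same physical byte.  */
int same_location (uint16_t seg_a, uint16_t off_a,
                   uint16_t seg_b, uint16_t off_b)
{
  return phys20 (seg_a, off_a) == phys20 (seg_b, off_b);
}
```

Here same_location (0xff00, 0x1800, 0xfa70, 0x6100) holds, both pairs giving 0x00800, and phys20 (0xfa70, 0x5d00) is 0x00400. Note that 0xfa70:0x4900 by itself is 0xff000, so it is the adjusted SP value 0x6100, not the increment 0x4900, that reaches the old stack top.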

ladmanj commented 3 years ago

Still not good :-( const char data which is placed at section .rodata is unreachable for one function at least.

ladmanj commented 3 years ago

Hello @tkchia, Is there some documentation, what can be modified in the linkerscript for this particular platform (ia16)? Thank you!

tkchia commented 3 years ago

Hello @ladmanj,

Unfortunately I don't understand the abort code mentioned. I don't know if it's used when going to new code or returning to the old code.

Which particular parts of the code do you have trouble understanding?

I believe you might also occasionally need to deal with using multiple stacks when programming microcontrollers with flat address spaces (I also did a bit of μC programming back in the day).

Suppose in a μC routine, you want to switch your stack pointer (let us call it sp) to a stack at a known address, say my_stack, within some μC code, but you would like to be able to switch back to the previous sp later. (And suppose, for whatever reason, you cannot simply switch between register banks to get different sp's.) Obviously you cannot just write something like

  ;; this is wrong
  mov sp, #my_stack  ; switch to new stack
  push sp            ; push old stack pointer

because by the time you reach the push sp, the previous sp value would have been lost forever. So what can you do to switch to a different stack and store the original stack pointer on the new stack?

On x86-16, the main difference is that you need to take care of two registers ss and sp, not just one sp.
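
One possible answer, sketched as a small C simulation rather than real assembly: copy the old ss and sp into scratch locations before loading the new pair, and only then push the copies onto the new stack. Everything here (cpu16, mem, push16, pop16, switch_stack, restore_stack) is a made-up model for illustration; it is not the abort () code from newlib-ia16, and it ignores interrupt masking.

```c
#include <assert.h>
#include <stdint.h>

struct cpu16 { uint16_t ss, sp; };

static uint8_t mem[1u << 20];              /* 1 MiB of simulated memory */

static uint32_t phys (uint16_t seg, uint16_t off)
{
  return (((uint32_t) seg << 4) + off) & 0xfffffu;
}

void push16 (struct cpu16 *c, uint16_t v)
{
  c->sp = (uint16_t) (c->sp - 2);          /* stack grows downwards */
  mem[phys (c->ss, c->sp)] = (uint8_t) (v & 0xff);
  mem[phys (c->ss, (uint16_t) (c->sp + 1))] = (uint8_t) (v >> 8);
}

uint16_t pop16 (struct cpu16 *c)
{
  uint16_t lo = mem[phys (c->ss, c->sp)];
  uint16_t hi = mem[phys (c->ss, (uint16_t) (c->sp + 1))];
  c->sp = (uint16_t) (c->sp + 2);
  return (uint16_t) (lo | (hi << 8));
}

/* Switch to new_ss:new_sp, remembering the old pair on the new stack.  */
void switch_stack (struct cpu16 *c, uint16_t new_ss, uint16_t new_sp)
{
  uint16_t old_ss = c->ss, old_sp = c->sp; /* save copies FIRST */
  c->ss = new_ss;
  c->sp = new_sp;
  push16 (c, old_ss);
  push16 (c, old_sp);
}

/* Undo switch_stack: pop the saved pair and load both registers.  */
void restore_stack (struct cpu16 *c)
{
  uint16_t old_sp = pop16 (c);
  uint16_t old_ss = pop16 (c);
  c->ss = old_ss;
  c->sp = old_sp;
}
```

In real 8086 code the scratch copies would live in registers that have already been saved, and ss and sp would be loaded by two consecutive mov instructions.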

Thank you!

ladmanj commented 3 years ago

Hello @tkchia,

  ;; this is wrong
  mov sp, #my_stack  ; switch to new stack
  push sp            ; push old stack pointer

because by the time you reach the push sp, the previous sp value would have been lost forever. So what can you do to switch to a different stack and store the original stack pointer on the new stack?

On x86-16, the main difference is that you need to take care of two registers ss and sp, not just one sp.

Oh, now I see what the message was. I understand: I can't switch SP and SS in arbitrary order and still rely on the data that SS*0x10+SP points to.

That's why I basically don't want to switch the DS, SS, ES, SP registers at all and I'm looking for a way, how to tell the linker to accommodate to the original values, which I can pass to linkerscript or anywhere else.

The registers are saved for a different reason (not in general, but for my particular project): the registers are pushed onto the stack in order to copy them to "user space" and pass them via the serial line to a PC, to examine and alter them in an intelligible way.

There is no practical reason to switch the memory context, because there is so small amount of RAM and it's occupied by the old analysed program.

I would be happy if i can tell the compiler and/or linker, to use the original DS=SS=0xff00, the ES is probably never used in the original code so maybe it can be placed anywhere, but I'm far from being sure.

Thank you!

tkchia commented 3 years ago

Hello @ladmanj,

I'm looking for a way, how to tell the linker to accommodate to the original values, which I can pass to linkerscript or anywhere else.

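You might be able to get away with making sure that (most) data items --- including read-only data --- can be accessed from the 0xff00 segment. There are a few ways; you can mix them:

* place read-only data on your ROM in the address range 0xff000---0xfffff;

* arrange to copy read-only data from your ROM to RAM, before running your code;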

I will talk about the linker script a bit later.

Thank you!

tkchia commented 3 years ago

Hello @ladmanj,

Is there some documentation, what can be modified in the linkerscript for this particular platform (ia16)?

The main problem in writing a linker script for IA-16 is that you need to worry about segments. Right now the toolchain supports two schemes for dealing with IA-16 segments, which I describe in a write-up.

In the (for now) default scheme, in the linker script you basically define each IA-16 segment with LMA = its absolute memory location, and VMA = its offset within its segment (i.e. normally 0, unless it is part of another IA-16 segment). So you might write e.g.

…
SECTIONS
{
  …
  .data 0 : AT (0xff000)
  {
    *(.rodata .rodata.* .gnu.linkonce.r.*) …
    *(.data .data.* .gnu.linkonce.d.*) …
  }
  …
} …
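
Reading the description above literally, the segment register value that goes with such an output section is its load address (LMA) minus its starting offset (VMA), divided by 16. The C sketch below only illustrates that relationship; section_segment and symbol_phys are made-up names, not part of the toolchain.

```c
#include <assert.h>
#include <stdint.h>

/* Segment value for a section loaded at absolute address `lma` whose
   first byte has in-segment offset `vma_base` (hypothetical helper).  */
uint16_t section_segment (uint32_t lma, uint16_t vma_base)
{
  assert (lma >= vma_base);
  assert (((lma - vma_base) & 0xfu) == 0);   /* segments start on 16-byte paragraphs */
  return (uint16_t) ((lma - vma_base) >> 4);
}

/* Physical address of a symbol whose link-time value (VMA) is `offset`.  */
uint32_t symbol_phys (uint16_t segment, uint16_t offset)
{
  return (((uint32_t) segment << 4) + offset) & 0xfffffu;
}
```

For the fragment above (.data at VMA 0, LMA 0xff000) this gives segment 0xff00, so a symbol at offset 0x0010 would sit at physical address 0xff010.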

Again, as I said, writing ROM code with gcc-ia16 is probably best considered an advanced topic.

Thank you!

tkchia commented 3 years ago

You might be able to get away with making sure that (most) data items --- including read-only data --- can be accessed from the 0xff00 segment.

Note that, if there is even the slightest chance that your code might get called with ss != 0xff00, then you cannot use this trick, and you must allocate and use a separate stack. Take note.

Thank you!

ladmanj commented 3 years ago

Hello @tkchia,

* place read-only data on your ROM in the address range `0xff000`---`0xfffff`;

* arrange to copy read-only data from your ROM to RAM, before running your code;

Am I right when I think that the RAM is repeated 0x0-0x1fff, 0x10000-0x11fff, ... 0xf0000-f1fff? And ROM at 0x8000-0xffff, 0x18000-0x1ffff, ... 0xf8000-0xfffff? I think I am. The RAM is physically connected to 8088's A[13:0] and CS=A15 and ROM is at A[14:0] and CS=not(A15) and A[19:16] are unconnected.

The upper 4 bits of the physical address is completely ignored, so only the least significant byte of segment register is effective. Isn't it?

I think i can reach any physically connected memory location with any content of segment register, with appropriate offset. Am I right?

In any case, I can't remap the ROM or the RAM to other addresses, because the original code that I depend on needs this layout. The original code knows how to deal with the rest of the hardware and I don't. I may want to use pieces of it from my new code later. For now I'm counting on the peripherals having been initialized by the original code; I can't initialize them myself, as I have no complete schematic of the PCB. I know only the minimum: where the serial ports are and where the memories are.

Thank you

ghaerr commented 3 years ago

Hello @tkchia,

Given that @ladmanj's issue seems to be that, if possible, he would just like to execute a C routine at interrupt time, without changing any segment registers, the following came to mind:

Over at ELKS when the fast serial driver was being written, a similar issue arose; in this case we needed a very fast way of reading serial data, but without the overhead of ELKS' stack-switching code. What ended up being written was ASM code that set the DS register to kernel DS, and then the C interrupt handler called with no arguments. With careful coding, the compiler never emitted indexed instructions using BP, and the routine worked.

What I am thinking now, is essentially the reverse - a compiler option and/or keywords that could be used to allow @ladmanj's C routine to run, completely off the interrupted stack SS and SP, and without any instructions requiring DS. In this scenario, DS would not get set prior to the C routine, and the generated code would use SS: overrides whenever BX, SI or DI indexed modes were emitted. There would be no access to global variables.

Would the compiler options you added in January work for this? Something like:

__attribute__((no_assume_ss_data)) void dbg_int_handler(struct dbg_interrupt_state __seg_ss *istate) { ...}

The above options will instruct the compiler to emit SS: overrides on all BX, SI and DI indexed references, correct?

The ability for the compiler to produce code for a procedure that could run with SS != DS, but without access to globals, would be a nice option to have. If global variable access were needed, a data segment could be accessed in ASM code and passed as a parameter (and __far pointers used later), while the rest of the routine would run off the current stack.

Thank you!

tkchia commented 3 years ago

Hello @ghaerr,

Would the compiler options you added in January work for this? Something like:

__attribute__((no_assume_ss_data)) void dbg_int_handler(struct dbg_interrupt_state __seg_ss *istate) { ...}

The above options will instruct the compiler to emit SS: overrides on all BX, SI and DI indexed references, correct?

Yes, that will work. The downside is that all of dbg_int_handler(.)'s callees will also need to be no_assume_ss_data, and if any globals are accessed (even if they are just read-only strings!), then bad things happen.

On second thought, actually it is perfectly OK to set %ds to the correct data segment, without setting %ss. Then

(Side note: in fact a few months back I sent a pull request (https://github.com/FDOS/kernel/pull/20) to the FreeDOS kernel project to ask to use this new feature in FreeDOS. It is still pending though.)

Thank you!

tkchia commented 3 years ago

Hello @ladmanj,

Am I right when I think that the RAM is repeated 0x0-0x1fff, 0x10000-0x11fff, ... 0xf0000-f1fff? And ROM at 0x8000-0xffff, 0x18000-0x1ffff, ... 0xf8000-0xfffff?

No, no, no! I am not sure how your specific machine is wired up. But at least on a standard IBM PC, all 20 bits of the physical address are used. The IBM PC-compatible memory layout is something like this:

Address          Use      Remarks
0x00000-0x9ffff  RAM      Conventional memory (including interrupt vectors at address 0x0)
0xa0000-0xaffff  Video    EGA/VGA video memory
0xb0000-0xb7fff  Video    Monochrome (MDA) video memory
0xb8000-0xbffff  Video    CGA video memory
0xc0000-0xeffff  ROM/RAM  Expansion ROMs / expanded memory
0xf0000-0xfffff  ROM      System ROM BIOS

So on an IBM PC-compatible machine, there is at least a 1-MiB-large address space you can access through real mode, of which the first 640 KiB or so are for conventional memory.

You can think of a segment value as providing a 64 KiB "window" to the entire 1 MiB real mode address space. There are 4 such "windows" — cs, ds, es, and ss — and by sliding each window, your program can access a 64 KiB "view" of this entire 1 MiB.


Seriously, it seems to me you are trying to port code from, say, platform A to platform B, when you are still not very clear on what the original A code does (or is supposed to do), or how platform B really works, and you are insisting on flying blind instead of using a debugger to check your understanding of things. I am not sure this is a good approach.

There are x86-16 debuggers and emulators which allow you to single-step through code — your own code and others' code — to see exactly what they do, and to see how the x86-16 architecture works. I strongly advise you to make full use of them.

Thank you!

ghaerr commented 3 years ago

Hello @tkchia,

Out of curiosity, what does the __seg_ss qualifier do, and why is it also needed, when __attribute__((no_assume_ss_data)) is used?

Are there any cases where __seg_ss would be useful without no_assume_ss_data?

Thank you!

tkchia commented 3 years ago

Hello @ghaerr,

Out of curiosity, what does the __seg_ss qualifier do, and why is it also needed, when __attribute__((no_assume_ss_data)) is used?

If a function argument happens to be a pointer, then it means that the thing it points to is also to be addressed through %ss. Without __seg_ss, it means only the pointer variable is %ss-based, while the pointed-to thing is still considered to be on the data segment.

(This works the same way as other type qualifiers, e.g. const char * vs. char * const vs. const char * const.)
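
The const analogy can be shown with plain standard C. The sketch below deliberately does not use __seg_ss, so that it compiles with any C compiler; sum_pair and bump are made-up names. The point is only that a qualifier written before the * describes the pointed-to object, while one written after the * describes the pointer variable itself.

```c
#include <assert.h>

/* Pointer to const int: the pointee is read-only through p,
   but p itself may be changed to point elsewhere.  */
int sum_pair (const int *p)
{
  int first = *p;
  p++;                 /* allowed: the pointer variable is not const */
  return first + *p;
}

/* Const pointer to int: p cannot be re-aimed,
   but the object it points to may be modified.  */
void bump (int *const p)
{
  *p += 1;             /* allowed: the pointee is not const */
}
```

By the same reading, in struct dbg_interrupt_state __seg_ss *istate the qualifier sits before the * and so describes the structure being pointed to.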

Are there any cases where __seg_ss would be useful without no_assume_ss_data?

Not really.

Thank you!

ghaerr commented 3 years ago

@ladmanj,

Am I right when I think that the RAM is repeated 0x0-0x1fff, 0x10000-0x11fff, ... 0xf0000-f1fff? And ROM at 0x8000-0xffff, 0x18000-0x1ffff, ... 0xf8000-0xfffff? I think I am. The RAM is physically connected to 8088's A[13:0] and CS=A15 and ROM is at A[14:0] and CS=not(A15) and A[19:16] are unconnected.

If what you are saying is that the upper 4 (of 20) address lines are not connected, and the lower 14 address lines are going to a RAM chip, and the lower 15 lines going to ROM, and A15 being used to select between ROM and RAM, then yes, the RAM and ROM will appear repeated (differently), based on the physical address modulo 16384 and 32768 respectively, although slightly differently than your hex numbers above.

The upper 4 bits of the physical address is completely ignored, so only the least significant byte of segment register is effective. Isn't it?

If you mean the upper 4 bits of a 20-bit address, then yes.

I think i can reach any physically connected memory location with any content of segment register, with appropriate offset. Am I right?

It appears that since your hardware only uses a maximum of 16 address bits to access either ROM or RAM, then yes, any segment value could technically be used to access any ROM or RAM location with an appropriately devised offset.
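
The decoding just described can be modelled as follows. This is a sketch built purely on the wiring as reported in this thread (A[19:16] unconnected, A15 selecting ROM when set and RAM when clear, ROM decoded on A[14:0], RAM on A[13:0]); that wiring is an unverified assumption, and the names chip, decoded and decode_board_address are made up.

```c
#include <assert.h>
#include <stdint.h>

enum chip { CHIP_RAM, CHIP_ROM };

struct decoded
{
  enum chip chip;          /* which device responds */
  uint16_t chip_offset;    /* address seen by that device */
};

/* Map a segment:offset pair to the device and in-device address,
   under the assumed wiring described above.  */
struct decoded decode_board_address (uint16_t seg, uint16_t off)
{
  uint32_t bus = (((uint32_t) seg << 4) + off) & 0xfffffu;
  struct decoded d;
  if (bus & 0x8000u)
    {
      d.chip = CHIP_ROM;
      d.chip_offset = (uint16_t) (bus & 0x7fffu);   /* A[14:0] */
    }
  else
    {
      d.chip = CHIP_RAM;
      d.chip_offset = (uint16_t) (bus & 0x3fffu);   /* A[13:0] */
    }
  return d;
}
```

Under this model 0x0000:0x0400, 0x0a70:0x5d00 and 0xfa70:0x5d00 all select RAM offset 0x0400, and 0xf000:0xa700 selects ROM offset 0x2700, which is consistent with the numbers quoted earlier in the thread.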

HOWEVER - none of this matters, in my opinion. We should not be trying to guess segment register offsets that magically might work. Instead, leave the segment registers alone, and code the C interrupt handler so that it works with the interrupted segment register values (that is, no change to them).

It appears the real problem you're trying to solve is to have the system operate completely normally, while rewriting the INT 3 vector to allow you to breakpoint various routines. Your new INT 3 vector will save all the registers, then call a handler that will essentially interact with a user to allow register contents to be displayed out the serial port. Right?

All that is needed is for your C interrupt routine to function properly, which it has not. The reason has nothing to do with guessing segment register offset values, etc.; it is that the emitted code assumes the SS segment equals the DS segment. With the __attribute__((no_assume_ss_data)) confirmed by @tkchia, this can be made to simply work:

So, after this long discussion, we have determined that there is a way to make this work: ALL of your C routines need to be declared __attribute__((no_assume_ss_data)), and pointer arguments qualified with __seg_ss. This includes both dbg_int_handler and dbg_io_write, as well as any other C routines called.

Do that and your problems will be solved.

P.S.: Remember to change the "pop ss" in your interrupt ASM to "pop ax" (to discard SS stack switch) for good measure.

Thank you!

ladmanj commented 3 years ago

Hello @tkchia,

No, no, no! I am not sure how your specific machine is wired up. But at least on a standard IBM PC, all 20 bits of the physical address are used. The IBM PC-compatible memory layout is something like this:

Address          Use      Remarks
0x00000-0x9ffff  RAM      Conventional memory (including interrupt vectors at address 0x0)
0xa0000-0xaffff  Video    EGA/VGA video memory
0xb0000-0xb7fff  Video    Monochrome (MDA) video memory
0xb8000-0xbffff  Video    CGA video memory
0xc0000-0xeffff  ROM/RAM  Expansion ROMs / expanded memory
0xf0000-0xfffff  ROM      System ROM BIOS

My platform has nothing to do with a PC of any kind!!! The only common part is that an i8088 is used as the CPU. No normal PC memory arrangement! No normal PC peripherals! No normal PC BIOS! (And that's good, I don't like old PCs :-D )

Thanks

ladmanj commented 3 years ago

Hello @ghaerr,

So, after this long discussion, we have determined that there is a way to make this work: ALL of your C routines need to be declared __attribute__((no_assume_ss_data)), and pointer arguments qualified with __seg_ss. This includes both dbg_int_handler and dbg_io_write, as well as any other C routines called.

Do that and your problems will be solved.

P.S.: Remember to change the "pop ss" in your interrupt ASM to "pop ax" (to discard SS stack switch) for good measure.

Great! I'm looking forward to try it.

Thanks to both of you.

tkchia commented 3 years ago

Hello @ladmanj ,

My platform has nothing to do with a PC of any kind!!! The only common part is that an i8088 is used as the CPU.

Oh, OK. Apologies for my misunderstanding.

I would still advise you to enlist the help of a debugger or emulator though, even if the (emulated) PC environment is not an exact match to your hardware's. Debuggers are your friends.

Thanks, and good luck!

ladmanj commented 3 years ago

Hi @ghaerr , @tkchia

I don't know if the universe ends in a huge explosion if I write to a closed topic, but I will try it.

First, I'm not very happy about having to add the attribute to every single function; I didn't find any gcc command-line option to make it the default, but OK. I am looking for something non-standard and I'm getting something non-standard.

But there is a problem: when I add the attributes and modifiers, I get this error message:

dbstub.c:68:77: error: expected ‘;’, ‘,’ or ‘)’ before ‘*’ token
 __attribute__((no_assume_ss_data)) int dbg_send_signal_packet(char __seg_ss *buf, size_t buf_len, char signal);
                                                                             ^

When only the attribute is present and not the modifier of the pointer:

gdbstub.c: In function ‘dbg_send_packet’:
gdbstub.c:314:19: error: passing argument 1 of ‘dbg_enc_hex’ from pointer to non-enclosed address space
  if ((dbg_enc_hex(buf+1, sizeof(buf)-1, &csum, 1) == EOF) ||
                   ^~~

In practice I'm unable to compile the project with these recommended options.

Please help.

Thanks J.

tkchia commented 3 years ago

Hello @ladmanj,

dbstub.c:68:77: error: expected ‘;’, ‘,’ or ‘)’ before ‘*’ token
 __attribute__((no_assume_ss_data)) int dbg_send_signal_packet(char __seg_ss *buf, size_t buf_len, char signal);

Are you compiling in C++ mode (ia16-elf-g++) or using the host compiler (gcc)?

Thank you!

ladmanj commented 3 years ago

The important parts of the Makefile

CC           = /home/ladmanj/DataDisk/src/build-ia16-master/prefix/bin/ia16-elf-gcc
CFLAGS       = -Werror -ansi -Os -ffunction-sections -fno-stack-protector -I$(ARCH) -I$(PWD
CFLAGS  += -m32
LDFLAGS += -m elf_i386

%.o: %.c
    $(CC) $(CFLAGS) -o $@ -c $<

tkchia commented 3 years ago

Hello @ladmanj,

OK, I see what might be the problem. Try removing the -ansi flag from CFLAGS. Thanks!

ladmanj commented 3 years ago

It helped a little bit, but there are plenty of other bugs.

For example:

gdbstub.c:630:25: error: passing argument 1 of ‘dbg_send_packet’ from pointer to non-enclosed address space
  return dbg_send_packet("OK", 2);
                         ^~~~

It disappears when I remove the __seg_ss from the pointer argument of:

__attribute__((no_assume_ss_data)) int dbg_send_packet(const char __seg_ss *pkt, size_t pkt_len);

but then a different error pops up elsewhere.

I think I'm giving it up. This is no fun any more :-(

Thanks for everything

ghaerr commented 3 years ago

Hello @ladmanj,

It disappears when I remove the __seg_ss from the pointer argument of:

The reason is that you've declared the argument to be a stack-only addressable pointer, but then you're passing "OK", which is data-segment only addressable.

I think I'm giving it up. This is no fun any more :-(

Well, you were warned that the 8086 segment architecture is harder than it appears. One must always be aware of which segment one is addressing through, SS or DS. Normal C compilers require SS == DS, which means there are no such worries. The option you are working with is very nice to have, but it requires careful thinking.
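
As a rough illustration of the "OK" case above: one possible workaround is to copy the string literal into a local buffer first, so that the pointer being passed really does point into the stack. This is only a sketch under assumptions, not tested against gcc-ia16. `USE_IA16_SS`, `SEG_SS`, `NO_ASSUME_SS_DATA`, `send_packet_ss` and `send_ok` are hypothetical names, and the macros expand to nothing unless `USE_IA16_SS` is defined, so the code also builds with an ordinary host compiler.

```c
#include <stddef.h>

/* Hypothetical wrappers: define USE_IA16_SS when building with ia16-elf-gcc
   (without -ansi) to get the real qualifiers; otherwise they expand to nothing. */
#ifdef USE_IA16_SS
#define SEG_SS __seg_ss
#define NO_ASSUME_SS_DATA __attribute__((no_assume_ss_data))
#else
#define SEG_SS
#define NO_ASSUME_SS_DATA
#endif

/* Stand-in for dbg_send_packet: it only sums the bytes it is given. */
NO_ASSUME_SS_DATA unsigned send_packet_ss(const char SEG_SS *pkt, size_t pkt_len)
{
    unsigned sum = 0;
    size_t i;

    for (i = 0; i < pkt_len; i++)
        sum += (unsigned char)pkt[i];
    return sum;
}

NO_ASSUME_SS_DATA unsigned send_ok(void)
{
    const char *src = "OK";   /* the literal lives in the data segment */
    char buf[2];              /* the buffer lives on the stack */
    size_t i;

    /* Copy byte by byte, so the callee only ever sees a stack pointer. */
    for (i = 0; i < sizeof buf; i++)
        buf[i] = src[i];
    return send_packet_ss(buf, sizeof buf);
}
```

The copy costs a few bytes of stack per call, but it keeps every pointer argument in a single segment, which is the property the `__seg_ss` qualifier is meant to express.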

My advice now is the same as @tkchia's: take it a little slower, try some simpler routines before jumping in with both feet! Get something smaller working, very small, then add to it slowly.

Thank you!

ladmanj commented 3 years ago

Hello @ghaerr,

Thank you very much.

I now have a new idea: turn it upside down. Let ia16-gcc do what it wants to do, and backport the DS=ES=SS value it counts on to the original code, by patching a relatively small number of places in the binary.

Again, this would be impossible in the general case, but here it is perhaps possible thanks to the modulo 2^16 memory wrap.

Thanks

ladmanj commented 3 years ago

Hello @tkchia, @ghaerr,

Until now I have only asked how to set the segment registers differently from ia16-gcc's defaults, but now I need to know exactly what the defaults are.

My linker script core is

MEMORY
{
    RAM (!RX) : ORIGIN = DATA_ADDRESS, LENGTH = DATA_LENGTH
    ROM (RX) : ORIGIN = BASE_ADDRESS, LENGTH = BASE_LENGTH
}

SECTIONS
{
    .text : {
        *(.text.dbg_start)
        *(.text)
        *(.rodata)
    } > ROM

    .data : {
        *(.data)
        *(.bss)
        _load_end_addr = .;
        _bss_end_addr = .;
    } > RAM

What contents are expected in which segment registers (relative to the symbolic addresses above)?

I have a feeling that all segment registers (CS, DS, ES, SS) are expected to be BASE_ADDRESS/16.

Thank you

ladmanj commented 3 years ago

Hello @tkchia, @ghaerr,

Maybe a slightly better question is: what should a good crt0.s look like for running ia16-gcc compiled code on bare metal (with no operating system)?

Thank you very much

lpsantil commented 3 years ago

crt0.s is intended to take over from the OS binary executable loader and set up everything (command line args for main(), the stack, the heap, etc.) needed to start a C program. If there is no OS, then it is up to you.

DOS ".COM" binaries assume a starting offset of 0x100 within the segment (CS:0x100). DOS ".EXE" binaries (potentially different from Windows "*.EXE") can use various memory models, which affect the startup.

I'd suggest reading thru Dunfield's EMBEDPC package and his other tools, OSDEV, Intel's x86 ISA & Memory Models, and other OS Tutorials.

ladmanj commented 3 years ago

Hello @lpsantil,

Misunderstanding alert!

I see that the terminology isn't as common as I was expecting :-(

I'm also using the sdcc compiler for the Z80 CPU, and there is a standard crt0.s in its library, which is linked into the project by default and in some cases is replaced by the user's version.

The standard one counts on a specific startup scenario: reset and run from address 0. There is an unconditional jump to 0x100 (to skip the interrupt vectors), then a routine that initializes the initialized C variables, and then the C main function is called. When main returns, there is an endless loop to protect the machine from undefined behavior.

The 8088 I have here starts at 0xffff0, but apart from that, the procedure may be similar. I'm asking for such a piece of "recommended" code.
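
The data-initialization part of the startup sequence described above can be sketched in C as follows. This is a minimal sketch, not code from sdcc or gcc-ia16: `init_ram` and its parameter names are hypothetical, and clearing the uninitialized variables (.bss) is an extra step that startup code commonly does at the same stage. On a real target the five addresses would come from linker-script symbols; here they are plain parameters so the routine can be tried on a host machine.

```c
#include <stddef.h>

/* Prepare RAM before main() runs:
   1. copy the initial values of initialized variables from their load
      address (typically in ROM) to their run-time address in RAM;
   2. zero the uninitialized variables. */
void init_ram(char *data_start, char *data_end, const char *data_load,
              char *bss_start, char *bss_end)
{
    size_t n;

    /* Step 1: copy the .data image byte by byte. */
    for (n = (size_t)(data_end - data_start); n != 0; n--)
        *data_start++ = *data_load++;

    /* Step 2: clear .bss. */
    for (n = (size_t)(bss_end - bss_start); n != 0; n--)
        *bss_start++ = 0;
}

/* On the target, the startup code would then call main() and, should main()
   ever return, spin in an endless loop. */
```

Plain loops are used instead of memcpy and memset so that the routine does not depend on any library code at this stage.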

Thank you

tkchia commented 3 years ago

Hello @ladmanj,

The 8088 I have here starts at 0xffff0, but apart from that, the procedure may be similar. I'm asking for such a piece of "recommended" code.

I am not sure if the sort of thing you are specifically looking for exists --- i.e.

From what I understand, this is partly for historical reasons. Back in the day, shortly after IBM introduced the IBM PC, pretty much every computer manufacturer using Intel x86 chips decided it wanted to be IBM PC compatible. (A 1984 issue of Byte magazine complained about this. Yes.)

The result, from what I see, is that

Thank you!

tkchia commented 3 years ago

Hello @ladmanj,

Meanwhile you might want to check out Jamie Iles's IBM-compatible BIOS implementation, which he had authored as part of his S80186 hardware project.

This is not exactly the same as what you are looking for, but it should (hopefully) be simpler to understand than most other x86 code meant for the ROM.

Thank you!

ladmanj commented 3 years ago

Hello @tkchia, I will definitely look at the recommended pages, but in the meantime I have tuned the system a little. Can you review it if all problems are solved from your point of view? I can call the C functions and successfully return to the original ROM code. The C code can successfully parse the copy of the register data on the stack.

The called C code is compiled with default options; no function attributes or pointer modifiers are used.

I haven't made any attempt to initialize the variables that the compiled C code expects to be initialized. How can I figure out what is needed to copy wherefrom whereto?

Thank you very much

dbg_int_handler_common:

    PUSH    AX
    PUSH    CX
    PUSH    DX
    PUSH    BX
    PUSH    SP ; The value stored is the initial SP value
    PUSH    BP
    PUSH    SI
    PUSH    DI
    push    ds
    push    es
    push    ss

                    ;ss:sp points to 0xff000+0x17d4 = 0x7d4
    mov bp,sp 
    mov bx,ss   ;0xff00 old data segment
    mov ax,cs   ;0xfa70 new data segment
    sub bx,ax
    mov cl,4
    shl bx,cl   ;bx = 0x4900 needed to wrap to the same physical stack location in RAM 
    add bp,bx   ;+0x4900
    push    bx
    push    bp      ;must be done before changing SS or SP
    mov ds,ax 
    mov es,ax
    mov ss,ax
    add sp,bx   ;+0x4900
                    ;ss:sp points to 0xfa700+0x60d4-2-2 = 0x7d0
    call    dbg_int_handler

    pop bp
    pop bx
    sub bp,bx
    mov sp, bp
    mov cl,4
    shr bx,cl
    mov ax,ss
    add ax,bx
    mov ss,ax
                    ;ss:sp points to 0xff000+0x17d4 = 0x7d4 again
    pop ss
    pop es
    pop ds
    POP DI
    POP SI
    POP BP
    POP AX ; no POP SP here, all it does is ADD SP, 2 (since AX will be overwritten later)
    POP BX
    POP DX
    POP CX
    POP AX
    add sp,2    ; throw the vector IP
    iret            ; return where it was interrupted
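
The address arithmetic in the comments above can be checked with a small C sketch. It assumes the usual 8086 real-mode rule (physical address = segment * 16 + offset), wrapped with an address mask. `phys_addr`, `rebase_delta` and `ADDR_MASK_20` are hypothetical names. The 20-bit mask is an assumption, but a 16-bit mask gives the same results for these particular values.

```c
#include <stdint.h>

#define ADDR_MASK_20 0xFFFFFUL   /* assumption: 20 address lines are decoded */

/* Physical address for seg:off, wrapped to the width given by mask. */
uint32_t phys_addr(uint16_t seg, uint16_t off, uint32_t mask)
{
    return (uint32_t)((((uint32_t)seg << 4) + off) & mask);
}

/* Amount to add to an offset when the segment register changes from old_seg
   to new_seg, so that seg:off still names the same physical byte.
   This is the "sub bx,ax / shl bx,cl" step in the handler. */
uint16_t rebase_delta(uint16_t old_seg, uint16_t new_seg)
{
    return (uint16_t)((uint16_t)(old_seg - new_seg) << 4);
}
```

With the values from the comments, `rebase_delta(0xff00, 0xfa70)` gives 0x4900, and both `phys_addr(0xff00, 0x17d4, ADDR_MASK_20)` and `phys_addr(0xfa70, 0x60d4, ADDR_MASK_20)` give 0x7d4, so the stack stays in the same physical location after the segment switch.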

tkchia commented 3 years ago

Hello @ladmanj,

Can you review it if all problems are solved from your point of view?

Well, your code did the job it was supposed to do (it seems). Now the code can be thrown away. What else are you looking for?

How can I figure out what is needed to copy wherefrom whereto?

Take a look at Iles's BIOS source code.

Thank you!

ladmanj commented 3 years ago

Hello @tkchia,

I strive to follow the BIOS example, but I'm not really succeeding.

a) I have translated the AT&T syntax to this:

extern      bss_start,bss_end,bss_start,data_start,data_end,_data_load
; Clear bss
    mov     di,bss_start
    mov     cx,bss_end
    sub     cx,bss_start
    mov     al,0
    rep     stosb

; Initialize rw data
    mov     di,data_start
    mov     cx,data_end
    sub     cx,data_start
    mov     si,_data_load
    rep     movsb

b) I have modified my linker script to define the above symbols:

    .text : {
        *(.text.ENTRYPOINT)

        . = ALIGN(4);
        *(.text)
        . = ALIGN(4);
        _data_load = . ;
        *(.rodata)
    } > ROM

    .data :  {
        data_start = . ;
        *(.data)
        data_end = . ;
        . = ALIGN(4);
        bss_start = . ;
        *(.bss)
        bss_end = . ;
    } > RAM

(There are too many rules for writing a linker script to grasp, so I'm definitely not sure it's OK.)

But this leads to this error:

arch_8086/gdbstub_int.nasm:(.text+0x4b): relocation truncated to fit: R_386_16 against symbol `_data_load' defined in .text section in gdbstub.elf

However, if I prepare a fake linker script where .rodata is also in RAM, I can see that the initialized .data has zero length, so I comment out the 'Initialize rw data' asm code, and then it can be built.

But even so, this piece of C code is not working properly:

const char digits[] = "0123456789abcdef";

char dbg_get_digit(int val)
{
    if ((val >= 0) && (val <= 0xf)) {
      return digits[val];
    } else {
        return EOF;
    }
}

I have an alternative function that does the same without depending on the const char string, and that works, but I want to know what's wrong, because this is not the only function that's defective.
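
For illustration, a table-free variant of the kind mentioned above might look like the sketch below. It is not the actual code from the project: `dbg_get_digit_arith` is a hypothetical name, and the arithmetic assumes an ASCII-compatible character set.

```c
#include <stdio.h>   /* EOF */

/* Convert 0..15 to a lowercase hex digit without reading any table in memory. */
char dbg_get_digit_arith(int val)
{
    if (val >= 0 && val <= 9)
        return (char)('0' + val);
    if (val >= 10 && val <= 0xf)
        return (char)('a' + (val - 10));
    return (char)EOF;   /* out of range, same convention as dbg_get_digit */
}
```

Because this version reads nothing from .rodata, it keeps working even if constant data is not reachable through the data segment at run time, so comparing the two functions can help narrow down whether the fault lies in how the digits array is addressed.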

Please help if you can. I know I'm annoying.

Thank you!

tkchia commented 3 years ago

Hello @ladmanj,

I have already previously written to you about a good way to write linker scripts. You may want to read that.

I also previously advised you to use debuggers and emulators. You might want to not ignore this advice.

relocation truncated to fit: R_386_16 means that the code is trying to stuff an address that is beyond 16 bits into a 16-bit operand. So figure out why it is trying to do that.
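
To make the 16-bit limit concrete, here is a small C sketch. The names `fits_u16` and `segment_offset` and the example address 0xfa74b are hypothetical, and the exact range the linker accepts may differ slightly from this plain unsigned check. Whether this is what happens in your build is for you to confirm from the link map.

```c
#include <stdint.h>

/* Does a value fit into an unsigned 16-bit operand? */
int fits_u16(uint32_t value)
{
    return value <= 0xFFFFUL;
}

/* Offset of a linear address relative to the start of a segment.
   The result is only meaningful if linear >= seg * 16. */
uint32_t segment_offset(uint32_t linear, uint16_t seg)
{
    return linear - ((uint32_t)seg << 4);
}
```

For instance, a symbol placed at the linear address 0xfa74b does not pass `fits_u16`, while its offset from a segment starting at 0xfa70 (that is, 0x4b) does. A symbol whose value is an address above 0xffff therefore cannot be used directly as a 16-bit operand.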

tkchia commented 3 years ago

Hello @ladmanj,

More generally, please take ownership of your own project. I can help you fix specific problems, but I cannot help you understand the background knowledge needed to make it work. Thank you.