Looking around... - Githubissues

Hello @mfld-fr,

Sys86 is pretty cool, are you planning on writing a complete OS for your 8086 SoC from scratch?

When I last looked at your code several days ago, I noticed the interrupt handling routine that switches stacks when a per-stack interrupt counter is 1. Keeping a per-stack interrupt counter rather than a system global is an interesting idea, more on that later. But what I did notice is that the stack switching code (in particular SS, not SP) may not really be doing a full stack switch. This is because each task stack resides within sys86's overall data segment and thus SS is identical for all the tasks. Am I seeing this the wrong way? When the tasks themselves reside in completely different data/stack segments, the register saving and task switching code may become more complicated.

With regards to the per-task interrupt count, I fail to see how that guarantees the incoming interrupt is from user mode. Doesn't it just guarantee that you know whether the current task has an empty kernel stack or not?

Another comment: since you're checking a per-task interrupt count, and possibly initializing (switching to) its kernel stack (or not, depending on task_level), why push all the registers on the previous stack? Why not delay until after the check, when in both cases you would guarantee that the registers are saved on the current task's stack, which is always easily addressable by SS/DS? [EDIT: SS only, I see that DS is always changed to the kernel data segment, and SS may not equal DS if the task stack segment is not allocated from kernel data.]

Hello @ghaerr, thank you for having a look here. I will split my response into several parts, as you are asking very good questions to which I often don't have a satisfactory answer yet.

First of all, I do not intend to write a complete OS for my SBC. For any embedded application, I would rather use a recent and widely supported SBC / SoC. For general applications, a standard PC. Choice of the OS would be more complex because of so many parameters.

I use that SBC (https://www.advantech.com/products/1-2jkp2y/snmp-1000-b/mod_b01abe8d-c8aa-4530-8e3c-9fcd2623d9d6) as an evaluation board, and one goal of this project is to provide a minimal development system that allows experimenting with IA16 code at a low cost, in a real and (as far as possible) controlled environment, thanks to simple HW. The companion project is EMU86, for the emulated environment.

My first experiment is to see how far one can go in interrupt & task management within the IA16 capabilities (including the C language for kernel code), in order to solve some issues related to fast (and / or buffer limited) communication devices, like the serial port or the Ethernet adapter, while keeping a generic code for all the devices.

Now the technical discussion itself:

Using the INT 80h instruction to enter the kernel ('system call') is actually a synchronous call; being interrupted is actually an asynchronous call. Hereafter, 'system call' and 'true interrupt' are used to distinguish the two cases.
The level member of the task structure is not a 'true interrupt counter', but rather a counter that tells whether the task is entering, or is already in, kernel mode (= using the kernel stack), when either truly interrupted or calling the system. The true interrupt counter is the int_count global variable. The latter can stay global as long as task switching is forbidden in nested true interrupts.

_For the moment, the difference between the two is not obvious, as there is no code yet in int-proc to increment int_count only on a real interrupt (as the system call is not yet implemented)._
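To make the distinction between the two counters concrete, here is a minimal C sketch, not the actual sys86 code. The names `level` and `int_count` come from the discussion above; `kernel_enter`, `kernel_exit` and the `true_interrupt` flag are hypothetical helpers introduced only for illustration, and the real entry code is in assembly.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two counters; the field layout is assumed. */
struct task {
    int level;  /* 0 = user mode, >0 = already running on the kernel stack */
};

static int int_count;  /* global count of nested true interrupts */

/* Models a kernel entry (system call or true interrupt).
   Returns true when the entry code would have to switch
   from the user stack to the task's kernel stack. */
static bool kernel_enter(struct task *t, bool true_interrupt)
{
    bool switch_stack = (t->level == 0);
    t->level++;
    if (true_interrupt)
        int_count++;  /* only true interrupts count here */
    return switch_stack;
}

/* Models the matching kernel exit.
   Returns true when the exit code would switch back to the user stack. */
static bool kernel_exit(struct task *t, bool true_interrupt)
{
    assert(t->level > 0);
    if (true_interrupt) {
        assert(int_count > 0);
        int_count--;
    }
    t->level--;
    return t->level == 0;
}
```

In this model a system call from user mode followed by a true interrupt gives `level == 2` but `int_count == 1`, which is the difference the two counters are meant to capture.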

Task switching can occur not only on system call exit when returning to user mode, or when calling task_switch in kernel mode (both as in ELKS), but also on non-nested true interrupt exit, when returning to either user mode or to kernel mode.

For now, the test code starts tasks in kernel mode only. The next step will be to start some more tasks in user mode.

Because the true interrupt handler can cause one kernel-mode task to be suspended and another user-mode task to be resumed (and vice versa), a full context save & restore is needed, including SS and DS, as user and kernel data segments are different. Therefore, task_switch also has to save the full caller context.
Moreover, the interrupt stack has to be empty before restoring the continued context, otherwise it would grow indefinitely. This is the main reason why the task switch is only performed on interrupt exit.
To allow calling task_sched in C code, and decouple the scheduling policy from the interrupt handling, the task switching is delayed with the task_prev & task_next variables until the system call / true interrupt exit.
To avoid rescheduling on every task wake up in nested interrupts, it is delayed with the sched_need variable, and disabled with the sched_lock variable, until the end of the first interrupt handling.
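The deferral described above can be sketched in C as follows. This is only a model under assumptions: `task_prev`, `task_next`, `sched_need`, `sched_lock` and `task_sched` are the names used in this post, while `current`, `ready`, `task_wake`, `int_enter`, `int_exit` and `switch_count` are hypothetical, the run queue is reduced to a single slot, and `sched_lock` is treated as a nesting counter.

```c
#include <assert.h>
#include <stddef.h>

struct task { int id; };

static struct task *current;    /* running task (hypothetical name) */
static struct task *ready;      /* one-slot "run queue" for this sketch */
static struct task *task_prev;  /* task to suspend at exit, or NULL */
static struct task *task_next;  /* task to resume at exit, or NULL */
static int sched_lock;          /* >0: scheduling policy is deferred */
static int sched_need;          /* a wake-up asked for a reschedule */
static int switch_count;        /* bookkeeping for this sketch only */

/* Policy only: records the decision, performs no context switch. */
static void task_sched(void)
{
    if (ready != NULL && ready != current) {
        task_prev = current;
        task_next = ready;
    }
    ready = NULL;
}

/* A wake-up inside an interrupt only flags the need to reschedule. */
static void task_wake(struct task *t)
{
    ready = t;
    if (sched_lock > 0)
        sched_need = 1;
    else
        task_sched();
}

static void int_enter(void) { sched_lock++; }

/* Only the outermost exit runs the policy and then the actual switch. */
static void int_exit(void)
{
    assert(sched_lock > 0);
    if (--sched_lock > 0)
        return;                 /* still nested: keep deferring */
    if (sched_need) {
        sched_need = 0;
        task_sched();
    }
    if (task_next != NULL) {
        /* the real code would save task_prev and restore task_next here */
        current = task_next;
        task_prev = task_next = NULL;
        switch_count++;
    }
}
```

With two nested `int_enter()` calls and one `task_wake()` in the inner handler, the switch happens exactly once, on the second `int_exit()`.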

This scheduler lock, if atomic, could also be used to protect some non reentrant kernel code...

Instead of saving the registers to the caller stack and having to use a far pointer to get their values on system call, the current implementation can be modified to save them in the task descriptor, like in ELKS, to use a near pointer. But it would cost more than a simple INT80h / PUSHA / PUSH DS / PUSH ES, and as noted in https://github.com/jbruchon/elks/issues/729#issuecomment-682411162, it would consume more kernel data space.

Thanks for the long writeup. I have many thoughts about this all, and being one of my favorite subjects, we have a lot to talk about. I also will put my response into multiple posts, since they are lengthy.

I was going to start by getting into each of these issues, and what I think about their design versus, say ELKS/Linux/UNIX which I'm deeply into right now. But thinking about what you've written about, I suddenly realized something, and want to take a step back to a higher level first:

> Task switching can occur not only on system call exit when returning to user mode, or when calling task_switch in kernel mode (both as in ELKS), but also on non-nested true interrupt exit, when returning to either user mode or to kernel mode.

This is the essence of what you're trying to achieve, right? I initially thought you were looking for the ability to allow a task switch at any time, that is, very soon after any interrupt. If, instead, you only want a task switch after a non-nested interrupt, that is much easier. I guess I was thinking about SMP-like systems, where any code can be executed at any time (almost).

I would like to ask the bigger question of "why"? Why do you need a task switch after a non-nested true interrupt exit? What, exactly, will that enable one to do, which isn't possible otherwise? This is important, as form follows function and I still have many points to make about the design chosen in sys86. Is it just to switch to mon86 on a trap interrupt?

No design change in ELKS is required, in fact only a few lines of code are required, to implement a task switch after a non-nested interrupt. All of the technical rewrite of sys86 is not actually required. Only a few lines change in ELKS irqtab.S and it's done (of course, the kernel will promptly crash, because none of it is protected from reentrancy). Here's the code executed at the very end of every interrupt in ELKS:

//
//      Restore intr_count
//
        decw    intr_count
//
//      Now look at rescheduling
//
        cmpw    $1,_gint_count
        jne     restore_regs    // No
//      cmp     $0,_need_resched // Schedule needed ?
//      je      restore_regs    // No
//
// This path will return directly to user space
//
        sti                     // Enable interrupts to help fast devices
        call    schedule        // Task switch
        call    do_signal       // Check signals
        cli
//
//      Restore registers and return
//

Instead of comparing _gint_count to 1, just compare intr_count to 0. That will force the call to reschedule at the end of a non-nested interrupt, regardless of whether the system was in user or kernel mode.

Back to the sys86 discussion.

> Using the INT 80h instruction to enter the kernel ('system call') is actually a synchronous call; being interrupted is actually an asynchronous call. Hereafter, 'system call' and 'true interrupt' are used to distinguish the two cases.

Yes, nice distinction, thinking of INT 80 (or any software interrupt) as synchronous versus being interrupted as asynchronous. However, as I will point out, it does not pay to think of them much differently from the point of view of saving registers and changing contexts. There are very real advantages and efficiencies gained when treating them exactly the same way on entry (and exit).

In order to explain this, let's consider again what the point of a system call and/or an interrupt could be. The point of the system call structure is to provide services, more specifically to provide services to a context. What is that context? In this case, it is a set of registers and a stack (otherwise known as a process). Thus, a system call provides the requested service to the current context. What about an interrupt? Under what context does it provide services? Well, sometimes in ELKS, and always in sys86, the interrupt "rides on top of" the interrupted context. It doesn't have a usable context like a system call does, and thus its ability to provide services is quite limited.

As I previously pointed out, the 8086 doesn't actually have user or kernel modes, so there's no hardware reason a user to kernel stack change is required, as is required in protected modes. But as you pointed out, the real reason for switching to a kernel stack is addressability - it is quite convenient to have SS==DS for the kernel to be written as a somewhat normal C program. So - having a partial context (of saved registers) on a user stack, but the kernel call stack and data segment on another SS/DS, isn't a good idea. Far better to keep the full context on the kernel stack, especially when it comes time for task switching. As will be seen, if a true interrupt can completely resemble a software interrupt, task switches can (possibly) be made extremely easily, and an interrupt "process/task" could also receive services as though it were an application "process/task", because it looks almost exactly the same to the kernel.

Now consider a hw interrupt: if the full context were saved on a kernel stack, we find ourselves in the same situation as with a system call/software interrupt - that is, the kernel has full context of the situation and could, in fact, execute the interrupt on behalf of the interrupted context, rather than "no" context. I hope you understand what I'm trying to say. In fact, this is why ELKS can switch tasks after a non-nested hardware interrupt (like the clock interrupt) - because the stack is arranged identically as though the application made a system call instead.

Now that we've seen some real benefits in treating software/system calls and hw interrupts identically, let's consider the case of the nested interrupt: as you know, both ELKS and sys86 (? I haven't checked lately) use a separate interrupt stack. The big problem with this is just like the problem with the current sys86 design - all of a sudden, instead of having interrupts and system calls being identical, the kernel finds itself running on a stack which isn't capable of providing the interrupted context any services; thus, the kernel must let every return from an interrupt stack proceed without a task switch.

> Because the true interrupt handler can cause one kernel-mode task to be suspended and another user-mode task to be resumed (and vice versa), a full context save & restore is needed, including SS and DS, as user and kernel data segments are different. Therefore, task_switch also has to save the full caller context.

Actually, if the full context is saved as described above, the task switcher doesn't have to worry about saving any more than the stack pointer, and not the segment, since all kernel stacks have the same stack segment.
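As a rough illustration of that point, here is a small C model, not ELKS or sys86 code. It assumes every task has its own kernel stack inside the same segment, so the only per-task switch state is a saved stack pointer. `KSTACK_WORDS`, `ksp`, `push`, `pop` and this `task_switch` signature are all hypothetical, and the variable `sp` merely stands in for the CPU's SP register.

```c
#include <assert.h>
#include <stdint.h>

#define KSTACK_WORDS 64

struct task {
    uint16_t kstack[KSTACK_WORDS]; /* per-task kernel stack, same segment for all */
    uint16_t ksp;                  /* saved stack pointer: the only switch state */
};

static struct task *current;
static uint16_t sp;  /* models SP as an index into current->kstack */

/* The stack grows downward, as on the 8086. */
static void push(uint16_t v)
{
    assert(sp > 0);
    current->kstack[--sp] = v;
}

static uint16_t pop(void)
{
    assert(sp < KSTACK_WORDS);
    return current->kstack[sp++];
}

/* The full context is assumed to be already pushed on the current
   kernel stack, so switching only saves and reloads SP. SS is never
   touched because all kernel stacks share one stack segment. */
static void task_switch(struct task *next)
{
    current->ksp = sp;
    current = next;
    sp = next->ksp;
}
```

Whatever a task pushed before being switched out is found again, untouched, when it is switched back in, without saving any segment register.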

> Moreover, the interrupt stack has to be empty before restoring the continued context, otherwise it would grow indefinitely. This is the main reason why the task switch is only performed on interrupt exit.

Actually, I'm not sure about the growing indefinitely. Yes - if using a global interrupt stack. But the real reason one can't task switch in ELKS is precisely because the saved context isn't in the right place (i.e. the task's kernel stack). But if the interrupt stack were thrown out, and instead another (separate) task-style stack were used, just like in the case of the first hw interrupt, but using a separate task-stack per interrupt number, then I think it is possible for "interrupt tasks" to gain a full context on their own, and effectively be able to be task-switched out themselves. In this scenario, there wouldn't be indefinite stack growth because the system is essentially switching to a "task stack" for each interrupt.

So all of this design really depends on what capabilities are really needed, of course. It seems the current restriction of sys86 not being able to task-switch until after a non-nested interrupt doesn't really buy much that ELKS doesn't have, except requiring lots more code and protection for a now-reentrant kernel. But if the idea of a "process" (a context and ability to effect system calls) were extended to a hardware interrupt (a task process and the same ability to effect system calls for itself), the system might be able to do more of what it seems you're looking for.

> Instead of saving the registers to the caller stack and having to use a far pointer to get their values on system call, the current implementation can be modified to save them in the task descriptor, like in ELKS, to use a near pointer.

No pointer is needed at all if the registers are saved on the kernel stack, as they're always at the same stack offset (of the per-task stack). See what I mean now?
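Here is a small C sketch of the "same stack offset" idea, under assumptions. The register order mirrors what INT / PUSHA / PUSH DS / PUSH ES would leave in memory (lowest address first), but `struct regs`, `struct kstack`, the sizes, and the use of AX as the system call number are hypothetical choices made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KSTACK_WORDS 128
#define REGS_WORDS   13

/* Saved frame, lowest address first: ES and DS pushed last,
   then the PUSHA block, then the IP/CS/FLAGS pushed by INT. */
struct regs {
    uint16_t es, ds;
    uint16_t di, si, bp, sp, bx, dx, cx, ax;
    uint16_t ip, cs, flags;
};

_Static_assert(sizeof(struct regs) == REGS_WORDS * sizeof(uint16_t),
               "struct regs must not contain padding");

/* Per-task kernel stack: the saved registers always sit at the top,
   so their offset inside the stack is a compile-time constant. */
struct kstack {
    uint16_t free_words[KSTACK_WORDS - REGS_WORDS];
    struct regs regs;
};

struct task {
    struct kstack ks;
};

/* Assumption for this sketch: the system call number is passed in AX. */
static uint16_t syscall_number(const struct task *t)
{
    return t->ks.regs.ax;
}
```

Because `offsetof(struct kstack, regs)` is the same for every task, the kernel can reach the saved registers of the current task with a fixed displacement from the top of its kernel stack, with no far pointer and no extra copy in the task descriptor.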

> But it would cost more than a simple INT80h / PUSHA / PUSH DS / PUSH ES, and as noted in jbruchon/elks#729 (comment), it would consume more kernel data space.

Yes, of course. But I think the amount of kernel data space used here is not important at all. Far more important is what is trying to be achieved.

Will benchmark the 'push context on kernel stack only' design option in #2.

> I use that SBC (https://www.advantech.com/products/1-2jkp2y/snmp-1000-b/mod_b01abe8d-c8aa-4530-8e3c-9fcd2623d9d6) as an evaluation board

Interesting. Can that board be used with the CONFIG_ROMCODE option in ELKS? I see you've written your own basic flashing routine for sys86, is the flashing completely handled in the SBC by INT 0x61? I haven't looked at mon86 yet, I presume it loads sys86? How do you get that bootstrapped?

As on an old standard PC, the SBC has a socket with a removable 512K EEPROM. So I first disassembled the first 32K of that EEPROM to understand how the HW is managed, then I disabled the BIOS checksum and patched it with MON86, all with an EEPROM programmer.

MON86 is now used to upload and run SYS86 through the serial port, and to burn any change into the EEPROM through a flashing routine already available in the original BIOS through an INT 61h service.

ELKS with CONFIG_ROMxxx is not running on that SBC yet, but it does in EMU86, a tool I wrote to debug the 8086 code on my development PC before testing it on the SBC.

EMU86 has two configurations today, one to emulate the bare PC BIOS & HW to host ELKS, and another to emulate the SBC BIOS & HW to host SYS86. Reworking ELKS to move from the first to the second configuration has been on my mind since 2015.

Getting the SBC figured out sounds like both quite a bit of work and lots of fun!

I would like the ability to play with a ROM version of ELKS, but it sounds like I won't be able to bootstrap that SBC without an external EEPROM programmer. It is very nice how you have been able to get it to flash SYS86 through MON86, indeed.

As you may know, I have been deeply into ELKS setup.S lately, where I wrote a relocating loader for the ELKS image when a .fartext section is present. I've been pretty careful trying to keep the ROM routines unaffected, but haven't been able to test. In addition, there will be some problems should a ROM .fartext version be wanted - I'm not sure yet whether a ROM version can be generated that doesn't require text relocations or not, even at known load addresses, so a ROM kernel may also have to be relocated at boot with the new relocating loader as well. Frankly, I'd like to refactor both ROM and REL_SYS versions to be as identical as possible.

Thanks for mentioning EMU86. For some reason I was thinking it was a spin-off of the elksemu emulator, but looking at it briefly, I see it is indeed an 8086 emulator like tiny8086. You may not remember, but when I first ran into ELKS a few years ago, I mentioned tiny8086 to you as an extremely interesting approach to emulation. Are you aware how its design using a modified full bios.asm is used to emulate most aspects of a PC, but redirect console I/O back to the host OS? Pretty cool. I was thinking EMU86 could possibly use that approach also, rather than (re)writing all sorts of PC stuff in C.

It would be very interesting to be able to run an 8086 emulator with debugging capabilities, able to host ELKS. That's because I can't easily run elksemu since I'm running macOS.

> Reworking ELKS to move from the first to the second configuration has been on my mind since 2015.

What do you mean, exactly? Being able to host ELKS on top of MON86, on your SBC? Or hosting ELKS within EMU86, with MON86 debugging capabilities?

Yes, before starting EMU86 (see https://github.com/jbruchon/dev86/pull/1), I spent some time on tiny8086, but it quickly appeared that it would be difficult to implement some debugging facilities in its 'compact' code (see the history of tiny8086 for the explanation why it is so 'compact').

To test ELKS in ROM mode, just use the config-emu86 configuration, and emu86.sh to load and run in EMU86. You should have an SASH prompt after ELKS boot. You can set up a code / data breakpoint, then execute the ELKS code step by step with the -c / -d options, display the registers, the stack, trigger a timer interrupt, etc.

As stated in the EMU86 README, the goal is NOT to rewrite a complete PC emulator, but to help understand the very minimal BIOS & HW features required by ELKS to run on the PC target, before porting it to the SBC target. This is the reason why it is part of the cross tools in ELKS.

The final objective is to burn the ELKS kernel and the root ROMFS into the SBC flash, then to boot directly into ELKS, without any more help from MON86.

mfld-fr / sys86

Looking around... #1