mikaku / Fiwix

A UNIX-like kernel for the i386 architecture
https://www.fiwix.org

Support building with tcc compiler #63

Closed rick-masters closed 7 months ago

rick-masters commented 7 months ago

The live-bootstrap project must compile Fiwix with tcc because gcc is not available until much later. Note that the tcc used to build Fiwix must be patched to handle the physical/virtual address scheme used by Fiwix.
With gcc, the address scheme is handled by a linker script but tcc does not support linker scripts.

In the forthcoming PR, documentation is provided in docs/tcc.txt which explains where to get tcc, how to patch it, and how to build Fiwix.

The following is an explanation of the various changes to support tcc. Some of these changes are significant and so I am open to discussing better alternatives.

Makefile:

drivers/block/ata.c:

fiwix.ld:

include/fiwix/config.h:

kernel/boot.S:

kernel/init.c:

lib/printk.c:

mm/memory.c:

mikaku commented 7 months ago

Wow! I'm a bit scared by the amount of changes in the core. This will require a lot of testing.

Some questions:

  1. I see you removed the kernel stack lines in the linker script, but I don't see where you defined the new kernel stack location. I expected to see it in boot.S but it's not there.
  2. In asm.h, why do you need to move the arguments of USER_SYSCALL in the order %eax, %ecx, %edx and %ebx, instead of using the natural order? Is this related to the way tcc works?
  3. In _start (in boot.S) I want to disable interrupts (cli) right from the beginning, but it seems to me that this won't happen for tcc compilations. I think that cli should be placed right above the line #ifdef __TINYC__.
  4. You say SAVE_ALL - preserve ebx but I don't see any change affecting the %ebx register in the SAVE_ALL macro.
  5. I'm surprised by the change in do_switch. If %ebx has been clobbered until now, shouldn't this have caused a more visible malfunction?
  6. Where is memmove() used? (Also, functions of this kind are defined in lib/strings.c.)

Changes:

  1. I'd replace ((unsigned int)_end & 0xFFFFF000) with ((unsigned int)_end & PAGE_MASK) in all the new lines, just for clarity.

rick-masters commented 7 months ago

Wow! I'm a bit scared by the amount of changes in the core. This will require a lot of testing.

Thankfully, we're in the final stretch. The only change remaining after this one is kexec for Linux!

Some questions:

  1. I see you removed the kernel stack lines in the linker script, but I don't see where you defined the new kernel stack location. I expected to see it in boot.S but it's not there.

The kernel stack is left where it was set at the start of the kernel. https://github.com/rick-masters/Fiwix/blob/40ef6e832a394a48a89d23377fc7db0889201678/kernel/boot.S#L115

I didn't see a reason to relocate the stack.

  2. In asm.h, why do you need to move the arguments of USER_SYSCALL in the order %eax, %ecx, %edx and %ebx, instead of using the natural order? Is this related to the way tcc works?

Yes, tcc has a strange behavior of moving arguments into registers automatically, and in a strange order.

Consider the following syscall:

        USER_SYSCALL(SYS_open, "/dev/console", O_RDWR, 0);      /* stdin */

Here is how tcc compiles this:

   a:   b8 05 00 00 00          mov    $0x5,%eax
   f:   b9 02 00 00 00          mov    $0x2,%ecx
  14:   ba 00 00 00 00          mov    $0x0,%edx
  19:   bb 00 00 00 00          mov    $0x0,%ebx
  1e:   89 c0                   mov    %eax,%eax
  20:   89 c9                   mov    %ecx,%ecx
  22:   89 d2                   mov    %edx,%edx
  24:   89 db                   mov    %ebx,%ebx
  26:   cd 80                   int    $0x80

(Note that the pointer to "/dev/console" appears as a zero in the instruction mov $0x0,%ebx. The zero is replaced later, during the linking stage.)

So, %0 is eax, %1 is ecx, %2 is edx, and %3 is ebx. To match gcc and to be explicit I then move the %0, %1, %2, %3 arguments into the appropriate named registers. However, I can see how that does not make it easier to understand. I think a comment might be more appropriate than inserting code which is essentially redundant. The comment can explain that tcc moves the arguments into registers automatically and what order it uses.

I've changed the macro to the following:

#ifdef __TINYC__
/* tcc loads "r" (register) arguments automatically into registers using this order:
 * eax, ecx, edx, ebx
 * Therefore, we rearrange the arguments so they go into the correct registers.
 */
#define USER_SYSCALL(num, arg1, arg2, arg3)    \
        __asm__ __volatile__(                   \
                "int    $0x80\n\t"              \
                : /* no output */               \
                : "r"((unsigned int)num), "r"((unsigned int)arg2), "r"((unsigned int)arg3), "r"((unsigned int)arg1)     \
        );
#else
#define USER_SYSCALL(num, arg1, arg2, arg3)     \
        __asm__ __volatile__(                   \
                "movl   %0, %%eax\n\t"          \
                "movl   %1, %%ebx\n\t"          \
                "movl   %2, %%ecx\n\t"          \
                "movl   %3, %%edx\n\t"          \
                "int    $0x80\n\t"              \
                : /* no output */               \
                : "eax"((unsigned int)num), "ebx"((unsigned int)arg1), "ecx"((unsigned int)arg2), "edx"((unsigned int)arg3)     \
        );
#endif

  3. In _start (in boot.S) I want to disable interrupts (cli) right from the beginning, but it seems to me that this won't happen for tcc compilations. I think that cli should be placed right above the line #ifdef __TINYC__.

I have made this change.

  4. You say SAVE_ALL - preserve ebx but I don't see any change affecting the %ebx register in the SAVE_ALL macro.

Sorry, this was a mistake. The change in SAVE_ALL was replacing pushal/popal with pusha/popa.

  5. I'm surprised by the change in do_switch. If %ebx has been clobbered until now, shouldn't this have caused a more visible malfunction?

I was surprised by this as well. I had to look at the assembly to understand why tcc was having a problem but gcc was not. It turns out that gcc was either not using ebx or using it in a way that avoided problems but tcc uses ebx more often.

  6. Where is memmove() used? (Also, functions of this kind are defined in lib/strings.c.)

tcc uses memmove to copy structures. Wherever a structure is assigned to another structure tcc inserts a call to memmove. Normally, memmove is included in the executable by linking with libtcc. However, we compile with -nostdlib -nostdinc which excludes libtcc.

I have moved memmove to lib/strings.c
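
As an illustration of what such a routine has to do, here is a minimal freestanding sketch; it is not the code from the PR. The name sketch_memmove is hypothetical (a kernel built with -nostdlib would export it as memmove so that compiler-generated calls resolve), and copy_point only shows the kind of structure assignment for which tcc may emit the call.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; a kernel would call this memmove. */
void *sketch_memmove(void *dest, const void *src, size_t n)
{
	unsigned char *d = dest;
	const unsigned char *s = src;

	if ((uintptr_t)d < (uintptr_t)s) {
		/* dest starts below src: a forward copy never overwrites unread bytes */
		while (n--)
			*d++ = *s++;
	} else if ((uintptr_t)d > (uintptr_t)s) {
		/* dest starts above src: copy backwards so overlapping bytes are read first */
		d += n;
		s += n;
		while (n--)
			*--d = *--s;
	}
	/* d == s: nothing to copy */
	return dest;
}

struct point {
	int x;
	int y;
};

/* A plain structure assignment, the kind of statement discussed above. */
void copy_point(struct point *dst, const struct point *src)
{
	*dst = *src;
}
```

The backward branch is what distinguishes memmove from a naive byte loop: it keeps the result correct when the two ranges overlap.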

Changes:

  1. I'd replace ((unsigned int)_end & 0xFFFFF000) with ((unsigned int)_end & PAGE_MASK) in all the new lines, just for clarity.

I have made this change.
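
The two spellings are equivalent as long as PAGE_MASK is defined in the usual way. A minimal sketch, assuming 4 KiB pages and a PAGE_MASK of ~(PAGE_SIZE - 1); both definitions are written out here as assumptions so the example stands alone, and page_floor is a hypothetical helper:

```c
#include <stdint.h>

#define PAGE_SIZE 4096u                /* assumed: 4 KiB i386 pages */
#define PAGE_MASK (~(PAGE_SIZE - 1u))  /* assumed definition; 0xFFFFF000 when unsigned int is 32 bits */

/* Round an address down to the start of the page that contains it. */
uint32_t page_floor(uint32_t addr)
{
	return addr & PAGE_MASK;
}
```

Using the named constant keeps the page size in one place instead of repeating the literal at every call site.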

mikaku commented 7 months ago

Thankfully, we're in the final stretch. The only change remaining after this one is kexec for linux!

Nice, it has been a long road. You made a lot of changes to fit Fiwix into the live-bootstrap project. Congratulations.

The kernel stack is left where it was set at the start of the kernel. https://github.com/rick-masters/Fiwix/blob/40ef6e832a394a48a89d23377fc7db0889201678/kernel/boot.S#L115

I didn't see a reason to relocate the stack.

Ah yes, I missed that, sorry. Now I'm wondering: wouldn't it be better to point the kernel stack at 0x10000-4 instead of 0x10000? I mean, just to make sure that it will reside in the page between 0xF000 and 0xFFFF. What do you think?

Yes, tcc has a strange behavior of moving arguments into registers automatically, and in a strange order. I think a comment might be more appropriate than inserting code which is essentially redundant.

Yes, the comment will clear up things. Thanks.

I was surprised by this as well. I had to look at the assembly to understand why tcc was having a problem but gcc was not. It turns out that gcc was either not using ebx or using it in a way that avoided problems but tcc uses ebx more often.

In a way, this change is good and cleans up the code.

rick-masters commented 7 months ago

Ah yes, I missed that, sorry. Now I'm wondering: wouldn't it be better to point the kernel stack at 0x10000-4 instead of 0x10000? I mean, just to make sure that it will reside in the page between 0xF000 and 0xFFFF. What do you think?

This shouldn't be necessary but there is no harm in doing so.

You can see from this pseudocode that the stack register is decremented before the value is stored in memory: https://c9x.me/x86/html/file_module_x86_id_269.html

So, the first push will already go into 0xFFFC.
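
The same point can be checked with a toy model of a 32-bit push. This is only a sketch: struct toy_cpu and toy_push are hypothetical names, and the model covers nothing beyond the decrement-then-store order described above.

```c
#include <stdint.h>

/* Toy machine: 64 KiB of memory and a stack pointer. */
struct toy_cpu {
	uint32_t esp;
	uint8_t mem[0x10000];
};

/* Model of a 32-bit PUSH: ESP is decremented by 4 first, then the value is stored. */
void toy_push(struct toy_cpu *cpu, uint32_t value)
{
	cpu->esp -= 4;
	/* little-endian store at the new ESP */
	for (int i = 0; i < 4; i++)
		cpu->mem[cpu->esp + i] = (uint8_t)(value >> (8 * i));
}
```

Starting with esp = 0x10000, the first toy_push leaves esp at 0xFFFC and writes bytes 0xFFFC through 0xFFFF, so nothing at or above 0x10000 is touched.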

mikaku commented 7 months ago

You can see from this pseudocode that the stack register is decremented before the value is stored in memory:

Indeed, this explains why it is not necessary.

mikaku commented 7 months ago

Thank you very much.

mikaku commented 7 months ago

I'm reviewing the code and I think that when Fiwix is built by the tcc compiler, it will show something like this:

[...]
             (built on Sat Jan  6 17:10:24 UTC 2024 with GCC tcc)
[...]

This is because the following line: https://github.com/mikaku/Fiwix/blob/00974ef61ad7004d608f9ff6d252db3417f4c1fc/kernel/main.c#L82

is the same for each compiler; only the constant __VERSION__ changes.

I think that a better approach would be:

  1. Remove the following line in the Makefile: https://github.com/mikaku/Fiwix/blob/00974ef61ad7004d608f9ff6d252db3417f4c1fc/Makefile#L30
  2. Add an #ifdef in main.c and replace the printk line with this block:
    #ifdef __TINYC__
        printk("             (built on %s with tcc)\n", UTS_VERSION);
    #else
        printk("             (built on %s with GCC %s)\n", UTS_VERSION, __VERSION__);
    #endif 

Thoughts?

With this change we don't need to regenerate a new 1.5.0-lb1 version; it will just appear in the next 1.5.0-lb2 version.
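
For trying the idea outside the kernel, here is a small hosted sketch of the same compile-time selection. The helper format_build_banner is hypothetical and uses snprintf in place of printk; the non-tcc branch assumes a compiler that defines __VERSION__, as GCC does.

```c
#include <stdio.h>

/* Hypothetical helper: writes the build banner into buf and returns snprintf's result. */
int format_build_banner(char *buf, size_t len, const char *uts_version)
{
#ifdef __TINYC__
	/* tcc: name the compiler directly, without a GCC prefix */
	return snprintf(buf, len, "(built on %s with tcc)", uts_version);
#else
	/* assumed: the compiler provides __VERSION__ (GCC does) */
	return snprintf(buf, len, "(built on %s with GCC %s)", uts_version, __VERSION__);
#endif
}
```

Because the branch is chosen by the preprocessor, only one of the two format strings ends up in the binary.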

rick-masters commented 7 months ago

Yes, your suggested change looks better to me.

mikaku commented 7 months ago

I've checked that this change makes no difference when compiling Fiwix with GCC. Please check that everything is also correct when using tcc.

rick-masters commented 7 months ago

It works fine with tcc.

mikaku commented 7 months ago

Perfect! Thank you very much.