PiJoules commented 4 years ago

<Heads up>Sorry for the block of text and pictures. This has just been bugging me for a bit and I wanted to provide as much context as I could.</Heads up>

When playing around with 15-video-ports I noticed that the default kernel code would actually print 'X' in the middle of the screen and not the top left corner as I would expect. I thought this was weird and started playing around with that code and discovered more weird things regarding printing.

I couldn't print multiple characters sequentially nor change their colors. For example, if I remove offset_from_vga in the subscript and just had int literals, I could always print a character at the top left corner of the screen on QEMU (address 0xb8000), but I could never change the color nor print a character immediately after.

For reference, here's my attempt to print a 'B' and a green 'C' in the top-left corner (the 'B' is light magenta, 0x0D, in this snippet and light blue, 0x09, in the working version further down):

char *vga = 0xb8000;
vga[0] = 'B';
vga[1] = 0x0D;  // light magenta on black
vga[2] = 'C';
vga[3] = 0x0a;  // light green on black

[Screenshot "Error", 2019-12-28 at 15:52:21]

BUT I can get my intended results if I instead explicitly assign those values to the video memory addresses:

*(char *)(0xb8000) = 'B';
*(char *)(0xb8001) = 0x09;
*(char *)(0xb8002) = 'C';
*(char *)(0xb8003) = 0x0a;

[Screenshot "Expected", 2019-12-28 at 15:53:14]
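As a reference for the layout both snippets rely on, here is a minimal sketch of how a text-mode cell is addressed: each cell takes two bytes, the character code followed by the attribute byte. The names `vga_cell_offset` and `vga_put` are hypothetical, and the 80x25 geometry is an assumption about the default text mode, not something stated in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { VGA_COLS = 80, VGA_ROWS = 25 };  /* assumed: standard 80x25 text mode */

static_assert(VGA_COLS * VGA_ROWS * 2 == 4000, "one text page is 4000 bytes");

/* Byte offset of the cell at (row, col): each cell is 2 bytes wide. */
static size_t vga_cell_offset(int row, int col) {
    return 2 * ((size_t)row * VGA_COLS + (size_t)col);
}

/* Hypothetical helper: store one character and its attribute byte.
 * `base` is the start of the text buffer (0xb8000 on real hardware). */
static void vga_put(volatile uint8_t *base, int row, int col,
                    char ch, uint8_t attr) {
    size_t off = vga_cell_offset(row, col);
    base[off] = (uint8_t)ch;  /* even byte: character code */
    base[off + 1] = attr;     /* odd byte: color attribute */
}
```

In a kernel, `base` would be `(volatile uint8_t *)0xb8000`; passing the buffer as a parameter also lets the helper be exercised against an ordinary array on the host.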

I dug deeper into this and looked at the assembly to see if perhaps gcc somehow emitted some bad instructions, but as far as I can tell, the assembly looks clean.

Broken example:

    char *vga = 0xb8000;
  52:   48 c7 45 f0 00 80 0b    movq   $0xb8000,-0x10(%rbp)
  59:   00  
os-tutorial/15-video-ports/kernel/kernel.c:31
    // This doesn't work as intended
    vga[0] = 'B';
  5a:   48 8b 45 f0             mov    -0x10(%rbp),%rax
  5e:   c6 00 42                movb   $0x42,(%rax)          # B is successfully printed
os-tutorial/15-video-ports/kernel/kernel.c:32
    vga[1] = 0x0D;  // light magenta on black
  61:   48 8b 45 f0             mov    -0x10(%rbp),%rax
  65:   48 83 c0 01             add    $0x1,%rax             # This should be 0xb8001
  69:   c6 00 0d                movb   $0xd,(%rax)
os-tutorial/15-video-ports/kernel/kernel.c:33
    vga[2] = 'C';
  6c:   48 8b 45 f0             mov    -0x10(%rbp),%rax
  70:   48 83 c0 02             add    $0x2,%rax             # This should be 0xb8002
  74:   c6 00 43                movb   $0x43,(%rax)
os-tutorial/15-video-ports/kernel/kernel.c:34
    vga[3] = 0x0a;  // light green on black
  77:   48 8b 45 f0             mov    -0x10(%rbp),%rax
  7b:   48 83 c0 03             add    $0x3,%rax             # This should be 0xb8003
  7f:   c6 00 0a                movb   $0xa,(%rax)

Working example for reference:

    *(char *)(0xb8000) = 'B';
  59:   b8 00 80 0b 00          mov    $0xb8000,%eax
  5e:   c6 00 42                movb   $0x42,(%rax)
os-tutorial/15-video-ports/kernel/kernel.c:40
    *(char *)(0xb8001) = 0x09;
  61:   b8 01 80 0b 00          mov    $0xb8001,%eax
  66:   c6 00 09                movb   $0x9,(%rax)
os-tutorial/15-video-ports/kernel/kernel.c:41
    *(char *)(0xb8002) = 'C';
  69:   b8 02 80 0b 00          mov    $0xb8002,%eax
  6e:   c6 00 43                movb   $0x43,(%rax)
os-tutorial/15-video-ports/kernel/kernel.c:42
    *(char *)(0xb8003) = 0x0a;
  71:   b8 03 80 0b 00          mov    $0xb8003,%eax
  76:   c6 00 0a                movb   $0xa,(%rax)

Even more confused, I looked further with gdb (make gdb).

[Screenshot, 2019-12-28 at 16:17:06]

And this is the strange part:

[Screenshot, 2019-12-28 at 16:18:42]

SOMEHOW on the set of instructions when I attempt to access vga[1], rax decrements from 0xb8000 to 0xb7fff and I don't know why. The corresponding assembly for this is:

  61:   48 8b 45 f0             mov    -0x10(%rbp),%rax
  65:   48 83 c0 01             add    $0x1,%rax             # This should be 0xb8001
  69:   c6 00 0d                movb   $0xd,(%rax)

So, to actually ask my question now, have I been doing something wrong this whole time, or is this a bug in QEMU, or am I misunderstanding something and this is just "working as intended"?

Stuff about my environment/other stuff I tried

This may or may not be a big deal, but instead of cross compiling to i386, I'm instead targeting x86_64 so I didn't have to make an i386 gcc from scratch in an earlier step. I was able to get all other examples working with this setup until now. I also haven't tried making the i386 cross-compiler and rerunning this example, but even if that's the "intended way" of running this example, it would still be nice if someone could offer insight on why I'm running into this issue for my 64bit case.
I'm using GCC v8.3.0 for compiling.
QEMU version 4.1.0 which seems to be one of the newer versions, but I'm also able to reproduce this with QEMU 2.12.1.

Diff for reproducing

People who want to reproduce this should be able to just git apply this diff to the repo:

diff --git a/15-video-ports/Makefile b/15-video-ports/Makefile
index b611257..64be05d 100644
--- a/15-video-ports/Makefile
+++ b/15-video-ports/Makefile
@@ -4,11 +4,16 @@ HEADERS = $(wildcard kernel/*.h drivers/*.h)
 OBJ = ${C_SOURCES:.c=.o}

 # Change this if your cross-compiler is somewhere else
-CC = /usr/local/i386elfgcc/bin/i386-elf-gcc
-GDB = /usr/local/i386elfgcc/bin/i386-elf-gdb
+CC = gcc -march=x86-64
+GDB = gdb
+LD = ld -A x86-64
+QEMU = qemu-system-x86_64
+ELF = elf64
 # -g: Use debugging symbols in gcc
 CFLAGS = -g

+all: os-image.bin
+
 # First rule is run by default
 os-image.bin: boot/bootsect.bin kernel.bin
    cat $^ > os-image.bin
@@ -16,18 +21,18 @@ os-image.bin: boot/bootsect.bin kernel.bin
 # '--oformat binary' deletes all symbols as a collateral, so we don't need
 # to 'strip' them manually on this case
 kernel.bin: boot/kernel_entry.o ${OBJ}
-   i386-elf-ld -o $@ -Ttext 0x1000 $^ --oformat binary
+   ${LD} -o $@ -Ttext 0x1000 $^ --oformat binary

 # Used for debugging purposes
 kernel.elf: boot/kernel_entry.o ${OBJ}
-   i386-elf-ld -o $@ -Ttext 0x1000 $^ 
+   ${LD} -o $@ -Ttext 0x1000 $^

 run: os-image.bin
-   qemu-system-i386 -fda os-image.bin
+   ${QEMU} -fda os-image.bin

 # Open the connection to qemu and load our kernel-object file with symbols
 debug: os-image.bin kernel.elf
-   qemu-system-i386 -s -fda os-image.bin &
+   ${QEMU} -s -fda os-image.bin &
    ${GDB} -ex "target remote localhost:1234" -ex "symbol-file kernel.elf"

 # Generic rules for wildcards
@@ -36,7 +41,7 @@ debug: os-image.bin kernel.elf
    ${CC} ${CFLAGS} -ffreestanding -c $< -o $@

 %.o: %.asm
-   nasm $< -f elf -o $@
+   nasm $< -f ${ELF} -o $@

 %.bin: %.asm
    nasm $< -f bin -o $@
diff --git a/15-video-ports/kernel/kernel.c b/15-video-ports/kernel/kernel.c
index dcc2d9f..589285d 100644
--- a/15-video-ports/kernel/kernel.c
+++ b/15-video-ports/kernel/kernel.c
@@ -27,6 +27,15 @@ void main() {
     /* Let's write on the current cursor position, we already know how
      * to do that */
     char *vga = 0xb8000;
-    vga[offset_from_vga] = 'X'; 
-    vga[offset_from_vga+1] = 0x0f; /* White text on black background */
+    // This doesn't work as intended
+    vga[0] = 'B';
+    vga[1] = 0x0D;  // light magenta on black
+    vga[2] = 'C';
+    vga[3] = 0x0a;  // light green on black
+
+    // But this does
+    //*(char *)(0xb8000) = 'B';
+    //*(char *)(0xb8001) = 0x09;
+    //*(char *)(0xb8002) = 'C';
+    //*(char *)(0xb8003) = 0x0a;
 }

Menotdan commented 4 years ago

What does this do?

mov    -0x10(%rbp),%rax

Is rbp where the 0xb8000 gets stored? Try running something like x/10x $rbp-0x10 in gdb; that should show you the memory that instruction is reading.

PiJoules commented 4 years ago

-0x10(%rbp) is where 0xb8000 gets stored. The line char *vga = 0xb8000 gets compiled to:

    char *vga = 0xb8000;
  52:   48 c7 45 f0 00 80 0b    movq   $0xb8000,-0x10(%rbp)

before any of the accesses to vga. I was also able to confirm manually with gdb that -0x10(%rbp) holds 0xb8000 after that line is executed.

Menotdan commented 4 years ago

Can you try using pointers? Like what happens if you do

char *vga = 0xb8000;
*(vga + 1)  = // data

PiJoules commented 4 years ago

Using that method seems to produce the same exact assembly as with my initial broken example:

    // This doesn't work as intended
    *(vga + 0) = 'B';
  5a:   48 8b 45 f0             mov    -0x10(%rbp),%rax
  5e:   c6 00 42                movb   $0x42,(%rax)
/usr/local/google/home/leonardchan/projects/os-tutorial/15-video-ports/kernel/kernel.c:32
    *(vga + 1) = 0x0D;  // light magenta on black
  61:   48 8b 45 f0             mov    -0x10(%rbp),%rax
  65:   48 83 c0 01             add    $0x1,%rax
  69:   c6 00 0d                movb   $0xd,(%rax)
/usr/local/google/home/leonardchan/projects/os-tutorial/15-video-ports/kernel/kernel.c:33
    *(vga + 2) = 'C';
  6c:   48 8b 45 f0             mov    -0x10(%rbp),%rax
  70:   48 83 c0 02             add    $0x2,%rax
  74:   c6 00 43                movb   $0x43,(%rax)
/usr/local/google/home/leonardchan/projects/os-tutorial/15-video-ports/kernel/kernel.c:34
    *(vga + 3) = 0x0a;  // light green on black
  77:   48 8b 45 f0             mov    -0x10(%rbp),%rax
  7b:   48 83 c0 03             add    $0x3,%rax
  7f:   c6 00 0a                movb   $0xa,(%rax)

And gives the same results as before with rax being set to 0xb7fff.

Menotdan commented 4 years ago

Did you remove the .elf file etc. and rebuild? That's weird

Menotdan commented 4 years ago

And doing the pointers directly works fine?

PiJoules commented 4 years ago

> Did you remove the .elf file etc. and rebuild? That's weird

Yup, can still reproduce on a clean build.

> And doing the pointers directly works fine?

Also yes. The problem only seems to occur right before executing the add instructions in the vga[1]/*(vga + 1) cases, where rax becomes 0xb7fff.

Menotdan commented 4 years ago

Can I have a binary?

PiJoules commented 4 years ago

Yup. I attached the object file, elf file for gdb-debugging, and final iso. Thanks for helping also!

binaries.zip

Paolo309 commented 3 years ago

I'm having the exact same problem. @PiJoules, did you find out what is going on?

algorithmx51 commented 3 years ago

Have you tried writing to VGA memory as shorts? E.g.

short *VGAMEMORY = (short *) 0xb8000;
VGAMEMORY[0] = 0x0d42; // B
VGAMEMORY[1] = 0x0a43; // C
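The 16-bit constants in that suggestion combine the character and its attribute in one value. A small sketch of the packing, assuming a little-endian target such as x86 so that the low byte lands at the lower address; `vga_entry` is a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(uint16_t) == 2, "a VGA text cell is 16 bits");

/* Pack one cell: low byte = character code, high byte = attribute. */
static uint16_t vga_entry(char ch, uint8_t attr) {
    return (uint16_t)((uint16_t)(uint8_t)ch | ((uint16_t)attr << 8));
}
```

With this helper, `vga_entry('B', 0x0d)` yields the same 0x0d42 used above, which avoids assembling the constants by hand.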

Paolo309 commented 3 years ago

Actually I have even tried to do it in assembly. Using rdi to store the pointer works just fine

movq $0xb8000, %rdi
movw $0x0f61, (%rdi)
addq $0x02, %rdi
movw $0x0f62, (%rdi)

This prints a and b in white on black (0x0f). But if I simply replace rdi with rax, then, since rax gets decremented after the first write, the second word (0x0f62) gets written to 0xb8001 instead of 0xb8002, screwing up the formatting of the first char and not printing the second one. If I write it in C++, the compiler uses rax as the pointer rather than rdi and increments rax by two.

algorithmx51 commented 3 years ago

@Paolo309 in your example if you use any register (rax, rbx, rcx, etc.), you will get the same result (A and B on the screen). Also, try to push the register you will use so you don't cause problems.

Paolo309 commented 3 years ago

> if you use any register (rax, rbx, rcx, etc.), you will get the same result (A and B on the screen).

That's what I thought too. The code I've written above is just an extract of my function. The whole code (like the one generated by gcc) is the following

functiontest:
    pushq %rbp
    movq %rsp, %rbp
    subq $0x10, %rsp

    movq $0xb8000, -0x8(%rbp)

    movq -0x8(%rbp), %rax
    movw $0x0f61, (%rax)
    addq $0x2, %rax
    movw $0x0f62, (%rax)

    leaveq
    retq

so I think I'm doing everything safely, or am I not? I've just tried with rbx, rcx and rdx, and it works correctly like with rdi. I get problems only using rax. Since it gets decremented, if I do this

[...]
    movq -0x8(%rbp), %rax
    movw $0x0f61, (%rax)   # rax -= 1     ¯\(°_o)/¯
    addq $0x3, %rax        # rax += 3
    movw $0x0f62, (%rax)
[...]

it works correctly.

algorithmx51 commented 3 years ago

Because I cannot debug your ELF file (gdb doesn't work, I don't know why), could you please break the execution at (1) 1061 (mov rax,QWORD PTR [rbp-0x10]) and (2) 1073, and then (1) check the data at rbp-0x10, and (2) dump the registers and check rbp-0x10 again?

Paolo309 commented 3 years ago

OK, I couldn't debug using the ELF file, so I used the raw binary. I boot QEMU with qemu-system-x86_64, and in gdb I set the architecture to i386:x86-64:intel, though not setting the architecture makes no difference. The weird thing is that the instructions do not seem to be fetched properly: they often appear to execute first in their 8-byte variant and then in their 4-byte variant.

The picture below is the disassembled function (with some coloring for ease of reading):

[Screenshot (218)_LI]

And this is what happens:

[Screenshot (221)_LI]

After the instruction at rip=0x7e34, the next instruction should be at rip=0x7e3c, but instead rip is incremented by only one, executing the instruction at rip=0x7e35, which has the same effect as the preceding one. So there is no actual damage, though that should not happen. The same thing happens with the next instruction. So I thought it could be a problem with disassembling and/or debugging (if I disassemble with gdb at runtime, the instructions are correct). When it comes to the addition at rip=0x7e45, the result is some sort of dec, and then at rip=0x7e46 the actual addition is performed and rax is incremented by two.
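One reading consistent with this trace, and with the D/B and L bit fix reported later in this thread, is that the CPU is decoding 64-bit machine code as 32-bit code. In 32-bit code the byte 0x48 is not a REX.W prefix but the one-byte instruction dec %eax, so every 48-prefixed instruction splits in two. The thread does not spell this out, so treat the following as an interpretation. The sketch models only the single byte sequence 48 83 c0 ib, and `run_48_83_c0` is a hypothetical name:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The same byte means different things depending on the decoding mode. */
enum { REX_W_PREFIX = 0x48, DEC_EAX_OPCODE = 0x48 };
static_assert(REX_W_PREFIX == DEC_EAX_OPCODE, "one byte, two decodings");

/* Model of how the bytes 48 83 c0 ib ("addq $ib, %rax" in 64-bit code)
 * change the accumulator, depending on how the CPU decodes them.
 * Illustrative only: it covers exactly this one byte sequence. */
static uint64_t run_48_83_c0(uint64_t acc, uint8_t imm8, bool long_mode) {
    if (long_mode) {
        /* 0x48 is a REX.W prefix; 83 /0 ib then adds to the full rax. */
        return acc + (uint64_t)(int64_t)(int8_t)imm8;
    }
    /* 32-bit code: 0x48 is a complete instruction, dec %eax ... */
    uint32_t eax = (uint32_t)acc - 1u;
    /* ... and 83 c0 ib is then decoded on its own as addl $ib, %eax. */
    eax += (uint32_t)(int32_t)(int8_t)imm8;
    return eax;
}
```

Under this model, adding 1 in 32-bit decoding leaves the pointer at 0xb8000, adding 2 lands on 0xb8001, and adding 3 lands on 0xb8002, which matches the behavior described in the comments above. It would also explain why a pointer held in rdi is unaffected: the stray dec always hits eax.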

algorithmx51 commented 3 years ago

May I have the source code? (OMG, the craziest bug I have ever seen.) Also, is this running in long mode?

algorithmx51 commented 3 years ago

I tested your assembly in my OS and it worked without any problems.

[Screenshot from 2020-10-01 18-59-28]

Paolo309 commented 3 years ago

OK, good, that means that maybe I'm doing something wrong when compiling the code and/or launching QEMU. Here's my code anyway, in all its ugliness (GitHub likes it as .txt).

boot.txt

I compile it simply with

as -o boot.o boot.s
ld -o boot.bin --oformat binary -Ttext 0x7C00 boot.o

because I removed the C/C++ to keep it simple, but the results are the same. I'm not using a cross compiler, which may be the problem, but why? Here I've got plain binary with instructions for x86_64, shouldn't it still work if compiled like this? I launch qemu like this

qemu-system-x86_64 -fda boot.bin -m 2G

algorithmx51 commented 3 years ago

https://wiki.osdev.org/Why_do_I_need_a_Cross_Compiler%3F

Paolo309 commented 3 years ago

Well, I presumptuously assumed that I could have done it without a cross compiler. Tomorrow I'll try. Thank you!

Paolo309 commented 3 years ago

Nice, solved! But the cross compiler was actually not the problem (of course it's necessary for more complex stuff, but here I have just a few instructions, all inside the boot sector, so nothing that as can't handle, I suppose). I tried with the cross compiler but got the same binary and the same weird behaviour. I then rewrote the code again to be sure I hadn't messed up somewhere, and it turned out I had: I just forgot to clear the D/B bit and set the L bit in the code entry of the GDT. (╯°□°）╯︵ ┻┻

This is the new code (reduced to the essentials), and it appears to work, for now, with both the cross and the regular compiler and linker. I'm still not sure I've done everything right (setting the table bits while reading Intel's guide feels like sorcery). I'm posting it here since someone might find it helpful, though it's just a beginner's attempt, so it should be taken with a grain of salt.

boot.txt

.set PAGE_SIZE,         0x1000
.set PML4T_POINTER,     0x2000
.set PDPT_POINTER,      0x3000
.set PDT_POINTER,       0x4000

.code16
_start:
        # enable the A20 line
        movw $0x2401, %ax
        int $0x15

        # enable VGA mode 3
        movw $0x03, %ax
        int $0x10

        cli
        lgdt gdt_pointer
        movl %cr0, %eax
        orl $0x01, %eax
        movl %eax, %cr0

        jmp $0x08, $cont

gdt_start:
        .quad 0x00
gdt_code:
        .long 0x0000ffff
        .long 0x00cf9a00
gdt_data:
        .long 0x0000ffff
        .long 0x00cf9200
gdt_end:
gdt_pointer:
        .short gdt_end - gdt_start
        .long gdt_start
        .long 0x00
#.set CODE_SEG, gdt_code - gdt_start # 0x08
#.set DATA_SEG, gdt_data - gdt_start # 0x10

.code32
cont:
        movw $0x10, %ax
        movw %ax, %ds
        movw %ax, %es
        movw %ax, %fs
        movw %ax, %gs
        movw %ax, %ss

        ########## first test WORKING ##########
        movl $0xb8000, %eax
        movw $0x0f61, (%eax)
        addl $2, %eax
        movw $0x0f62, (%eax)

        # disabling paging (clearing CR0.PG, bit 31)
        movl %cr0, %eax
        andl $(~(1 << 31)), %eax
        movl %eax, %cr0

        # creating 4-level-paging tables for first 2 GiB of RAM
        movl $PML4T_POINTER, %edi
        movl %edi, %cr3
        xor %eax, %eax
        movl $PAGE_SIZE, %ecx
        rep stosl

        mov %cr3, %edi

        # only first entry in lvl 4 table
        movl $(PDPT_POINTER + 0x03), (%edi)
        movl $0x00, 4(%edi)
        addl $PAGE_SIZE, %edi

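        # first two entries in lvl 3 table (2 GiB of virtual address space)
        # entries configuration (ugly but short and lazy, only for the example):
        #  bit[7] PS=1, bit[4] PCD=1, bit[3] PWT=1, bit[2] U/S=1, bit[1] R/W=1, bit[0] P=1
        # note: both entries have base address 0, so both map the first GiB of physical memory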
        movl $0x009f, 0x00(%edi)
        movl $0x0000, 0x04(%edi)
        movl $0x009f, 0x08(%edi)
        movl $0x0000, 0x0c(%edi)

        # enabling 4-level-paging
        movl %cr4, %eax
        orl $(1 << 5), %eax
        movl %eax, %cr4

        movl $0xC0000080, %ecx
        rdmsr
        orl $(1 << 8), %eax
        wrmsr

        movl %cr0, %eax
        orl $(1 << 31), %eax
        movl %eax, %cr0

        movl gdt_code+4, %eax
        andl $(~(1 << 22)), %eax # clearing bit D/B
        orl $(1 << 21), %eax     # setting bit L
        movl %eax, gdt_code+4

        jmp $0x08, $cont2

.code64
cont2:

        ########## second test WORKING ##########
        movq $0xb8004, %rax
        movw $0x0f63, (%rax)
        addq $2, %rax
        movw $0x0f64, (%rax)

        hlt

.fill 510 - (. - _start), 1, 0
.short 0xaa55
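The descriptor fix-up in the listing above (the andl/orl pair applied to gdt_code+4) can be checked on the host with a few lines of C. This is only a sketch of the bit arithmetic; the macro and function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Flag bits in the high dword of a code-segment descriptor. */
#define DESC_L  (UINT32_C(1) << 21)  /* 64-bit code segment */
#define DESC_DB (UINT32_C(1) << 22)  /* default operand size (32-bit when set) */

static_assert((DESC_L & DESC_DB) == 0, "L and D/B are distinct bits");

/* Turn the high dword of a 32-bit code descriptor into its 64-bit form,
 * mirroring the andl/orl pair applied to gdt_code+4 in the listing. */
static uint32_t to_long_mode_code(uint32_t high_dword) {
    high_dword &= ~DESC_DB;  /* D/B must be 0 when L is 1 */
    high_dword |= DESC_L;    /* L = 1 selects 64-bit decoding */
    return high_dword;
}
```

Applied to the 0x00cf9a00 stored at gdt_code+4, this gives 0x00af9a00, so the 64-bit descriptor could also be written out directly as a second GDT entry instead of being patched at run time.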

cfenollosa / os-tutorial

QEMU seems to decrement the vga pointer when printing characters #136
