0xAX / linux-insides

A little bit about a linux kernel
http://0xax.gitbooks.io/linux-insides/content/index.html

Why disk image needs to have origin set to 0x7c00? #498

Closed: slo closed this issue 6 years ago

slo commented 7 years ago

Regarding the sentence after the ASM code for the MBR: "the origin is set to 0x7c00 and we end with the magic sequence". I couldn't find any indication of why the boot binary should have its org set to 0x7c00. I've only found information about the BIOS, which copies the MBR to this address, but I don't understand why the image should have this origin. From my tests this directive seems to be irrelevant to the image, because qemu uses a disk image, which doesn't have any origin. I've also compared both images (with and without the directive org 0x7c00), and they are the same.

aniket-deole commented 6 years ago

https://www.glamenv-septzen.net/en/view/6 might clear your doubt?

slo commented 6 years ago

Maybe I was not clear. My question was only pointing out that the directive [ORG 0x7c00] is irrelevant in the context of the binary output from nasm (during my investigation, the nasm source produced the same binary output whether it contained the directive or not). So my conclusion is that [ORG 0x7c00] in disk image is not needed because it comes from running software, rather than device. Based on that, it should be removed from the disk source in the article.

0xAX commented 6 years ago

So my conclusion is that [ORG 0x7c00] in disk image is not needed because it comes from running software, rather than device.

Ah yes, now it is clear. You're absolutely right. I will update the part with the 'bootloader' code.

0xAX commented 6 years ago

Here it is: https://github.com/0xAX/linux-insides/commit/abf4f684a5e20033bcf657babba1c77025292227. Thank you @slo 👍

mpetch commented 4 months ago

ORG 0x7c00 is generally needed (with binary file output), but your code example is too simple to manifest the problem. The absence of an ORG is the same as ORG 0x0000. It is true that QEMU and BIOSes will load the 512-byte bootloader at physical address 0x07c00, but the assembler has no knowledge of this. In your code example the instructions are all position independent: you can run them from anywhere in memory and they will still work. Because of this you can remove the org 0x7c00 and still generate identical code.
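To make the "position independent" point concrete, here is a small Python model (not NASM output) of how a short jump such as jmp $ is encoded. It assumes the standard 8086 short-jump form: opcode 0xEB followed by a signed 8-bit displacement measured from the end of the instruction. The helper name encode_short_jmp and the two addresses are made up for illustration.

```python
def encode_short_jmp(instr_addr: int, target_addr: int) -> bytes:
    """Encode `jmp short target` (hypothetical helper, illustration only)."""
    # The displacement is relative to the byte after the 2-byte instruction.
    disp = target_addr - (instr_addr + 2)
    if not -128 <= disp <= 127:
        raise ValueError("target out of range for a short jump")
    return bytes([0xEB, disp & 0xFF])


# `jmp $` jumps to itself, so the displacement is -2 (0xFE) at any address.
without_org = encode_short_jmp(0x000F, 0x000F)
with_org = encode_short_jmp(0x7C0F, 0x7C0F)
print(without_org.hex(), with_org.hex())
```

Both calls yield the same two bytes, because only the distance between the instruction and its target is encoded, never the absolute address.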

Now what happens if you want to reference data in the bootloader? Let us say that, rather than coding the ! as part of a mov instruction, we place the ! elsewhere in the bootloader and load it from memory. Something like this:

;
; Note: this example is written in Intel Assembly syntax
;
[BITS 16]

boot:
    xor ax, ax                 ; AX = 0
    mov ds, ax                 ; Initialize DS=0. BIOS may not have set it to 0

    mov al, [charmem]          ; [charmem] is relative to DS, same as saying
                               ;     ds:[charmem] which is why DS must be set.
    mov ah, 0x0e
    mov bh, 0x00
    mov bl, 0x07

    int 0x10
    jmp $

charmem: db '!'

times 510-($-$$) db 0

db 0x55
db 0xaa

Try running that in QEMU/Bochs (or on real hardware): in all likelihood it won't print the !. Now add the ORG 0x7c00 and try it again. It will work. What is going on? Why does the absence of ORG (or an ORG 0x0000) fail? The assembler has to resolve the memory location of the variable charmem. Since the origin point is 0, all memory references are assumed to be relative to 0 too. The problem is that our code will be placed into memory at 0x07c00, not at 0x00000!

If we review the generated code we see this:

objdump --adjust-vma=0x7c00 -D -b binary -mi8086 -Mintel boot

00007c00 <.data>:
    7c00:       31 c0                   xor    ax,ax
    7c02:       8e d8                   mov    ds,ax
    7c04:       a0 11 00                mov    al,ds:0x11
    7c07:       b4 0e                   mov    ah,0xe
    7c09:       b7 00                   mov    bh,0x0
    7c0b:       b3 07                   mov    bl,0x7
    7c0d:       cd 10                   int    0x10
    7c0f:       eb fe                   jmp    0x7c0f
    7c11:       21 00                   and    WORD PTR [bx+si],ax
        ...

The value 21 (0x21) after the JMP is our exclamation mark ! at offset 0x11. Look at the MOV instruction that was generated. mov al,ds:0x11 . DS is set to 0 so that is the equivalent of memory address 0x0000:0x0011. It should be obvious that is in the interrupt vector table at the bottom of memory. That is not where our ! is. Our exclamation mark is actually at 0x7c11. Now modify the code adding the ORG 0x7c00 directive and you get:
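The address arithmetic can be checked with a few lines of Python. This is only a sketch of the real-mode rule (physical address = segment * 16 + offset); the function name physical_address and the IVT_END constant are illustrative, and wrap-around at 1 MiB is ignored.

```python
IVT_END = 0x400  # real-mode interrupt vector table: 256 entries * 4 bytes


def physical_address(segment: int, offset: int) -> int:
    """Real-mode segment:offset to physical address (no 1 MiB wrap modeled)."""
    return (segment << 4) + offset


wrong = physical_address(0x0000, 0x0011)  # what the code without ORG reads
right = physical_address(0x0000, 0x7C11)  # where the '!' actually lives

print(hex(wrong), "inside IVT:", wrong < IVT_END)
print(hex(right), "inside IVT:", right < IVT_END)
```

The same arithmetic shows that a segment of 0x07c0 with offset 0x0011 also lands on 0x7c11, which is why the segment value and the origin have to be chosen as a matching pair.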

objdump --adjust-vma=0x7c00 -D -b binary -mi8086 -Mintel boot

00007c00 <.data>:
    7c00:       31 c0                   xor    ax,ax
    7c02:       8e d8                   mov    ds,ax
    7c04:       a0 11 7c                mov    al,ds:0x7c11
    7c07:       b4 0e                   mov    ah,0xe
    7c09:       b7 00                   mov    bh,0x0
    7c0b:       b3 07                   mov    bl,0x7
    7c0d:       cd 10                   int    0x10
    7c0f:       eb fe                   jmp    0x7c0f
    7c11:       21 00                   and    WORD PTR [bx+si],ax
        ...

By changing the ORG directive from its default of 0x0000 to 0x7C00 we have informed the NASM assembler that memory references are relative to offset 0x7c00, not 0x0000. When -f bin is used, NASM has no idea where the code will be loaded; the ORG directive tells NASM just that. The result is that the MOV now appears as mov al,ds:0x7c11. Rather than offset 0x11, it is now 0x7c11.
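The difference between the two disassemblies comes down to three bytes. As a rough Python model (not how NASM is implemented), the absolute-address form of mov al, [mem] is opcode 0xA0 followed by the 16-bit address in little-endian order, and the address is the origin plus the label's offset within the file. The helper name encode_mov_al_mem is hypothetical.

```python
def encode_mov_al_mem(origin: int, label_offset: int) -> bytes:
    """Model `mov al, [label]` in 16-bit code (hypothetical helper).

    origin       -- value given to ORG (0 when the directive is absent)
    label_offset -- position of the label counted from the start of the file
    """
    address = (origin + label_offset) & 0xFFFF  # offsets are 16 bits wide
    return bytes([0xA0]) + address.to_bytes(2, "little")


CHARMEM_OFFSET = 0x11  # position of '!' in the listings above

print(encode_mov_al_mem(0x0000, CHARMEM_OFFSET).hex(" "))
print(encode_mov_al_mem(0x7C00, CHARMEM_OFFSET).hex(" "))
```

The two results correspond to the a0 11 00 and a0 11 7c byte sequences in the objdump output: the opcode is unchanged and only the embedded address moves.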


A more useful example might be one of printing a string with int 0x10 one character at a time. This would require the ORG and the segment registers to be set accordingly (usually ORG 0x7c00 and set the needed segment registers to 0 at runtime):

;
; Note: this example is written in Intel Assembly syntax
;
[BITS 16]
[ORG 0x7c00]

boot:
    xor ax, ax                 ; AX = 0
    mov ds, ax                 ; Initialize DS=0. BIOS may not have set it to 0
    cld                        ; Clear the Direction flag (DF) so LODSB moves forward.

    mov si, bootstr
    call print_string_rm       ; Print the bootstr string.
    jmp $

; Function: print_string_rm
;           Display a string to the console on display page 0
;
; Inputs:   SI = Offset of the string to print
; Clobbers: AX, BX, SI

print_string_rm:
    mov ah, 0x0e               ; BIOS tty Print
    xor bx, bx                 ; BH = display page 0 (BL is zeroed as well)
    jmp .getch
.repeat:
    int 0x10                   ; print character
.getch:
    lodsb                      ; Get character from string
    test al,al                 ; Have we reached end of string?
    jnz .repeat                ;     if not process next character
.end:
    ret

bootstr: db 'Welcome to my Bootloader!', 0

times 510-($-$$) db 0

db 0x55
db 0xaa
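Both listings end with the same padding and signature lines. As a sanity check, the layout they produce can be modeled in Python; this is a sketch only, with the helper name pad_boot_sector made up here and the code bytes copied from the first example's disassembly (the ORG 0x7c00 variant).

```python
SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"


def pad_boot_sector(code: bytes) -> bytes:
    """Mimic `times 510-($-$$) db 0` followed by `db 0x55` / `db 0xaa`."""
    room = SECTOR_SIZE - len(BOOT_SIGNATURE)  # 510 bytes for code and data
    if len(code) > room:
        raise ValueError("code does not fit in a single boot sector")
    return code + bytes(room - len(code)) + BOOT_SIGNATURE


# Bytes from the objdump listing of the first example, assembled with ORG 0x7c00.
code = bytes.fromhex("31c0 8ed8 a0117c b40e b700 b307 cd10 ebfe 21")
sector = pad_boot_sector(code)

print(len(sector), sector[510:].hex(), chr(sector[0x11]))
```

The character sits at file offset 0x11 regardless of ORG; the directive changes only the addresses encoded in the instructions, not where the bytes are placed in the image.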

Notes

luohuang commented 4 months ago

This is an automatic reply from QQ Mail. Hello, I have received your email. Thank you!

pooranjoyb commented 1 month ago

One of the best explanations I've seen in a while! @mpetch