rust-osdev / x86_64

Library to program x86_64 hardware.
https://docs.rs/x86_64
Apache License 2.0
797 stars 132 forks source link

Fixed overflow bug in PageRangeInclusive #351

Closed drzewiec closed 2 years ago

drzewiec commented 2 years ago

This PR is to implement the solution suggested by @phil-opp and @Freax13 in #346. Please let me know your thoughts.

One thing that maybe could use doing still: I wanted to add a test case to check this, but it looks like the testing code is largely copied from blog_os and maybe isn't used? I admit I am really not familiar with unit testing so I could use some help on writing a test if that is something you guys would like to see.

drzewiec commented 2 years ago

Another thing I was uncertain about: constructing the max address on every iteration is needlessly expensive. However, I would imagine that we would need to add a field to the struct in order to be able to construct it only once. I don't know if that is something you would like to see or not.

Freax13 commented 2 years ago

Thank you for your contribution!

One thing that maybe could use doing still: I wanted to add a test case to check this, but it looks like the testing code is largely copied from blog_os and maybe isn't used? I admit I am really not familiar with unit testing so I could use some help on writing a test if that is something you guys would like to see.

Yes, we should absolutely test that. Perhaps this example will help you write a test.

drzewiec commented 2 years ago

Ah, I see... I was looking at the testing/ directory to find tests, or a tests module, rather than in the individual source files. Looks fairly straightforward. Thank you!

Freax13 commented 2 years ago

Another thing I was uncertain about: constructing the max address on every iteration is needlessly expensive. However, I would imagine that we would need to add a field to the struct in order to be able to construct it only once. I don't know if that is something you would like to see or not.

Creating the max address is actually really fast. In optimized builds creating and comparing the address boils down to a single instruction:

 push    rax
 mov     rax, qword, ptr, [rsi]
 mov     rcx, qword, ptr, [rsi, +, 8]
 cmp     rax, rcx
 jbe     .LBB30_2
 xor     eax, eax
 mov     qword, ptr, [rdi], rax
 mov     rax, rdi
 pop     rcx
 ret
.LBB30_2:
 cmp     rax, -4096  # <-----  The address is directly compared against 0xffff_ffff_ffff_f000.
 jae     .LBB30_3
 lea     rcx, [rax, +, 4096]
 mov     rdx, rcx
 shr     rdx, 47
 jne     .LBB30_6
 jmp     .LBB30_9
.LBB30_3:
 sub     rcx, 4096
 jb      .LBB30_13
 add     rsi, 8
 mov     rdx, rcx
 shr     rdx, 47
 je      .LBB30_9
.LBB30_6:
 cmp     edx, 131071
 je      .LBB30_9
 cmp     edx, 1
 jne     .LBB30_12
 shl     rcx, 16
 sar     rcx, 16
.LBB30_9:
 and     rcx, -4096
 mov     qword, ptr, [rsi], rcx
 mov     qword, ptr, [rdi, +, 8], rax
 mov     eax, 1
 mov     qword, ptr, [rdi], rax
 mov     rax, rdi
 pop     rcx
 ret
.LBB30_13:
 lea     rdi, [rip, +, .L__unnamed_40]
 lea     rdx, [rip, +, .L__unnamed_41]
 mov     esi, 43
 call    qword, ptr, [rip, +, _ZN4core9panicking5panic17h81c3143387dc6c76E@GOTPCREL]
 ud2
.LBB30_12:
 mov     qword, ptr, [rsp], rcx
 lea     rdi, [rip, +, .L__unnamed_42]
 lea     rcx, [rip, +, .L__unnamed_43]
 lea     r8, [rip, +, .L__unnamed_44]
 mov     rdx, rsp
 mov     esi, 74
 call    qword, ptr, [rip, +, _ZN4core6result13unwrap_failed17hedb62ff4b6c4ec9fE@GOTPCREL]
 ud2

Admittedly the generated code is far from perfect, I believe at least one of the panicking branches is actually unreachable, but creating the maximum page isn't the problem here.

Freax13 commented 2 years ago

Thank you for your contribution!