SerenityOS / serenity

The Serenity Operating System 🐞
https://serenityos.org
BSD 2-Clause "Simplified" License
30.54k stars 3.19k forks source link

Kernel: PTE loss not completely fixed #3370

Closed Lubrsi closed 4 years ago

Lubrsi commented 4 years ago

Not quite sure what caused it, I ran into it while profiling chess engine (it doesn't happen all the time):

[ChessEngine(26:26)]: kmalloc(): Adding memory to heap at V0xc483f000, bytes: 1048576
[ChessEngine(26:26)]: ASSERTION FAILED: result == AK::HashSetResult::InsertedNewEntry
../Kernel/VM/MemoryManager.cpp:238 in Kernel::PageTableEntry& Kernel::MemoryManager::ensure_pte(Kernel::PageDirectory&, VirtualAddress)
[ChessEngine(26:26)]: 0xc011e9a5  __assertion_failed(char const*, char const*, unsigned int, char const*) +185
[ChessEngine(26:26)]: 0xc018a2d5  Kernel::MemoryManager::ensure_pte(Kernel::PageDirectory&, VirtualAddress) +111
[ChessEngine(26:26)]: 0xc018d948  Kernel::Region::map_individual_page_impl(unsigned long) +86
[ChessEngine(26:26)]: 0xc018dde8  Kernel::Region::map(Kernel::PageDirectory&) +76
[ChessEngine(26:26)]: 0xc01617c7  Kernel::Process::allocate_region_with_vmobject(Kernel::Range const&, AK::NonnullRefPtr<Kernel::VMObject>, unsigned long, AK::String const&, int) +407
[ChessEngine(26:26)]: 0xc016fc0f  Kernel::Process::sys$mmap(AK::Userspace<Kernel::Syscall::SC_mmap_params const*>) +907
[ChessEngine(26:26)]: 0xc01679ff  syscall_handler +1339
[ChessEngine(26:26)]: 0xc0167435  syscall_asm_entry +49
[ChessEngine(26:26)]: 0x0805c600  malloc +1207
[ChessEngine(26:26)]: 0x0804aaf6  AK::SinglyLinkedList<AK::HashMap<Chess::Board, int, AK::Traits<Chess::Board> >::Entry>::append(AK::HashMap<Chess::Board, int, AK::Traits<Chess::Board> >::Entry&&) +24
[ChessEngine(26:26)]: 0x0804ac33  AK::SinglyLinkedList<AK::HashMap<Chess::Board, int, AK::Traits<Chess::Board> >::Entry>::append(AK::HashMap<Chess::Board, int, AK::Traits<Chess::Board> >::Entry const&) +47
[ChessEngine(26:26)]: 0x0804aecb  AK::HashTable<AK::HashMap<Chess::Board, int, AK::Traits<Chess::Board> >::Entry, AK::HashMap<Chess::Board, int, AK::Traits<Chess::Board> >::EntryTraits>::set(AK::HashMap<Chess::Board, int, AK::Traits<Chess::Board> >::Entry const&) [clone .isra.0]
+645
[ChessEngine(26:26)]: 0x0804afd6  Chess::Board::Board(Chess::Board const&) +252
[ChessEngine(26:26)]: 0x0804c771  Chess::Board::apply_illegal_move(Chess::Move const&, Chess::Colour) +33
[ChessEngine(26:26)]: 0x0804d7f8  Chess::Board::is_legal(Chess::Move const&, Chess::Colour) const +94
[ChessEngine(26:26)]: 0x0804974b  Chess::Board::generate_moves<MCTSTree::expand()::{lambda(Chess::Move)#1}>(MCTSTree::expand()::{lambda(Chess::Move)#1}, Chess::Colour) const::{lambda(Chess::Move)#1}::operator()(Chess::Move) const +27
[ChessEngine(26:26)]: 0x080498e3  Chess::Board::generate_moves<MCTSTree::expand()::{lambda(Chess::Move)#1}>(MCTSTree::expand()::{lambda(Chess::Move)#1}, Chess::Colour) const::{lambda(Chess::Square)#2}::operator()(Chess::Colour) const +359
[ChessEngine(26:26)]: 0x0804a388  MCTSTree::expand() +104
[ChessEngine(26:26)]: 0x0804a3e5  MCTSTree::do_round() +25
[ChessEngine(26:26)]: 0x08048de5  ChessEngine::handle_go(Chess::UCI::GoCommand const&) +205
[ChessEngine(26:26)]: 0x0805a99e  Core::Object::dispatch_event(Core::Event&, Core::Object*) +74
[ChessEngine(26:26)]: 0x08053790  Core::EventLoop::pump(Core::EventLoop::WaitMode) +308
[ChessEngine(26:26)]: 0x0805390f  Core::EventLoop::exec() +43
[ChessEngine(26:26)]: 0x080484ff  main +279
[ChessEngine(26:26)]: 0x0804896e  _start +94
awesomekling commented 4 years ago

cc @tomuta

tomuta commented 4 years ago

Not sure what's going on there, I tried to reproduce it but haven't had luck. Here's what I suspect is wrong: In MemoryManager::ensure_pte and MemoryManager::release_pte I think we should use m_kernel_page_directory.m_page_tables rather than page_directory.page_tables if page_directory_table_index == 3...

Lubrsi commented 4 years ago

I think I found a reproducible way. Basically, you need to get the memory usage very close to maximum but not enough to start purging. Then from that point, you need to slowly creep the memory usage up to maximum, where it starts purging and then it asserts.

Now that I have a somewhat consistent way of reproducing it, I'll see if that PR fixes it.

Here's my arbitrary way:

  1. Download this GIF and put it into Base/home/anon
  2. Rebuild disk image and startup.
  3. Open file manager and a browser.
  4. Drag the GIF into the browser, open a new tab and drag the GIF into the browser again.
  5. Profile the browser for about 3 seconds and then stop (keep the profiler open)
  6. Open a new tab, go to serenityos.org, then to ACID2 and finally the HTML spec.

Output:

[Browser(25:25)]: MM: Purge saved the day! Purged 16 pages from PurgeableVMObject{0xc044a394}
[Browser(25:25)]: MM: Purge saved the day! Purged 16 pages from PurgeableVMObject{0xc0462cf4}
[Browser(25:25)]: MM: Purge saved the day! Purged 16 pages from PurgeableVMObject{0xc0462db4}
[Browser(25:25)]: MM: Purge saved the day! Purged 16 pages from PurgeableVMObject{0xc048d314}
[Browser(25:25)]: MM: Purge saved the day! Purged 16 pages from PurgeableVMObject{0xc04789d4}
[Browser(25:25)]: MM: Purge saved the day! Purged 16 pages from PurgeableVMObject{0xc046bcd4}
[Browser(25:25)]: MM: Purge saved the day! Purged 9 pages from PurgeableVMObject{0xc0484194}
[Browser(25:25)]: MM: Purge saved the day! Purged 16 pages from PurgeableVMObject{0xc047fab4}
[Browser(25:25)]: MM: Purge saved the day! Purged 16 pages from PurgeableVMObject{0xc04c70d4}
[Browser(25:25)]: MM: Purge saved the day! Purged 9 pages from PurgeableVMObject{0xc04c71d4}
[Browser(25:25)]: MM: Purge saved the day! Purged 16 pages from PurgeableVMObject{0xc047eef4}
[Browser(25:25)]: MM: Purge saved the day! Purged 16 pages from PurgeableVMObject{0xc04d6d34}
[Browser(25:25)]: MM: Purge saved the day! Purged 16 pages from PurgeableVMObject{0xc04db1f4}
[Browser(25:25)]: MM: Purge saved the day! Purged 28 pages from PurgeableVMObject{0xc04db9d4}
[Browser(25:25)]: MM: Purge saved the day! Purged 9 pages from PurgeableVMObject{0xc04d9c54}
[Browser(25:25)]: MM: Purge saved the day! Purged 9 pages from PurgeableVMObject{0xc04df754}
[Browser(25:25)]: MM: Purge saved the day! Purged 721 pages from PurgeableVMObject{0xc04dfbb4}
[Browser(25:25)]: ASSERTION FAILED: result == AK::HashSetResult::InsertedNewEntry
../Kernel/VM/MemoryManager.cpp:238 in Kernel::PageTableEntry& Kernel::MemoryManager::ensure_pte(Kernel::PageDirectory&,
VirtualAddress)
[Browser(25:25)]: 0xc011e9a5  __assertion_failed(char const*, char const*, unsigned int, char const*) +185
[Browser(25:25)]: 0xc018a2d5  Kernel::MemoryManager::ensure_pte(Kernel::PageDirectory&, VirtualAddress) +111
[Browser(25:25)]: 0xc018d948  Kernel::Region::map_individual_page_impl(unsigned long) +86
[Browser(25:25)]: 0xc018dde8  Kernel::Region::map(Kernel::PageDirectory&) +76
[Browser(25:25)]: 0xc01617c7  Kernel::Process::allocate_region_with_vmobject(Kernel::Range const&, AK::NonnullRefPtr<Ker
nel::VMObject>, unsigned long, AK::String const&, int) +407
[Browser(25:25)]: 0xc016fc0f  Kernel::Process::sys$mmap(AK::Userspace<Kernel::Syscall::SC_mmap_params const*>) +907
[Browser(25:25)]: 0xc01679ff  syscall_handler +1339
[Browser(25:25)]: 0xc0167435  syscall_asm_entry +49
[Browser(25:25)]: 0x081b67b3  malloc +1207
[Browser(25:25)]: 0x0805c467  Web::DOM::Element::create_layout_node(Web::CSS::StyleProperties const*) +533
[Browser(25:25)]: 0x080ab42b  Web::create_layout_tree(Web::DOM::Node&, Web::CSS::StyleProperties const*) +28
[Browser(25:25)]: 0x080ab4d1  Web::create_layout_tree(Web::DOM::Node&, Web::CSS::StyleProperties const*) +194
[Browser(25:25)]: 0x080ab4d1  Web::create_layout_tree(Web::DOM::Node&, Web::CSS::StyleProperties const*) +194
[Browser(25:25)]: 0x080ab4d1  Web::create_layout_tree(Web::DOM::Node&, Web::CSS::StyleProperties const*) +194
[Browser(25:25)]: 0x080ab4d1  Web::create_layout_tree(Web::DOM::Node&, Web::CSS::StyleProperties const*) +194
[Browser(25:25)]: 0x080ab4d1  Web::create_layout_tree(Web::DOM::Node&, Web::CSS::StyleProperties const*) +194
[Browser(25:25)]: 0x080ab4d1  Web::create_layout_tree(Web::DOM::Node&, Web::CSS::StyleProperties const*) +194
[Browser(25:25)]: 0x080ab886  Web::LayoutTreeBuilder::build(Web::DOM::Node&) +302
[Browser(25:25)]: 0x0805a8f5  Web::DOM::Document::layout() +67
[Browser(25:25)]: 0x080b43ac  Web::Frame::set_document(Web::DOM::Document*) +112
[Browser(25:25)]: 0x080ae3e6  Web::FrameLoader::resource_did_load() +378
[Browser(25:25)]: 0x080afddb  Web::Resource::for_each_client(AK::Function<void (Web::ResourceClient&)>) +413
[Browser(25:25)]: 0x080b02fe  Web::Resource::did_load(AK::Badge<Web::ResourceLoader>, AK::ByteBuffer const&, AK::HashMap
<AK::String, AK::String, AK::CaseInsensitiveStringTraits> const&) +1162
[Browser(25:25)]: 0x080b0bbf  AK::Function<void (AK::ByteBuffer const&, AK::HashMap<AK::String, AK::String, AK::CaseInse
nsitiveStringTraits> const&)>::CallableWrapper<Web::ResourceLoader::load_resource(Web::Resource::Type, Web::LoadRequest
const&)::{lambda(auto:1&, auto:2&)#1}>::call(AK::ByteBuffer const&, AK::HashMap<AK::String, AK::String, AK::CaseInsensit
iveStringTraits> const&) const +45
[Browser(25:25)]: 0x080b10a1  AK::Function<void (bool, AK::ByteBuffer const&, AK::RefPtr<AK::SharedBuffer>, AK::HashMap<
AK::String, AK::String, AK::CaseInsensitiveStringTraits> const&, AK::Optional<unsigned int>)>::CallableWrapper<Web::Reso
urceLoader::load(AK::URL const&, AK::Function<void (AK::ByteBuffer const&, AK::HashMap<AK::String, AK::String, AK::CaseI
nsensitiveStringTraits> const&)>, AK::Function<void (AK::String const&)>)::{lambda(bool, AK::ByteBuffer const&, auto:1,
auto:2&, auto:3)#4}>::call(bool, AK::ByteBuffer const&, AK::RefPtr<AK::SharedBuffer>, AK::HashMap<AK::String, AK::String
, AK::CaseInsensitiveStringTraits> const&, AK::Optional<unsigned int>) const +279
[Browser(25:25)]: 0x080ec0fd  Protocol::Download::did_finish(AK::Badge<Protocol::Client>, bool, AK::Optional<unsigned in
t>, unsigned int, int, IPC::Dictionary const&) +749
[Browser(25:25)]: 0x080e8d4e  Protocol::Client::handle(Messages::ProtocolClient::DownloadFinished const&) +134
[Browser(25:25)]: 0x080e9873  ProtocolClientEndpoint::handle(IPC::Message const&) +53
[Browser(25:25)]: 0x080eacdc  IPC::ServerConnection<ProtocolClientEndpoint, ProtocolServerEndpoint>::handle_messages() +
106
[Browser(25:25)]: 0x081ac8ef  Core::EventLoop::pump(Core::EventLoop::WaitMode) +265
[Browser(25:25)]: 0x081aca99  Core::EventLoop::exec() +43
[Browser(25:25)]: 0x08048937  main +1335
[Browser(25:25)]: 0x0804e2be  _start +94
awesomekling commented 4 years ago

Not sure what's going on there, I tried to reproduce it but haven't had luck. Here's what I suspect is wrong: In MemoryManager::ensure_pte and MemoryManager::release_pte I think we should use m_kernel_page_directory.m_page_tables rather than page_directory.page_tables if page_directory_table_index == 3...

That is sane, but it sounds like something we should assert, not necessarily handle at runtime! Or.. how do we end up trying to put a kernel page into a userspace process page directory?

tomuta commented 4 years ago

Not sure what's going on there, I tried to reproduce it but haven't had luck. Here's what I suspect is wrong: In MemoryManager::ensure_pte and MemoryManager::release_pte I think we should use m_kernel_page_directory.m_page_tables rather than page_directory.page_tables if page_directory_table_index == 3...

That is sane, but it sounds like something we should assert, not necessarily handle at runtime! Or.. how do we end up trying to put a kernel page into a userspace process page directory?

I think we have to handle that at runtime. Unless you somehow turn PageDirectory::m_directory_pages[3] into a reference that points to a global one for the kernel address space >= 0xc0000000.

awesomekling commented 4 years ago

What I mean is I think all the code paths that would add mappings >3G are already explicitly adding them to kernel_page_directory

tomuta commented 4 years ago

So I think my change in #3371 does address a legitimate page table loss issue for kernel mappings. But I think there's a separate issue related to purging pages in low-memory conditions, which I think I already reported in #2741. It seems like some pages get purged that shouldn't be (maybe it doesn't respect madvise protection properly?)

tomuta commented 4 years ago

What I mean is I think all the code paths that would add mappings >3G are already explicitly adding them to kernel_page_directory

Oh I see what you mean. And I think you're right, added this assertion to ensure_ptr and it doesn't get triggered:

ASSERT(page_directory_table_index != 3 || &page_directory == m_kernel_page_directory.ptr());
Lubrsi commented 4 years ago

Ok, I found an even easier way to reproduce this. Run "allocate 209 MiB" and then open the browser.

Doing this also reveals so many more assertions in memory manager.

(This is on latest master with the taskbar fix)

tomuta commented 4 years ago

So with the allocate 209 command it also seems fork related:

[SystemServer(5:5)]: ASSERTION FAILED: result == AK::HashSetResult::InsertedNewEntry
../Kernel/VM/MemoryManager.cpp:243 in Kernel::PageTableEntry& Kernel::MemoryManager::ensure_pte(Kernel::PageDirectory&, VirtualAddress)
[SystemServer(5:5)]: 0xc011e9a5  __assertion_failed(char const*, char const*, unsigned int, char const*) +185
[SystemServer(5:5)]: 0xc018ae3c  Kernel::MemoryManager::ensure_pte(Kernel::PageDirectory&, VirtualAddress) +912
[SystemServer(5:5)]: 0xc018dfbc  Kernel::Region::map_individual_page_impl(unsigned long) +86
[SystemServer(5:5)]: 0xc018e45c  Kernel::Region::map(Kernel::PageDirectory&) +76
[SystemServer(5:5)]: 0xc016d1c4  Kernel::Process::sys$fork(Kernel::RegisterState&) +1872
[SystemServer(5:5)]: 0xc01678a8  syscall_handler +946
[SystemServer(5:5)]: 0xc0167467  syscall_asm_entry +49
[SystemServer(5:5)]: MM: PURGING ->
Lubrsi commented 4 years ago

I also get it through execve:

[SystemServer(25:25)]: ASSERTION FAILED: result == AK::HashSetResult::InsertedNewEntry
../Kernel/VM/MemoryManager.cpp:238 in Kernel::PageTableEntry& Kernel::MemoryManager::ensure_pte(Kernel::PageDirectory&, VirtualAddress)
[SystemServer(25:25)]: 0xc011e9b7  __assertion_failed(char const*, char const*, unsigned int, char const*) +185
[SystemServer(25:25)]: 0xc018a315  Kernel::MemoryManager::ensure_pte(Kernel::PageDirectory&, VirtualAddress) +111
[SystemServer(25:25)]: 0xc018d98e  Kernel::Region::map_individual_page_impl(unsigned long) +86
[SystemServer(25:25)]: 0xc018de2e  Kernel::Region::map(Kernel::PageDirectory&) +76
[SystemServer(25:25)]: 0xc0188df0  Kernel::MemoryManager::allocate_kernel_region_with_vmobject(Kernel::Range const&, Kernel::VMObject&, AK::StringView const&, unsigned char, bool, bool) +388
[SystemServer(25:25)]: 0xc0188ec2  Kernel::MemoryManager::allocate_kernel_region_with_vmobject(Kernel::VMObject&, unsigned long, AK::StringView const&, unsigned char, bool, bool) +182
[SystemServer(25:25)]: 0xc016adf6  Kernel::Process::do_exec(AK::NonnullRefPtr<Kernel::FileDescription>, AK::Vector<AK::String, 0ul>, AK::Vector<AK::String, 0ul>, AK::RefPtr<Kernel::FileDescription>, Kernel::Thread*&, unsigned int&) +984
[SystemServer(25:25)]: 0xc016c226  Kernel::Process::exec(AK::String, AK::Vector<AK::String, 0ul>, AK::Vector<AK::String, 0ul>, int) +1792
[SystemServer(25:25)]: 0xc016c71d  Kernel::Process::sys$execve(AK::Userspace<Kernel::Syscall::SC_execve_params const*>) +637
[SystemServer(25:25)]: 0xc0167a3f  syscall_handler +1339
[SystemServer(25:25)]: 0xc0167475  syscall_asm_entry +49
tomuta commented 4 years ago

I wonder if PageDirectory::PageDirectory fails allocating some physical pages. Looks like we wouldn't fail fork/execve in that case.

tomuta commented 4 years ago

I wonder if PageDirectory::PageDirectory fails allocating some physical pages. Looks like we wouldn't fail fork/execve in that case.

Nope, not the case...

tomuta commented 4 years ago

Ok I think I know what's going on here. The problem is in MemoryManager::ensure_pte when it calls allocate_user_physical_page. In the case that allocate_user_physical_page needs to purge memory first, the act of purging triggers calls to quickmap_pd within allocate_user_physical_page, rendering the first call to quickmap_pd in MemoryManager::ensure_pte stale, which means pd and pde are no longer pointing at the same underlying page. Which means that we insert the page into page_directory.m_page_tables thinking pde is now set, but because we wrote it to the wrong memory location, the next subsequent call to MemoryManager::ensure_pte will again correctly remap pd by calling quickmap_pd to the correct physical page, and this time pde is most likely still unset again, so we do the same again and because we insert the same entry to page_directory.m_page_tables again the ASSERT rightfully triggers.

awesomekling commented 4 years ago

Great analysis, I think you're right :)

Two thoughts on fixing this:

tomuta commented 4 years ago

Well the easy fix (which I have already coded up) is adding an optional argument to allocate_user_physical_page that returns a boolean if any pages were purged, and if so we just need to call quickmap_pd again.

Right now MM relies on a global lock, so even with multiple processors only one would be able to call ensure_pte (or most other things) at a time. We could easily do something similar like quickmap_page/unquickmap_page and just have a bool that we flip.

tomuta commented 4 years ago

Unfortunately fixing this triggers other problems down the road ;-)

tomuta commented 4 years ago
* Maybe we could have multiple quickmap slots. It could be a basic circular queue. Seems useful for multi-processing as well.

Btw, we have this for regular quickmap already, every processor has its own slot.