win32ss / supermium

Chromium fork for Windows XP/2003 and up
https://win32subsystem.live/supermium/
BSD 3-Clause "New" or "Revised" License
1.9k stars 62 forks source link

Semi-random FFMPEG/Huffman-related crashes (122.0.6261.85, 32-bit, WXP SP3) #411

Open toxanbi opened 3 months ago

toxanbi commented 3 months ago

Loading some web-pages causes Chromium thread to fall with unhandled exception. It is obvious, that crashes are not fully random, because an attempt to reload crashed tab will likely end up with a repeated crash (but not in 100% of tries). Looks like some content on a page causes problem (but I wasn't able to track down, which content in particular).

Here is how Supermium indicates such kind of crashes: bug-new-5

So, it tries to read some field at offset +64h of some struct/class using null pointer.

(The big problem of Supermium is that chrome.dll in its default supply is different from chrome.dll supplied with debug symbols, so all my previous records of problematic places in chrome.dll do not make sense, thank you very much for suppying debug symbols for different DLL build)

You can see the instruction (at 178F568A) that tries to perform this access on the screenshot above.

I've scrolled above until I can reach the prologue of that procedure and here is what I found: bug-new-5-bgn-of-proc It looks quite weird for me, because symbol with such strange name could not be emitted by C/C++ compiler. Looking though that proc, however, revealed references to entities such as ff_huff_quad_vlc.

Googing that identifier brought me to ffmpeg source code, or, more exactly, to this function: https://ffmpeg.org/doxygen/trunk/mpegaudiodec__template_8c_source.html#l00751

I see many similarities: ffmpeg_similarities

So, it seems that access violation exception is thrown from inside huffman_decode function. However, I tried to grep through all Supermium sources and I didn't find any ffmpeg-derived code. I don't know... Chromium codebase is such a huge thing, I am completely new to it. May be this code came from binary lib, statically linked to chromium core.

toxanbi commented 3 months ago

And, by the way, what is convinient way of debugging crashes? Debugger is registered in the system, but when chromium crashes, it shows dialog like this: bug-3 without a prompt to attach to the process with a debugger.

So, I need to open chrome.dll (attaching to an existing process would take longer and give nothing extre if you don't know which process and thread caused an exception and if process intrusion isn't aided by Windows) under the debugger and just go to an address mentioned in the message box.

win32ss commented 3 months ago

ffmpeg in Chromium is in a separate repository under the third_party directory (https://source.chromium.org/chromium/chromium/src/+/main:third_party/ffmpeg/;l=1;drc=e9fe61fd0dc3b731e9ee186f9dde74d986e3592d;bpv=1;bpt=0?q=ffmpeg&sq=&ss=chromium%2Fchromium%2Fsrc).

Did any crash dumps get written to the user profile directory (by default, located at %USERPROFILE%\Application Data\Supermium\User Data) in the subdirectory Crashpad\reports?

You may also want to try running the browser with switches --in-process-gpu --single-process to consolidate the renderer and GPU processes into the main process.

toxanbi commented 3 months ago

Did any crash dumps get written to the user profile directory (by default, located at %USERPROFILE%\Application Data\Supermium\User Data) in the subdirectory Crashpad\reports?

There is no such subdirectory at all (I am talking about Crashpad\reports)

You may also want to try running the browser with switches --in-process-gpu --single-process to consolidate the renderer and GPU processes into the main process.

For testing and catching the bugs that might be good idea, but I suspect it is less suitable for regular web surfing since any crash will burry all opened tabs.

I wonder if there is a command line switch to enable some internal mini-debugger within Chromium? Does Chromium have anything like, for example, VirtualDub has? virtualdub_embedded_debugger

win32ss commented 3 months ago

No, nothing like this. Just the Crashpad which usually writes crash dumps to the directory. However it seems peculiar that the exception handling is being bypassed and going to the system exception handler. Is this the original 122 release or the hotfix? The former had issues with exception handling on XP.

And yes, the consolidation of the processes will cause the whole browser to crash if such a condition occurs.

toxanbi commented 3 months ago

Is this the original 122 release or the hotfix?

I think it is original. image image

The only difference is that I renamed chrome.exe to supchrome.exe to distinguish them within taskmgr and/or kill Supermium by issuing taskkill /f /im:supchrome.exe without affecting my old good Chrome 49.

win32ss commented 3 months ago

Yes, that is the original version. The newer version has a more reliable exception handler.

toxanbi commented 3 months ago

The newer version has a more reliable exception handler.

I have installed it. Indeed, there is a difference: now it just silently crashes like this image without any details and system message boxes.

Crash reports are now created.

How to decode/view these crashdumps? Any info about the toolkit, any guides?

win32ss commented 3 months ago

The crash dumps are readable in VS2010 and above. Older versions of windbg seemed to be problematic with them, however.

A problem is that the symbols for chrome.dll are nearly 2 GB in size (smaller symbols lacked necessary information). Maybe they could be readable on XP with /3GB.

IDA-RE-things commented 3 months ago

(The big problem of Supermium is that chrome.dll in its default supply is different from chrome.dll supplied with debug symbols, so all my previous records of problematic places in chrome.dll do not make sense, thank you very much for suppying debug symbols for different DLL build)

@toxanbi , "thank you very much" -- it was sarcasm I think? :)

Read this: #338 . This is why there are 2 separate PDB-files now with 2 separate Crome DLL's as result. Because smaller can't be created from larger now.

The default Crome.dll matches https://github.com/win32ss/supermium/releases/download/v122-hf/supermium_122_r2_32_symbols.7z

And "strippeddown" PDB is in different archive with different PDB (smaller), and different Chrome.DLL https://github.com/win32ss/supermium/releases/download/v122-hf/chrome_32_stripped_pdb.7z And it can be read by XP native tools like WinDbg/ProcessExplorer.

toxanbi commented 3 months ago

Read this: https://github.com/win32ss/supermium/issues/338 .

Well, you have two PDBs: one is full, another is light-weight. But still didn't get, why you need two different DLLs? They are produced using different linkers? Okay, but why to produce both of them?

A problem is that the symbols for chrome.dll are nearly 2 GB in size (smaller symbols lacked necessary information). Maybe they could be readable on XP with /3GB.

Not aware of Chromium build process at all. But is it still possible to generate MAP file (plaintext)? I have an experience of reading and generating COFF-debug files (.DBG) without use of imagehlp.dll/dbghelp.dll. If you can produce an old-school MAP-files, then, I believe, a small utility that converts .MAP into a .DBG could be written. COFF debug files are pretty compact.

After all, is there a real technical reason to have all the guts inside single chrome.dll? Yet again, I am not aware of Chromium codebase. I just read few of source files (out of hundreeds or thousands?) and watched through the list of internal names in the debugger. But I can see that there are subsystems organized as C++ namespaces: like base, media, webrtc, views, v8, ui, printing, network, gpu, blink.

Can we put all these subsystems into separate DLLs? I believe cohesion level should allow to do this? The only problem I see is that everthing that is template-instantiated and was probably folded by COMDAT-folding technique would now be duplicated into each DLL. But, on the other hand, we will have say 10 or 20 relatively small DLLs having PDBs of reasonable size.

IDA-RE-things commented 3 months ago

Well, you have two PDBs: one is full, another is light-weight. But still didn't get, why you need two different DLLs? They are produced using different linkers? Okay, but why to produce both of them?

There are such MS-recommended methods of creation of "light-weight" PDB:

BUT: the pdbcopy can't do this with PDB's from Supermium v122 (while can do for v121).

Another bad thing: We dont use MSVC linker here, because Chromium switched at some time to Clang/LLD (compiler/linker), and LLD has no /pdbstripped switch.

And now it is called as "debug information level: 0, 1". Which as I understud from win32ss as we discussed in #338, its passed to Clang compiler, and to LLD linker. So as result, for different PDB's we need only full rebuild/recompiling/relinking of the code. Resulting in different DLL's

If you can produce an old-school MAP-files

I don't familiar with LLD linker, with which the new Chromium builds created, if it can produce map-files. This will be good if it can. If no, I think that it is possible to create MAP file from existing 1.8 Gig PDB. If existing tools can do this (and not throw with error :)) .

Can we put all these subsystems into separate DLLs?

I have same point of view. That components must be placed in separate DLL's, as goog practice in software development. Starting from Microsoft itself.

At least for debugging/testing releases.

Yes, the one drawback is that we can't do "Whole program link-time code-generation and optimization". :power: :)) But I dont think that this kind of optimization really need on this stage of developmement. While the bugs lives here..

As I remember, very old Chromium build system (scripts etc) allow that separate DLL builds.

I dont know what is now here, may be no one cares about this on they "x64 Win10" with clang/lld instead of old good MSVC compiler/linker

win32ss commented 3 months ago

In the end, it appears that Supermium will have to be built with is_component_build switch now, which would create a large quantity of DLLs. But it still creates chrome.dll with the same size as before, so I'll have to figure out how to cut chrome.dll out completely as that would allow for some big time savings in linking, and also to ensure that there are no remaining dependencies on chrome.dll in the browser itself.

But ultimately I think that a bug in ffmpeg is related to synchronization issues. I have updated some synchronization APIs since 122 was originally released.

progwrp_32.zip

toxanbi commented 3 months ago

Here is exception info and callstack bactrace extracted by WinDbg from MDMP file for that particular case discussed in this issue:

0:013> .ecxr
eax=0000013f ebx=0000000d ecx=00000000 edx=0000000d esi=00000000 edi=1f70e804
eip=178f4b8a esp=057fea54 ebp=057ff8f8 iopl=0         nv up ei pl nz na po nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00210202
chrome!decode_frame+0xa6a:
178f4b8a 0fbf1c91        movsx   ebx,word ptr [ecx+edx*4] ds:0023:00000034=????
0:013> k
  *** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr  
057ff8f8 124f8467 chrome!decode_frame+0xa6a [o:\third_party\ffmpeg\libavcodec\mpegaudiodec_template.c @ 1602]
057ff960 124f7477 chrome!decode_receive_frame_internal+0x3a7 [o:\third_party\ffmpeg\libavcodec\decode.c @ 637]
057ff970 12e57945 chrome!avcodec_send_packet+0x87 [o:\third_party\ffmpeg\libavcodec\decode.c @ 735]
057ff9dc 12e54d1b chrome!try_decode_frame+0x115 [o:\third_party\ffmpeg\libavformat\demux.c @ 2077]
057ffb58 17cec041 chrome!avformat_find_stream_info+0x44b [o:\third_party\ffmpeg\libavformat\demux.c @ 2760]
057ffb70 13050abe chrome!OT::BaseScript::get_base_coord+0x31 [o:\third_party\harfbuzz-ng\src\src\hb-ot-layout-base-table.hh @ 298]
057ffb88 130507b0 chrome!base::internal::ReturnAsParamAdapter<int>+0x2e [o:\base\task\post_task_and_reply_with_result_internal.h @ 23]
057ffba8 121a7d7a chrome!base::internal::Invoker<base::internal::BindState<void (*)(base::OnceCallback<int ()>, std::__Cr::unique_ptr<int,std::__Cr::default_delete<int> > *),base::OnceCallback<int ()>,base::internal::UnretainedWrapper<std::__Cr::unique_ptr<int,std::__Cr::default_delete<int> >,base::unretained_traits::MayNotDangle,0> >,void ()>::RunOnce+0x40 [o:\base\functional\bind_internal.h @ 904]
057ffc10 17045c89 chrome!base::internal::PostTaskAndReplyRelay::RunTaskAndPostReply+0x3a [o:\base\threading\post_task_and_reply_impl.h @ 45]
057ffc44 17045be3 chrome!blink::Element::PseudoStateChanged+0x49 [o:\third_party\blink\renderer\core\dom\element.cc @ 4948]
057ffc54 1740fb0b chrome!blink::HTMLAnchorElement::ParseAttribute+0x893 [o:\third_party\blink\renderer\core\html\html_anchor_element.cc @ 294]
057ffccc 1740e736 chrome!base::internal::TaskTracker::RunAndPopNextTask+0x12db [o:\base\task\thread_pool\task_tracker.cc @ 416]
057ffefc 12cb825d chrome!GrColorInfo::GrColorInfo+0x86 [o:\third_party\skia\src\gpu\ganesh\GrColorInfo.cpp @ 22]
057fff80 12cb7c85 chrome!base::internal::WorkerThread::RunWorker+0x4ed [o:\base\task\thread_pool\worker_thread.cc @ 430]
057fff90 1287e038 chrome!base::internal::WorkerThread::RunPooledWorker+0x15 [o:\base\task\thread_pool\worker_thread.cc @ 315]
057fffb4 7c80b729 chrome!base::`anonymous namespace'::ThreadFunc+0xe8 [o:\base\threading\platform_thread_win.cc @ 140]
057fffec 00000000 kernel32!BaseThreadStart+0x37
IDA-RE-things commented 3 months ago

If its related to video decoding process, as we can see. Then you must determine the video URL with such problem, and put it here to reproduce on other machines.

Also it will be nice to attach that .dmp file.

toxanbi commented 3 months ago

If its related to video decoding process, as we can see. Then you must determine the video URL with such problem, and put it here to reproduce on other machines.

As far as I can remember, the page caused browser to crash wasn't a page that is intended to show any video (like YouTube or Instagram page). It was a page with mainly textual information and some images. Of course, a video could be embedded somewhere within the page including an option with hidden player.

Also, pay attention that ffmpeg-stuff is called from within HarfBuzz library: which main goal is to work with Unicode bidirectional texts, fonts, glyphs and so on. I have downloaded HarfBuzz sources and, again, I can't find any mention of avformat_find_stream_info() call in it. As well as neither "ffmpeg" nor "avconv".

So, probably call stack reconstruction went wrong. Or there is some snake bug which corrupts browser's memory so that HarfBuzz-->FFMPEG call really happens, because, for example, function pointer or an offset as part of CALL instruction is being overwritten (even though it sounds a bit unrealistic).

Also it will be nice to attach that .dmp file.

It doesn't seems to be good idea for me, since DMP-file can contain sensitive information (like passwords, hashes, session keys or whatever).

IDA-RE-things commented 3 months ago

Yes, you are right. Sometime the Callstack unrolling detects false items with it. Because its optimized code and no EBP-based frames used. It may be garbage on the stack, stays here from previous function calls. So this one function from HarfBuzz may be garbage.

But other functions tell us that this is trying to decode video frame. You must examine page source, logs of downloaded url's and so on to find it.

win32ss commented 3 months ago

In the end, it appears that Supermium will have to be built with is_component_build switch now, which would create a large quantity of DLLs. But it still creates chrome.dll with the same size as before, so I'll have to figure out how to cut chrome.dll out completely as that would allow for some big time savings in linking, and also to ensure that there are no remaining dependencies on chrome.dll in the browser itself.

Supermium 122.0.6261.152 x86 is now being built with is_component_build.

IDA-RE-things commented 3 months ago

Supermium 122.0.6261.152 x86 is now being built with is_component_build.

Good news!

toxanbi commented 3 months ago

Supermium 122.0.6261.152 x86 is now being built with is_component_build.

Where I can download it? Nothing new under "Releases" section.

Meanwhile, I have just got the crash discussed within this issue again. The instruction caused access violation and the backtrace is essentially the same (like FFMPEG is called from Harfbuzz).

Reloading the page does not reproduce the bug, so I was untable to examine callstack under the debugger and verify if it really has Harfbuzz--->FFMPEG call chain. However, the page that initiated a crash this time is exactly the same as it was in previous crash.

I presume the page contains some variable pieces (like ads which are different each time), and it is a matter of luck if this variable part of the page will or will not have some data that causes a crash.

IDA-RE-things commented 3 months ago

Supermium 122.0.6261.152 x86 is now being built with is_component_build.

Where I can download it? Nothing new under "Releases" section.

I think it would be good idea to place such package as "temporary internal debug preview only", without Setup. But with all PDBs inside. On filehosting sites like mega.co or google.drive. Or placed here in Releases on you choice.

So we can debug of crashes, enabling "--single-process", or do profiling in more simple way. Because it will be devided on modules with relative small size.

Also it would be good to disable optimization, related to "standard EBP-based stack frame" removing, in such release. (its nice for crash debugging and callstacks. But bad for profiling of couse. Because of additional overheads. But will see)

Or another way -- to use part of components in full-optimized variants, and parts with issues - in debug build.

The nice choice will be to do as Microsoft do it: FRE optimized DLLs, and CHK debug builds of same dll's.

win32ss commented 3 months ago

chrome.dll was reduced to 85 MB, but now there are 450 DLLs and many delay loads got changed to immediate loads, and a few components stopped using progwrp altogether. It will take some time to fix.

The uncompressed size of the browser increased from about 300 to 800 MB. 3-way split of chrome.dll would be better than the result here, along with restoration of the delay loader and perhaps static linking of MSVC runtime.