lovell / sharp

High performance Node.js image processing, the fastest module to resize JPEG, PNG, WebP, AVIF and TIFF images. Uses the libvips library.
https://sharp.pixelplumbing.com
Apache License 2.0

Is it possible to get (or compile) a .pdb file for sharp.node for a specific sharp version? #3569

Closed TomaterID closed 1 year ago

TomaterID commented 1 year ago

Question about an existing feature

What are you trying to achieve?

I am currently using sharp in my photo management application Tonfotos. The library does a great job, thank you guys!

However, it crashes from time to time on the customer side. I am collecting crash dumps, but unfortunately I cannot get much useful information from them, as I don't have the .pdb file corresponding to the binary that is shipped with the npm module. The best stack trace I can get looks like this:

(... skipped windows internals...)
KERNELBASE!RaiseException+0x69
sharp_win32_x64+0x294cc
sharp_win32_x64+0xbac6
sharp_win32_x64+0x421e2
sharp_win32_x64+0x2d680
sharp_win32_x64+0x2c5dd
(... more skipped ....)

Obviously something happened inside sharp, but we will never know what exactly and where until we have the corresponding .pdb file. I am using sharp 0.29.3, and I would love to update if I were able to check that the bug was indeed fixed in a later version. But this amount of information is far from enough to find similar issues here. Having the name of the failing function would be a great help: I would be able to look through the code change history and make an educated decision about which version to choose.

The last resort would be to upgrade to the latest version and pray, but that would be a lottery. It would not guarantee that this issue goes away, and it carries a non-zero probability of introducing new issues, too. (No offense, it is just the way life is.)

When you searched for similar issues, what did you find that might be related?

Unfortunately, I was not able to find relevant discussions. Everything with the 'pdb' keyword appears to be an installation issue.

Please provide a minimal, standalone code sample, without other dependencies, that demonstrates this question

Hmm... not sure this is relevant to my question.

Please provide sample image(s) that help explain this question

I have no idea what causes the crash or whether it is connected to any particular images. It happens on the customer side, and all I get is a crash dump from crashpad.

So I would be grateful for any help:

1) Instructions on how to build sharp.pdb for my version of sharp
2) Or maybe there is already a prebuilt one somewhere
3) Or maybe you know for sure this bug was already fixed and can point me to the right version to upgrade to
4) Or any other suggestions.

Thank you very much!

lovell commented 1 year ago

Rather than the sharp binary, I suspect this will be coming from one of libvips' dependencies. This means you'll need to build your own debug binaries for the specific version of libvips you're using with https://github.com/libvips/build-win64-mxe and its --with-debug option.

I'm 80% certain this will relate to font discovery/rendering. The next v0.32.0 release of sharp will include prebuilt binaries that include the latest cairo, pango and fontconfig, all of which have seen a lot of font-related fixes and improvements in the last year, so waiting might be easier.

TomaterID commented 1 year ago

Thank you for your reply! My next question was going to be about libvips, since I have crashes in that one too, so I will now try building it myself and saving the PDB file for future use. However, my original question was indeed about sharp, as it does crash. Any ideas on how to debug this?

lovell commented 1 year ago

Please can you provide a full stack trace of a crash that you believe is caused directly by sharp (as opposed to a crash from one of libvips' dependencies where a call to sharp happens to appear in the call stack)?

TomaterID commented 1 year ago

Sure. This archive includes a sample .dmp file as well as the stack trace decoded by cdb.exe: sharp_win32_x64+0x294cc.zip

lovell commented 1 year ago

Thanks, the salient part appears to be:

0000007a`4a7f9f80 00007ff7`1c4f0d37     : 00000000`00000002 00007ff8`00000000 0000007a`c00000bb 00000000`00000000 : KERNELBASE!SleepEx+0x9e
0000007a`4a7fa020 00007ff8`d83a0207     : 00000000`00000000 00007ff7`1c4f0c70 00000000`00000000 aaaa0064`6c6f6873 : tonfotos!crashpad::`anonymous namespace'::UnhandledExceptionHandler+0xc7
0000007a`4a7fa1a0 00007ff8`da655530     : 0000007a`4a7fa3e0 00007ff8`da6f7658 00000000`00000000 0000007a`4a7fa378 : KERNELBASE!UnhandledExceptionFilter+0x1e7
0000007a`4a7fa2c0 00007ff8`da63c876     : 00007ff8`da724a24 00007ff8`da5b0000 0000007a`4a7fa3e0 00007ff8`da5e0e7b : ntdll!memset+0x13b0
0000007a`4a7fa300 00007ff8`da65241f     : 00000000`00000000 0000007a`4a7fa8e0 0000007a`4a7fb340 00000000`00000000 : ntdll!_C_specific_handler+0x96
0000007a`4a7fa370 00007ff8`da6014a4     : 00000000`00000000 0000007a`4a7fa8e0 0000007a`4a7fb340 00000000`00000001 : ntdll!_chkstk+0x11f
0000007a`4a7fa3a0 00007ff8`da6011f5     : 00000000`00000000 0000007a`4a7fb1e0 00000000`00000000 0000007a`4a7faaf0 : ntdll!RtlRaiseException+0x434
0000007a`4a7faab0 00007ff8`d82bcd29     : 0000007a`4a7fb388 00007ff8`b33e7020 0000007a`4a7fb4a0 0000007a`4a7fb4b8 : ntdll!RtlRaiseException+0x185
*** WARNING: Unable to verify checksum for sharp-win32-x64.node
0000007a`4a7fb320 00007ff8`b33b94cc     : 00006832`00e3df80 00000000`00000000 0000007a`4a7fc370 00007ff8`b3392154 : KERNELBASE!RaiseException+0x69
0000007a`4a7fb400 00007ff8`b339bac6     : 0000007a`4a7fd960 0000007a`4a7fd7b0 0000007a`4a7fd960 0000007a`4a7fd7b0 : sharp_win32_x64+0x294cc
0000007a`4a7fb460 00007ff8`b33d21e2     : 0000007a`4a7fc370 00007ff8`b33ba0a5 0000007a`4a7fb658 0000007a`4a7fd7b0 : sharp_win32_x64+0xbac6
0000007a`4a7fb560 00007ff8`b33bd680     : 00007ff8`b33d21cc 0000007a`4a7fda60 0000007a`4a7fda60 aaaaaaaa`aaaaaaaa : sharp_win32_x64+0x421e2
0000007a`4a7fb5a0 00007ff8`b33bc5dd     : 00007ff8`b33d21cc 0000007a`4a7fc4d8 0000034f`00000100 00007ff7`1a5a311c : sharp_win32_x64+0x2d680
0000007a`4a7fb5d0 00007ff8`da6517a6     : 00000000`00000000 00000000`00000002 0912df01`00000000 0000007a`4a7fd7b0 : sharp_win32_x64+0x2c5dd
0000007a`4a7fb6b0 00007ff8`b33a01eb     : aaaaaaaa`aaaaaaaa 0000024d`85289300 00006832`00e3df80 0000034f`cfc6a7d1 : ntdll!RtlCaptureContext2+0x4a6 (TrapFrame @ 0000007a`4a7fba38)
0000007a`4a7fda60 00007ff8`b339cb94     : 00006832`0061fff0 00007ff7`1a541d49 0000034f`cfc6a721 00006832`01a6bd28 : sharp_win32_x64+0x101eb
0000007a`4a7fdb10 00007ff7`194955bc     : aaaaaaaa`aaaaaaaa aaaaaaaa`aaaaaaaa aaaaaaaa`aaaaaaaa aaaaaaaa`aaaaaaaa : sharp_win32_x64+0xcb94
(Inline Function) --------`--------     : --------`-------- --------`-------- --------`-------- --------`-------- : tonfotos!`anonymous namespace'::uvimpl::Work::AfterThreadPoolWork::<lambda_1>::operator()+0x3e (Inline Function @ 00007ff7`194955bc)
(Inline Function) --------`--------     : --------`-------- --------`-------- --------`-------- --------`-------- : tonfotos!napi_env__::CallIntoModule+0x4c (Inline Function @ 00007ff7`194955bc)
0000007a`4a7fdbb0 00007ff7`195d0388     : aaaaaaaa`aaaaaaaa aaaaaaaa`aaaaaaaa 0000007a`4a7fdcb0 00007ff7`1f5a02e0 : tonfotos!`anonymous namespace'::uvimpl::Work::AfterThreadPoolWork+0xdc
0000007a`4a7fdc90 00007ff7`18fabcbc     : aaaaaaaa`aaaaaaaa aaaaaaaa`aaaaaaaa aaaaaaaa`aaaaaaaa aaaaaaaa`aaaaaaaa : tonfotos!uv__work_done+0xc8
(Inline Function) --------`--------     : --------`-------- --------`-------- --------`-------- --------`-------- : tonfotos!uv_process_reqs+0x16f (Inline Function @ 00007ff7`18fabcbc)
0000007a`4a7fdcf0 00007ff7`18f7872b     : 00000000`00000000 00006832`002d1080 00006832`0060e900 00006832`00708000 : tonfotos!uv_run+0x1ec
0000007a`4a7fed90 00007ff7`18f78b22     : 00000000`000007bc 00007ff8`00000000 00000000`00000000 00007ff7`1aa63023 : tonfotos!node::Environment::CleanupHandles+0x14b
0000007a`4a7fee00 00007ff7`18f4502c     : 00007ff7`1f4f6260 00007ff7`1f4f6260 00006832`00708380 00007ff7`18f785c5 : tonfotos!node::Environment::RunCleanup+0xc2
0000007a`4a7ff080 00007ff7`1798c4f4     : aaaaaaaa`aaaaaaaa aaaaaaaa`aaaaaaaa 00007ff7`1f4f6260 aaaaaaaa`aaaaaaaa : tonfotos!node::FreeEnvironment+0x6c
0000007a`4a7ff0e0 00007ff7`1797d723     : aaaaaaaa`aaaaaaaa aaaaaaaa`aaaaaaaa 00000000`00000030 0000007a`4a7ff248 : tonfotos!electron::NodeEnvironment::~NodeEnvironment+0x14
(Inline Function) --------`--------     : --------`-------- --------`-------- --------`-------- --------`-------- : tonfotos!std::__1::default_delete<electron::NodeEnvironment>::operator()+0x8 (Inline Function @ 00007ff7`1797d723)
(Inline Function) --------`--------     : --------`-------- --------`-------- --------`-------- --------`-------- : tonfotos!std::__1::unique_ptr<electron::NodeEnvironment,std::__1::default_delete<electron::NodeEnvironment> >::reset+0x19 (Inline Function @ 00007ff7`1797d723)
0000007a`4a7ff110 00007ff7`1846ca9a     : 0000007a`4a7ff270 0000007a`4a7ff538 00000000`00000000 0000007a`4a7ff268 : tonfotos!electron::ElectronBrowserMainParts::PostMainMessageLoopRun+0xc3
0000007a`4a7ff1c0 00007ff7`1846e2ed     : 00006832`002bc8a0 0000007a`4a7ff538 00000000`00000000 aaaaaaaa`aaaaaaaa : tonfotos!content::BrowserMainLoop::ShutdownThreadsAndCleanUp+0x1ca
0000007a`4a7ff2d0 00007ff7`18469bb8     : 0000007a`4a7ff4a0 00007ff7`1aed54ee 00006832`00291970 00006832`002408c0 : tonfotos!content::BrowserMainRunnerImpl::Shutdown+0xad

This looks like it might be an attempt to shut down libuv whilst threads from its worker pool are still processing data. If you've not seen it, Electron provides an API to handle graceful shutdowns: https://www.electronjs.org/docs/latest/api/app#event-before-quit

The stack dump mentions Electron v14.2.6, which is EOL and uses an out-dated Node.js, so I'd recommend upgrading that bit first.

TomaterID commented 1 year ago

Sorry, I am not sure I understand. Yes, there is a 'before-quit' event in Electron and I use it. Is there something I should also do with the sharp library in that handler in order to avoid such crashes?

Updating Electron is a huge effort, as it involves lots of breaking changes, and the current version works just fine. In any case, it does not look like Electron is what is causing the crashes, so I don't think this is the priority right now.

TomaterID commented 1 year ago

By the way, I provided you with just one crash dump, but this happens regularly for different users. The stack is always the same: it always contains the sharp_win32_x64+0x294cc frame.
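The observation that every dump shares the same frame can be checked mechanically when sorting many reports. The following is a minimal sketch, assuming each report is available as the text output of a debugger such as cdb.exe. The helpers `topFrameInModule` and `groupBySignature` and the `{ id, stack }` report shape are hypothetical names for illustration, not part of sharp or crashpad.

```javascript
'use strict';

// Return the first frame that belongs to the given module, written either as
// "module+0xoffset" or "module!symbol+0xoffset", or null when the module does
// not appear in the stack text. Symbols containing spaces are not handled.
function topFrameInModule(stackText, moduleName) {
  const escaped = moduleName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const frame = new RegExp(`${escaped}\\S*?\\+0x[0-9a-f]+`, 'i');
  const match = stackText.match(frame);
  return match ? match[0] : null;
}

// Bucket report ids by the top frame of the module, so that identical
// crash signatures end up in the same group.
function groupBySignature(reports, moduleName) {
  const groups = new Map();
  for (const { id, stack } of reports) {
    const signature = topFrameInModule(stack, moduleName) || '(module not on stack)';
    if (!groups.has(signature)) groups.set(signature, []);
    groups.get(signature).push(id);
  }
  return groups;
}
```

Grouping on the raw offset only makes sense across dumps from the same binary build, since offsets shift whenever the module is rebuilt.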

lovell commented 1 year ago

Did you see sharp.counters() and the sharp.queue event emitter? These will allow you to check for any in-flight processing. You should prevent shutdown until these are all complete, i.e. counters are all zero.

https://sharp.pixelplumbing.com/api-utility#counters https://sharp.pixelplumbing.com/api-utility#queue
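The suggestion above can be sketched as a small polling helper. This is only an illustration, assuming `sharp.counters()` returns an object with numeric `queue` and `process` fields as described in the linked documentation. The names `isIdle` and `waitForIdle`, the polling interval and the timeout are hypothetical choices, and the helper takes the sharp module as a parameter so it can be exercised without Electron.

```javascript
'use strict';

// True when a counters object of the shape returned by sharp.counters()
// ({ queue, process }) reports no waiting and no running tasks.
function isIdle(counters) {
  return counters.queue === 0 && counters.process === 0;
}

// Poll until the sharp-like object is idle, or give up after timeoutMs.
// Resolves to true when idle was reached and to false on timeout.
function waitForIdle(sharpLike, { pollMs = 50, timeoutMs = 5000 } = {}) {
  return new Promise((resolve) => {
    const deadline = Date.now() + timeoutMs;
    const check = () => {
      if (isIdle(sharpLike.counters())) return resolve(true);
      if (Date.now() >= deadline) return resolve(false);
      setTimeout(check, pollMs);
    };
    check();
  });
}
```

One possible wiring in an Electron main process is to call `event.preventDefault()` in the `before-quit` handler while `isIdle(sharp.counters())` is false, wait for `waitForIdle(sharp)` to resolve, and then call `app.quit()` again, with a flag so that a timeout does not block quitting forever. The `change` event on `sharp.queue` could be used instead of polling for the queue length, but it does not report tasks that are already being processed.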

lovell commented 1 year ago

Upgrading Electron to a version with Node.js 16 would bring in https://github.com/nodejs/node/pull/35021 that might fix this.

kleisauke commented 1 year ago

Building a PDB locally for sharp v0.29.3 using this patch:

--- a/node_modules/sharp/binding.gyp
+++ b/node_modules/sharp/binding.gyp
@@ -205,7 +205,8 @@
               'VCCLCompilerTool': {
                 'ExceptionHandling': 1,
                 'Optimization': 1,
-                'WholeProgramOptimization': 'true'
+                'WholeProgramOptimization': 'true',
+                'DebugInformationFormat': 3 # Generate a PDB
               },
               'VCLibrarianTool': {
                 'AdditionalOptions': [

Produces this stack trace in WinDbg (the .symopt +0x40 option was used to load mismatched PDBs):

00 00007ff8`b33b94cc     : 00006832`00e3df80 00000000`00000000 0000007a`4a7fc370 00007ff8`b3392154 : KERNELBASE!RaiseException+0x69
01 00006832`00e3df80     : 00000000`00000000 0000007a`4a7fc370 00007ff8`b3392154 00007ff8`b3390000 : sharp_win32_x64!_invalid_parameter+0x9c [minkernel\crts\ucrt\src\appcrt\misc\invalid_parameter.cpp @ 112] 
02 00000000`00000000     : 0000007a`4a7fc370 00007ff8`b3392154 00007ff8`b3390000 00000000`19930520 : 0x00006832`00e3df80

So, it looks like you ran into #2999, which was fixed in sharp v0.30.0.

kleisauke commented 1 year ago

So, it looks like you ran into #2999, which was fixed in sharp v0.30.0.

Actually, the stack trace of that issue was:

Details
```
ucrtbase!invoke_watson+0x18
ucrtbase!_invalid_parameter_internal+0x39260
ucrtbase!invalid_parameter+0x2c
ucrtbase!invalid_parameter_noinfo+0x9
ucrtbase!_get_osfhandle+0x3bf1f
libvips_42!vips__open+0x51
libvips_42!vips_tracked_open+0xa
libvips_42!vips_target_write_amp+0x655
libvips_42!vips_object_build+0x19
libvips_42!vips_target_new_to_file+0xae
libvips_42!vips_jpegsave_mime+0x996
libvips_42!vips_object_build+0x19
libvips_42!vips_cache_operation_buildp+0x43
libvips_cpp!vips::VImage::call_option_string+0x80 [node_modules\sharp\src\libvips\cplusplus\VImage.cpp @ 536]
libvips_cpp!vips::VImage::call+0x11 [node_modules\sharp\src\libvips\cplusplus\VImage.cpp @ 562]
libvips_cpp!vips::VImage::jpegsave+0xea [node_modules\sharp\src\libvips\cplusplus\vips-operators.cpp @ 1777]
sharp_win32_x64!PipelineWorker::Execute+0x5cb9 [node_modules\sharp\src\pipeline.cc @ 975]
sharp_win32_x64!Napi::AsyncWorker::OnExecute+0x1e [node_modules\node-addon-api\napi-inl.h @ 4890]
node!v8::base::SharedMutex::SharedMutex+0x8e8
node!uv_queue_work+0x28e
node!uv_poll_stop+0xed
node!inflateValidate+0x24c30
KERNEL32!BaseThreadInitThunk+0x1d
ntdll!RtlUserThreadStart+0x28
```

So, it's probably not related to that.

TomaterID commented 1 year ago

@lovell thank you for the clarification. Actually, I have lots of other native code running in the background of my application, and I don't need to do any magic in 'before-quit' to wait for it to finish, as Electron (the Node.js core) simply waits for all running tasks to finish before killing the process. I run pretty heavy computations for face recognition, and Node.js waits for them to finish without issue. Therefore my guess is that sharp should behave the same way, unless there is a bug in sharp. My point is that, rather than building strange workarounds, it would make sense to figure out what exactly is wrong and maybe fix it.

As I can see from the discussion (thank you guys!), there is no clarity on what exactly this bug is. Therefore my plan should probably remain as it was:

Please let me know what you think. In the meantime I will try to figure out how to compile sharp.node when it is installed as an npm library.

lovell commented 1 year ago

Were you able to add a call to sharp.counters() in the before-quit event handler to verify the queue length? If so, have you been able to confirm that this is always zero when Electron shuts down all threads?

TomaterID commented 1 year ago

I don't believe this will give us anything regarding this issue for multiple reasons:

lovell commented 1 year ago

Commit https://github.com/lovell/sharp/commit/4ec883eaa0938b638c7c536ed81391b3e5d50238 addresses what could be the source of a possible race condition at shutdown, especially when using Node.js 14, as it sometimes becomes no longer possible to call back into JS from C++. This will be in v0.32.0.

Node.js 16 improves the situation too, via commits https://github.com/nodejs/node/commit/e326c41fbc3ac6533630704bf438db860731dcc6 and https://github.com/nodejs/node/commit/602fb3bedc50ede2054838c22371debc38a3f515

TomaterID commented 1 year ago

Quick update from my side. I managed to modify my build system so that I now have a matching .node file in the production build and the .pdb file in my archive for future crash dump analysis. That was tricky, since npm ci was overwriting the locally built sharp.node with the downloaded prebuilt binary every time. But I managed to work around that with a few scripts.
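The scripts themselves are not shown in this thread. As one possible way to keep a binary and its symbols paired, the sketch below archives both files under a directory named after the SHA-256 of the .node file. The function name `archiveSymbols`, the directory layout and the file paths are assumptions for illustration only.

```javascript
'use strict';
const crypto = require('crypto');
const fs = require('fs');
const path = require('path');

// Copy a native addon and its PDB into archiveDir/<sha256 of the addon>/ so
// that the symbols for any shipped binary can be looked up later by hash.
// Returns the directory the two files were copied into.
function archiveSymbols(nodeFile, pdbFile, archiveDir) {
  const digest = crypto
    .createHash('sha256')
    .update(fs.readFileSync(nodeFile))
    .digest('hex');
  const target = path.join(archiveDir, digest);
  fs.mkdirSync(target, { recursive: true });
  fs.copyFileSync(nodeFile, path.join(target, path.basename(nodeFile)));
  fs.copyFileSync(pdbFile, path.join(target, path.basename(pdbFile)));
  return target;
}
```

Running something like this as the last build step, after any install step that might replace the binary, means the archived PDB always corresponds to the file that actually ships.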

Long story short, I now have the full stack of the crash. However, it does not really shed more light (at least for me) on the reasons for it, as all of this happens during the closing stage. But we already knew that.

In any case, I am updating my production build to 0.31.3 now, and we will see if that fixes this particular issue. So far in my tests everything seems to work fine and I don't see any incompatibilities. We will see about crashes after I release it.

However, though I can now reproduce the crash stack with symbols for sharp.node, apparently this is not the biggest issue. There are many more crashes inside libvips-42, and that one is much trickier to build, let alone build with a PDB. I spent a few hours figuring out how to build the library and managed to do that using build-win64-mxe, but I can't really go any further and build a PDB, as I have zero knowledge of the build tools you guys are using.

I hope some of the errors will go away with the version update, but it is hard to believe that all of them will, as there is a wide variety of them.

I would like to ask for your help with instructions on how to build libvips with a PDB. I assume it would be beneficial for everyone if I were able to provide you with crash dumps with symbols from actual production use.

However, the ideal solution would be to add PDBs as a target in your build system, so that every time you publish a .zip with prebuilt binaries to GitHub, you also publish a .zip with the matching PDBs. But that is more of a Christmas wish; I would be grateful for any help, such as a hint on how to build the PDBs myself.

TomaterID commented 1 year ago

Status update. I have updated sharp to 0.31.3 (the latest at that moment), and enough time has now passed since my release to analyze the results. Overall, the number of crashes seems to have dropped by a factor of two or so (though this is a guesstimate rather than an accurate calculation).

Nevertheless, the issue that I described at the beginning of this thread is still there, though it is less frequent now. I can provide dump files if you'd like.

Any suggestions?

Also, any ideas on how to make PDB's for libvips?

lovell commented 1 year ago

@TomaterID Are you still seeing this problem with the latest sharp v0.32.1?

TomaterID commented 1 year ago

I still see it on 0.31.3, which was the latest at that moment. Please give me some time to update to 0.32.1.

TomaterID commented 1 year ago

Status update. I did some sorting of crash reports after updating to 0.32.1. I'm afraid the problem is still there:

image

I can send dmp files if you want; there are many of them. I am actually getting overwhelmed with sharp crashes (most of them in libvips, though). As the popularity of my application grows, I am getting less confident that I can continue with sharp being so unreliable. Currently it is responsible for 95%+ of all crashes of my application on the customer side.

I would appreciate any thoughts and suggestions, not only about this particular crash but also about all the other crashes that come from libvips. I can provide you with lots of dmp files for debugging, but without access to the libvips PDBs, or even the opportunity to build them myself, I am not sure those dmp files will be of any use to anyone.

Please help.

lovell commented 1 year ago

Thanks for the updates. Which version of Electron are you using now? Does it include a version of Node.js with commits https://github.com/nodejs/node/commit/e326c41fbc3ac6533630704bf438db860731dcc6 and https://github.com/nodejs/node/commit/602fb3bedc50ede2054838c22371debc38a3f515 ?

TomaterID commented 1 year ago

I am currently using Electron 14.2.6; I am not sure how to check whether a certain commit is included.

lovell commented 1 year ago

https://github.com/electron/electron/blob/v14.2.6/DEPS#L20 says Node.js v14.17.0, which would suggest not, unless it has been patched.

A possible side effect of not having commit https://github.com/nodejs/node/commit/e326c41fbc3ac6533630704bf438db860731dcc6 would look like the crash highlighted in https://github.com/lovell/sharp/issues/3569#issuecomment-1443747186

I highly recommend you upgrade to a version of Electron that uses Node.js 16+.
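The bundled Node.js version can also be read at runtime from `process.versions`. The sketch below is a rough proxy only: it checks the major version and cannot tell whether a particular commit has been backported or patched in. The helper names `nodeMajor` and `hasNodeMajor` are hypothetical.

```javascript
'use strict';

// Parse the major version out of a string such as "14.17.0" or "v16.0.0".
function nodeMajor(version) {
  const match = /^v?(\d+)\./.exec(version);
  if (!match) throw new Error(`Unrecognised version string: ${version}`);
  return Number(match[1]);
}

// True when the Node.js version in a process.versions-like object is at
// least the given major version.
function hasNodeMajor(versions, minimumMajor) {
  return nodeMajor(versions.node) >= minimumMajor;
}

// Report the version of the runtime this script is executed in.
console.log(`Node.js ${process.versions.node}, 16+: ${hasNodeMajor(process.versions, 16)}`);
```

When run inside an Electron main process, `process.versions.electron` is available alongside `process.versions.node`, so both can be logged with each crash report.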

TomaterID commented 1 year ago

@lovell ok, thanks. I will update to Electron 22 (the last one to support Windows 7/8/8.1, which still accounts for a significant portion of users). That will involve lots of testing, but I had to do that anyway, sooner or later.

I hope that will help us get rid of this particular issue at least; it accounts for roughly 25% of crashes, so that is a good start. I will let you know the results after a while.

TomaterID commented 1 year ago

With Electron v22.3.8 and sharp v0.32.1, the error is still there.

If you want, I can provide .dmp files.

image

kleisauke commented 1 year ago

Also, any ideas on how to make PDB's for libvips?

I'll look into producing PDBs by default for release builds, to facilitate post-mortem debugging with Windows Debugger. When building libvips' Windows binaries with the --with-debug option it produces debug info in DWARF format by default, this means you currently need to debug with GDB or LLDB.

Looking at the stack trace above, I'm not sure if this is a problem with libvips (or any of its dependencies), as I'm not seeing any libvips_42! references within the stack trace. AFAIK, this call stack information should also be present without having to recompile the binaries with debug symbols, e.g. the stack trace mentioned in https://github.com/lovell/sharp/issues/3569#issuecomment-1445048722 was done on the released binaries.

Without a reproduction it'll be hard to debug further. Does this crash only happen during shutdown? If so, it may be worth recompiling sharp with the NODE_API_SWALLOW_UNTHROWABLE_EXCEPTIONS definition, see: https://github.com/nodejs/node-addon-api/pull/975.

By default, throwing an exception on a terminating environment (eg. worker threads) will cause a fatal exception, terminating the Node process. This is to provide feedback to the user of the runtime error, as it is impossible to pass the error to JavaScript when the environment is terminating. In order to bypass this behavior such that the Node process will not terminate, define the preprocessor directive NODE_API_SWALLOW_UNTHROWABLE_EXCEPTIONS. https://github.com/nodejs/node-addon-api/blob/main/doc/setup.md

TomaterID commented 1 year ago

I'll look into producing PDBs by default for release builds, to facilitate post-mortem debugging with Windows Debugger.

Thank you very much, @kleisauke! That would help a lot, since unfortunately sharp is not yet stable enough and crashes a lot, and most of it happens inside libvips. Unfortunately, that happens on the end-user side and all I can get is crash dumps. Having PDBs would simplify troubleshooting a lot.

Looking at the stack trace above, I'm not sure if this is a problem with libvips (or any of its dependencies), as I'm not seeing any libvips_42! references within the stack trace. AFAIK, this call stack information should also be present without having to recompile the binaries with debug symbols, e.g. the stack trace mentioned in #3569 (comment) was done on the released binaries.

You are totally right, this one happens inside sharp itself, which is why I started by reporting it first. This is a very frequent one, but it is just one of many, and most of the others are inside libvips. I will be posting separate issues as I process those dumps and triage them (unfortunately, this is a very laborious process). I have already posted one of them: #3677. Please stay tuned for more.

Without a reproduction it'll be hard to debug further. Does this crash only happen during shutdown?

Crash dumps are all we have, unfortunately. I was not able to reproduce any of them myself and have zero idea about the circumstances. I hoped that reading the stack trace would give you more information about the situation in which it happens and a hint about how to reproduce it. @lovell had a theory that this is one of the already fixed bugs, but unfortunately I cannot confirm that: after updating sharp and Electron it is still there.

However, if this one only happens during shutdown, then it may turn out not to be a real problem for users, unlike other crashes that happen during photo archive processing and are pretty annoying, as you have to restart the program many times until your archive gets processed.

Anyway, I am open to your suggestions and ready to provide any support, including crash dumps. I would even love to debug it myself and push a PR, but unfortunately without PDBs there is nothing I can do.

kleisauke commented 1 year ago

Commit https://github.com/libvips/build-win64-mxe/commit/676260ede1251da63a4fbe27f01441a858880183 adds support for this.

The statically-linked 64-bit libvips Windows binaries and corresponding PDB files built from that commit can be found here: https://libvips-packaging.s3.amazonaws.com/vips-dev-w64-web-8.14.2-static.zip https://libvips-packaging.s3.amazonaws.com/vips-pdb-w64-web-8.14.2-static.zip

Hope this helps.

TomaterID commented 1 year ago

@kleisauke thank you so much!

What do I need to do to make sure my production code has corresponding PDB files? Do I wait for the next release, or just replace the binaries in the latest version?

TomaterID commented 1 year ago

I have just submitted another crash report: #3679

If we can fix these three (#3569, #3677 and #3679), my app would crash roughly four times less often, which is a big deal. I am looking forward to more information on how to apply PDBs to the production build so I can get actionable crash dumps. As you know, you can only use a PDB that was built together with the binary that crashed, so I need to release those binaries to the public first; then we will be able to collect dumps that match those PDBs.

kleisauke commented 1 year ago

What do I need to do to make sure my production code has corresponding PDB files?

As a best effort, I rebased the above commit on top of the released v8.14.2 binaries. You can find these PDB files here: https://libvips-packaging.s3.amazonaws.com/vips-pdb-w64-web-8.14.2-static-rebased.zip

However, I'm not sure if the Visual Studio debugger is able to associate these PDBs to the previous released binaries, since the COFF debug directory is not available on those DLLs. AFAIK, WinDbg would load these mismatched PDBs without any issues using the .symopt +0x40 option.

TomaterID commented 1 year ago

You can be sure that VC will not load those PDBs even if you build them from the same sources. For some reason it builds different ones every time, and I guess that VC checks the checksums. For that reason I now build sharp myself for production in order to get matching PDBs.

I had never heard about the .symopt +0x40 option. I will check whether I can make it work with existing dumps in WinDbg and will report here later.

TomaterID commented 1 year ago

However, I'm not sure if the Visual Studio debugger is able to associate these PDBs to the previous released binaries, since the COFF debug directory is not available on those DLLs. AFAIK, WinDbg would load these mismatched PDBs without any issues using the .symopt +0x40 option.

This approach seems to work fine with WinDbg, thank you! I will soon provide more info about crashes.

TomaterID commented 1 year ago

Just in case, here is a fresh stack trace for this particular crash. Using the symopt trick I can now get line numbers:

.  0  Id: 604.24a0 Suspend: 0 Teb: 00000001`2f274000 Unfrozen ""
Child-SP          RetAddr               : Args to Child                                                           : Call Site
00000001`2fbf99f8 00007ffc`5832b44e     : 00000001`2fbf9ab8 00000001`2fbfb950 00000000`00035fef 00000000`00000028 : ntdll!NtDelayExecution+0x14
*** WARNING: Unable to verify checksum for tonfotos.exe
00000001`2fbf9a00 00007ff7`f4a71387     : 00000000`00000002 00007ffc`00000000 00000001`c00000bb 00000000`00000000 : KERNELBASE!SleepEx+0x9e
00000001`2fbf9aa0 00007ffc`5840dd57     : 00000000`00000000 00007ff7`f4a712c0 00000000`00000000 00007ff7`f28b4d90 : tonfotos!crashpad::`anonymous namespace'::UnhandledExceptionHandler+0xc7 [C:\projects\src\third_party\crashpad\crashpad\client\crashpad_client_win.cc @ 186]
00000001`2fbf9c20 00007ffc`5a8f54f0     : 00000001`2fbf9e60 00007ffc`5a997618 00000000`00000000 00000001`2fbf9df8 : KERNELBASE!UnhandledExceptionFilter+0x1e7
00000001`2fbf9d40 00007ffc`5a8dc876     : 00007ffc`5a9c4a24 00007ffc`5a850000 00000001`2fbf9e60 00007ffc`5a880e7b : ntdll!memset+0x13b0
00000001`2fbf9d80 00007ffc`5a8f23df     : 00000000`00000000 00000001`2fbfa360 00000001`2fbfad40 00000000`00000000 : ntdll!_C_specific_handler+0x96
00000001`2fbf9df0 00007ffc`5a8a14a4     : 00000000`00000000 00000001`2fbfa360 00000001`2fbfad40 00000000`00000001 : ntdll!_chkstk+0x11f
00000001`2fbf9e20 00007ffc`5a8a11f5     : 00000000`00000000 00000001`2fbfabe0 00000000`00000000 00000001`2fbfa570 : ntdll!RtlRaiseException+0x434
00000001`2fbfa530 00007ffc`5830cf19     : 00000000`00000000 00007ffc`3a03bbe0 00000001`2fbfaef0 00000001`2fbfaef0 : ntdll!RtlRaiseException+0x185
*** WARNING: Unable to verify checksum for sharp-win32-x64.node
00000001`2fbfad20 00007ffc`3a00d2cc     : 00000001`2fbfd130 00002670`00020cc0 00000001`2fbfd430 64342d33`3139342d : KERNELBASE!RaiseException+0x69
00000001`2fbfae00 00007ffc`39fede33     : 00000001`2fbfd2e0 00002670`0015c008 00000001`2fbfd2e0 00000001`2fbfd130 : sharp_win32_x64!_CxxThrowException+0x90 [d:\a01\_work\12\s\src\vctools\crt\vcruntime\src\eh\throw.cpp @ 75]
00000001`2fbfae60 00007ffc`3a026182     : 00000001`2fbfbd70 00007ffc`3a00dea5 00000001`2fbfb058 00000001`2fbfd130 : sharp_win32_x64!Napi::Error::ThrowAsJavaScriptException+0xe3 [D:\Programming\tonfotos\node_modules\sharp\node_modules\node-addon-api\napi-inl.h @ 3077]
00000001`2fbfaf60 00007ffc`3a011480     : 00007ffc`3a02616c 00000001`2fbfd430 00000001`2fbfd430 0000020c`00e5a4b5 : sharp_win32_x64!`Napi::details::WrapCallback<<lambda_5b9db19950e03d93a469a94c39aaa749> >'::`1'::catch$5+0x16 [D:\Programming\tonfotos\node_modules\sharp\node_modules\node-addon-api\napi-inl.h @ 81]
00000001`2fbfafa0 00007ffc`3a0103dd     : 00007ffc`3a02616c 00000001`2fbfbed8 00000000`00000100 00002670`0015c130 : sharp_win32_x64!_CallSettingFrame_LookupContinuationIndex+0x20 [d:\a01\_work\12\s\src\vctools\crt\vcruntime\src\eh\amd64\handlers.asm @ 98]
00000001`2fbfafd0 00007ffc`5a8f1766     : 00000000`00000000 00002670`00000002 0000020c`00000000 00000001`2fbfd130 : sharp_win32_x64!__FrameHandler4::CxxCallCatchBlock+0x115 [d:\a01\_work\12\s\src\vctools\crt\vcruntime\src\eh\frame.cpp @ 1393]
00000001`2fbfb0b0 00007ffc`39ff291b     : 00000000`00000028 00000154`f7fedd30 00002670`00020cc0 00000001`2fbfd5a0 : ntdll!RtlCaptureContext2+0x4a6 (TrapFrame @ 00000001`2fbfb438)
(Inline Function) --------`--------     : --------`-------- --------`-------- --------`-------- --------`-------- : sharp_win32_x64!Napi::AsyncWorker::OnWorkComplete::__l5::<lambda_5b9db19950e03d93a469a94c39aaa749>::operator()+0xd (Inline Function @ 00007ffc`39ff291b) [D:\Programming\tonfotos\node_modules\sharp\node_modules\node-addon-api\napi-inl.h @ 5195]
00000001`2fbfd430 00007ffc`39feec44     : 00000001`2fbfead0 00007ff7`f2646ef9 aaaaaaaa`00000000 00002670`000dc000 : sharp_win32_x64!Napi::details::WrapCallback<<lambda_5b9db19950e03d93a469a94c39aaa749> >+0x2b [D:\Programming\tonfotos\node_modules\sharp\node_modules\node-addon-api\napi-inl.h @ 79]
00000001`2fbfd4e0 00007ff7`f147edaa     : 00002670`01305658 00000000`00000001 00002670`01305600 00002670`00214000 : sharp_win32_x64!Napi::AsyncWorker::OnWorkComplete+0x44 [D:\Programming\tonfotos\node_modules\sharp\node_modules\node-addon-api\napi-inl.h @ 5193]
(Inline Function) --------`--------     : --------`-------- --------`-------- --------`-------- --------`-------- : tonfotos!`anonymous namespace'::uvimpl::Work::AfterThreadPoolWork::<lambda_1>::operator()+0x40 (Inline Function @ 00007ff7`f147edaa) [C:\projects\src\third_party\electron_node\src\node_api.cc @ 1108]
(Inline Function) --------`--------     : --------`-------- --------`-------- --------`-------- --------`-------- : tonfotos!napi_env__::CallIntoModule+0x56 (Inline Function @ 00007ff7`f147edaa) [C:\projects\src\third_party\electron_node\src\js_native_api_v8.h @ 88]
(Inline Function) --------`--------     : --------`-------- --------`-------- --------`-------- --------`-------- : tonfotos!node_napi_env__::CallbackIntoModule+0x56 (Inline Function @ 00007ff7`f147edaa) [C:\projects\src\third_party\electron_node\src\node_api.cc @ 82]
00000001`2fbfd580 00007ff7`f15baac2     : 00000000`00000000 00000000`00000000 00002670`00214870 00007ff7`f815f4b0 : tonfotos!`anonymous namespace'::uvimpl::Work::AfterThreadPoolWork+0xea [C:\projects\src\third_party\electron_node\src\node_api.cc @ 1107]
00000001`2fbfd670 00007ff7`f0eb7a2d     : 00000000`00000000 00000000`00000000 00007ff7`f815f3e8 00000000`00000000 : tonfotos!uv__work_done+0xc2 [C:\projects\src\third_party\electron_node\deps\uv\src\threadpool.c @ 312]
(Inline Function) --------`--------     : --------`-------- --------`-------- --------`-------- --------`-------- : tonfotos!uv_process_reqs+0x161 (Inline Function @ 00007ff7`f0eb7a2d) [C:\projects\src\third_party\electron_node\deps\uv\src\win\req-inl.h @ 194]
00000001`2fbfd6d0 00007ff7`f0e89c53     : 00002670`0009c7e0 00002670`00167980 00000000`00000000 00002670`00214000 : tonfotos!uv_run+0x1dd [C:\projects\src\third_party\electron_node\deps\uv\src\win\core.c @ 619]
00000001`2fbfe770 00007ff7`f0e8a573     : 00000000`00000000 aaaaaaaa`aaaaaaaa aaaaaaaa`aaaaaaaa 00000000`6ca27b90 : tonfotos!node::Environment::CleanupHandles+0x163 [C:\projects\src\third_party\electron_node\src\env.cc @ 1033]
00000001`2fbfe7e0 00007ff7`f0e49ce2     : aaaaaaaa`aaaaaaaa 00007ff7`f736d890 00000000`00000021 00007ffc`58351568 : tonfotos!node::Environment::RunCleanup+0x223 [C:\projects\src\third_party\electron_node\src\env.cc @ 1087]
00000001`2fbfeab0 00007ff7`ef26f514     : 00000000`00000000 00000001`2fbfebe0 00000001`2fbfec10 00002670`00020000 : tonfotos!node::FreeEnvironment+0xb2 [C:\projects\src\third_party\electron_node\src\api\environment.cc @ 396]
00000001`2fbfeb80 00007ff7`ef25ca1f     : 00000000`00000000 00000000`00000000 00000000`00000040 00aaaaaa`aaaaaaaa : tonfotos!electron::NodeEnvironment::~NodeEnvironment+0x14 [C:\projects\src\electron\shell\browser\javascript_environment.cc @ 310]
(Inline Function) --------`--------     : --------`-------- --------`-------- --------`-------- --------`-------- : tonfotos!std::Cr::default_delete<electron::NodeEnvironment>::operator()+0x8 (Inline Function @ 00007ff7`ef25ca1f) [C:\projects\src\buildtools\third_party\libc++\trunk\include\__memory\unique_ptr.h @ 49]
(Inline Function) --------`--------     : --------`-------- --------`-------- --------`-------- --------`-------- : tonfotos!std::Cr::unique_ptr<electron::NodeEnvironment,std::Cr::default_delete<electron::NodeEnvironment> >::reset+0x19 (Inline Function @ 00007ff7`ef25ca1f) [C:\projects\src\buildtools\third_party\libc++\trunk\include\__memory\unique_ptr.h @ 281]
00000001`2fbfebb0 00007ff7`f0158d84     : 00000001`2fbfed40 00000001`2fbfed20 00000001`2fbfed50 00000001`2fbfed48 : tonfotos!electron::ElectronBrowserMainParts::PostMainMessageLoopRun+0x22f [C:\projects\src\electron\shell\browser\electron_browser_main_parts.cc @ 628]
00000001`2fbfeca0 00007ff7`f015a8ae     : aaaaaaaa`aaaaaaaa 00000000`00000000 00000000`67e9ed47 00000000`67df8e0a : tonfotos!content::BrowserMainLoop::ShutdownThreadsAndCleanUp+0x1c4 [C:\projects\src\content\browser\browser_main_loop.cc @ 1090]
00000001`2fbfedb0 00007ff7`f0155f28     : 00000000`00000010 00000000`003d0900 00000000`00000000 00000000`00000000 : tonfotos!content::BrowserMainRunnerImpl::Shutdown+0x8e [C:\projects\src\content\browser\browser_main_runner_impl.cc @ 191]
00000001`2fbfee50 00007ff7`ef3f51ef     : 0000d26d`7da4b02e 00000000`00000018 00000000`ffffffff 00007ff7`f737b440 : tonfotos!content::BrowserMain+0xd8 [C:\projects\src\content\browser\browser_main.cc @ 35]
00000001`2fbfef10 00007ff7`ef3f68b0     : 00000000`00000001 00007ff7`f0ecb368 aaaaaaaa`aaaaaaaa 00007ff7`f2c9af06 : tonfotos!content::RunBrowserProcessMain+0xdf [C:\projects\src\content\app\content_main_runner_impl.cc @ 716]
00000001`2fbff010 00007ff7`ef3f6258     : 00000000`00000001 00000000`00000008 00007ff7`f751000b 0000d26d`7da4afbe : tonfotos!content::ContentMainRunnerImpl::RunBrowser+0x610 [C:\projects\src\content\app\content_main_runner_impl.cc @ 1257]
00000001`2fbff190 00007ff7`ef3f2670     : 0000fee7`cdb0a857 00007ffc`5a8834f1 00000000`00000000 00007ff7`f751e858 : tonfotos!content::ContentMainRunnerImpl::Run+0x358 [C:\projects\src\content\app\content_main_runner_impl.cc @ 1116]
00000001`2fbff2d0 00007ff7`ef3f27af     : 00000001`2fbff601 00000154`cee84a48 00000154`cee84a50 00007ff7`f80fb8f0 : tonfotos!content::RunContentProcess+0x610 [C:\projects\src\content\app\content_main.cc @ 346]
00000001`2fbff530 00007ff7`ef16d0d1     : 00000000`00000010 00000001`2fbff6c1 00007ff7`eefa0000 00000000`00000000 : tonfotos!content::ContentMain+0x6f [C:\projects\src\content\app\content_main.cc @ 374]
00000001`2fbff5d0 00007ff7`f2fe3e62     : 00000000`00000000 00007ff7`f2fe3ed9 00000000`00000000 00000000`00000000 : tonfotos!wWinMain+0x3c1 [C:\projects\src\electron\shell\app\electron_main_win.cc @ 244]
*** WARNING: Unable to verify checksum for KERNEL32.DLL
(Inline Function) --------`--------     : --------`-------- --------`-------- --------`-------- --------`-------- : tonfotos!invoke_main+0x21 (Inline Function @ 00007ff7`f2fe3e62) [D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 118]
00000001`2fbff7c0 00007ffc`59cc7614     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : tonfotos!__scrt_common_main_seh+0x106 [D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 288]
00000001`2fbff800 00007ffc`5a8a26a1     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : KERNEL32!BaseThreadInitThunk+0x14
00000001`2fbff830 00000000`00000000     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x21
lovell commented 1 year ago

Thanks for the updates. Given you're compiling your own sharp, and if you've not already tried, please can you add NODE_API_SWALLOW_UNTHROWABLE_EXCEPTIONS to the defines here and see if it helps.

https://github.com/lovell/sharp/blob/3340120aeaf10811eeae97f524281ba86d78a453/binding.gyp#L72-L74
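
For reference, a rough sketch of what that change might look like in binding.gyp. The surrounding layout here is an assumption for illustration, and the existing entries at the linked lines are elided rather than reproduced. As far as I understand it, this define tells node-addon-api to skip errors it cannot throw because the environment is already shutting down, which would be consistent with the RunCleanup and OnWorkComplete frames in the trace above.

```python
{
  'targets': [{
    # ... other target settings unchanged (layout assumed for illustration) ...
    'defines': [
      # ... existing defines at the linked lines stay as they are ...
      'NODE_API_SWALLOW_UNTHROWABLE_EXCEPTIONS'
    ],
  }]
}
```

The native module has to be rebuilt after this edit for the define to take effect.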

TomaterID commented 1 year ago

OK, will do. We'll have to wait for results.

lovell commented 1 year ago

I've added NODE_API_SWALLOW_UNTHROWABLE_EXCEPTIONS via commit https://github.com/lovell/sharp/commit/f5845c7e6172c71941ade4ccff5e2be6610029e1 - this will be part of v0.32.2

lovell commented 1 year ago

Closing as this was superseded by #3677