mozilla-services / minidump-stackwalk

Socorro breakpad minidump stackwalker
Mozilla Public License 2.0
21 stars 5 forks source link

symbol_url is wrong if it didn't come from the first in the list #9

Open willkg opened 3 years ago

willkg commented 3 years ago

If you pass in a list of server urls, then FetchSymbolFile will iterate over that list to find the server_url that has the symbol. That works super.

However, we don't capture which server_url was successful.

So when symbol_url is generated for the module, it picks the first server url in the list and often that's right, but there are a bunch of cases where that's wrong. For example, the symbol_url for modules where the symbol file is in the "try" location will always be wrong.

The code for FetchSymbolFile is here:

https://github.com/mozilla-services/minidump-stackwalk/blob/e8f983e6fc2c463072e0967e6cebf082680478b4/minidump-stackwalk/http_symbol_supplier.cc#L295-L332

The code for pulling the symbol_url from stats is here:

https://github.com/mozilla-services/minidump-stackwalk/blob/e8f983e6fc2c463072e0967e6cebf082680478b4/minidump-stackwalk/stackwalker.cc#L695-L697

I didn't find the code for where the symbol_url is set in stats.

willkg commented 3 years ago

This affects the links for modules in the "Modules" tab for crash reports on Crash Stats. If the module's sym file was in anything other than the public bucket (the first in the server url list), the link will be wrong.

willkg commented 3 years ago

Here's where the stats picks up the url:

https://github.com/mozilla-services/minidump-stackwalk/blob/e8f983e6fc2c463072e0967e6cebf082680478b4/minidump-stackwalk/http_symbol_supplier.cc#L126-L135

It's "guessing" by picking the first server url. It's complicated by the fact that we're not storing the url the symbol file came from when putting it in the cache. If the symbol file gets pulled from the cache, we don't know what the original url was and then it guesses and it's wrong some of the time.

I think we need to do this:

  1. Fix the case where the symbol file is downloaded--make sure the correct url is captured in stats. (Pretty sure this case needs to be fixed, but we should test it first.)
  2. Make sure we store the url somewhere alongside the cached file so that we know where the file came from.
  3. When pulling the symbol file from cache, make sure we populate the url with the data we stored with the cached file.
luser commented 3 years ago

Yeah, it's a little more nuanced—the URL is correct if the file isn't cached locally: https://github.com/mozilla-services/minidump-stackwalk/blob/e8f983e6fc2c463072e0967e6cebf082680478b4/minidump-stackwalk/http_symbol_supplier.cc#L322

If the file is cached on disk the code doesn't record the URL with the file so the code has no way of knowing from whence it originated.

It might be interesting for you to benchmark the stackwalker with local disk caching of symbols disabled. With everything running in a cloud service it's possible that fetching the symbols directly from S3/GCS/whatever is fast enough that caching them on local disk isn't worth the hassle.

willkg commented 3 years ago

Oh, that's an interesting idea. I'll look into that.