rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
94.86k stars 12.23k forks source link

No backtrace on windows with current rustc stable #122857

Closed TheBlueMatt closed 2 months ago

TheBlueMatt commented 3 months ago

Currently rustc beta in our CI is giving us backtraces like this with simple cargo tests (with Cargo.toml set for opt-level = 1):

0: <unknown>
   1: <unknown>
   2: <unknown>
   3: <unknown>
   4: <unknown>
   5: <unknown>
   6: <unknown>
   7: <unknown>
   8: <unknown>
   9: <unknown>
  10: <unknown>
  11: <unknown>
  12: <unknown>
  13: <unknown>
  14: BaseThreadInitThunk
  15: RtlUserThreadStart
ChrisDenton commented 3 months ago

Hm, I wonder if that's due to a change in debug information being generated. Does it work if you use debuginfo="line-tables-only"? EDIT: Oh the Cargo.toml would be debug = "line-tables-only"

TheBlueMatt commented 3 months ago

I tried RUSTFLAGS="-C debuginfo=line-tables-only" cargo test --features backtrace and it seems to still fail with the same backtrace.

workingjubilee commented 3 months ago

@TheBlueMatt What about nightly?

ChrisDenton commented 3 months ago

The reasons I can think of for a lack of symbols is:

The last case I think is the most unlikely because that should show up in our CI. The other cases are more likely caused by either a rustc or CI issue, in which case it should be reported on the rust-lang/rust repository. A possibly relevant PR I see is https://github.com/rust-lang/rust/pull/121297

workingjubilee commented 3 months ago

Oh yeah, the beta is a really recent nightly, right.

workingjubilee commented 3 months ago

Confirmed in rust-lang/backtrace-rs#594 that our CI for Windows is now broken, likely due to rustc changes. Transferring this to rust-lang/rust accordingly.

workingjubilee commented 3 months ago

cc @michaelwoerister sorry but it seems the combination of everything actually merging together did in fact break backtraces!

workingjubilee commented 3 months ago

For note it has not been confirmed in a beyond-a-shadow-of-a-doubt way that https://github.com/rust-lang/rust/pull/121297 is the guilty commit.

workingjubilee commented 3 months ago

hmm. the remarks in https://github.com/rust-lang/backtrace-rs/pull/594#issuecomment-2014406300 removes my confidence this is a stable->beta regression and not a stable->stable regression? could it have been https://github.com/rust-lang/cargo/pull/13257 instead? cc @Kobzol

people were relying on these backtraces working. but should they have not been? and how do PDBs work anyways?

ChrisDenton commented 3 months ago

Also cc @wesleywiser, as I'm wondering what the semantics of strip="debuginfo" should be. Currently it's treated the same as strip="symbols".

apiraino commented 3 months ago

WG-prioritization assigning priority (Zulip discussion).

@rustbot label -I-prioritize +P-high

Kobzol commented 3 months ago

This indeed seems to by caused by my Cargo PR to remove debuginfo in release. Do I understand it correctly that on Windows, strip = "debuginfo" also strips symbols? 😨

ChrisDenton commented 3 months ago

As it stands, yes. Well at least for msvc. It should also be noted that msvc debug information is always stored in a separate (.pdb) file so there's much less reason to strip debug info than there might otherwise be.

Kobzol commented 3 months ago

I would argue that strip = debuginfo shouldn't strip anything from the binary in that case, not everything.

michaelwoerister commented 3 months ago

To confirm, https://github.com/rust-lang/rust/pull/121297 as well as the backtrace-rs changes are in beta, so they could be the culprit. As far as I know, there have been a number of Windows-related changes in the new release (https://github.com/rust-lang/backtrace-rs/commits/0.3.70/).

-C strip=debuginfo on Windows currently does prevent the linker from generating a PDB file at all (as @ChrisDenton pointed out in https://github.com/rust-lang/backtrace-rs/pull/594#issuecomment-2014434355). This is a known issue that hasn't really been resolved yet (see https://github.com/rust-lang/rust/pull/115120) because it's not completely clear what the solution should be.

michaelwoerister commented 3 months ago

But if Cargo now passes -Cstrip=debuginfo when targeting *-pc-window-msvc then I would fully expect to see backtraces not working in the way described here. I'd suggest reverting that behavior for those platforms.

michaelwoerister commented 3 months ago

I can reproduce this on stable and beta for a simple hello world program. That means that the recent backtrace-rs and /PDBALTPATH changes are not the cause. It does not happen in 1.76 (previous stable). Also, removing the -Cstip=debuginfo argument on beta solves the problem.

I see two options:

I'm somewhat in favor of fixing this in rustc since that is what we want to do anyway.

workingjubilee commented 3 months ago

I assume that this has already reached the T-compiler agenda, but

michaelwoerister commented 3 months ago

I tried RUSTFLAGS="-C debuginfo=line-tables-only" cargo test --features backtrace and it seems to still fail with the same backtrace.

I suspect that is because Cargo does not look at the RUSTFLAGS and thus does not assume that debuginfo is enabled, which leads it to still pass -Cstrip=debuginfo to rustc. Modifying the Cargo profile should give a different result.

Kobzol commented 3 months ago

Yes. The heuristic only looks at the contents of Cargo.toml profiles of the crate and its dependencies. RUSTFLAGS are completely opaque to Cargo.

Now that I think about it, that can actually be a bit problematic, since -Cstrip apparently overrides -Cdebug, and since Cargo uses -Cstrip, it cannot be easily overridden by RUSTFLAGS (OTOH, things that can be configured in a profile probably shouldn't be set in the flags...).

Mark-Simulacrum commented 3 months ago

Hi all! Release is happy to do a stable point release, please ping me once we have a backport-safe patch (maybe just to cargo, to disable the new default stripping?)

Kobzol commented 3 months ago

If we want to do a point release quickly, I guess that we should disable the automatic stripping (preferably only on Windows though).

workingjubilee commented 3 months ago

@Kobzol what if someone has cross-compiling to Windows working tho'

Kobzol commented 3 months ago

I meant deciding based on the target for which we compile, not the host. That should be possible, or not?

workingjubilee commented 3 months ago

Oh okay. Probably yeah!

Kobzol commented 3 months ago

I tried to cook up something in https://github.com/rust-lang/cargo/pull/13630, but I'm not sure if it's the right approach.

ChrisDenton commented 3 months ago

@michaelwoerister It seems like #115120 will take time to design but there is perhaps an interim solution that's simpler and easier to backport. We could simply make strip = "debuginfo" the same as strip = "none" with the rationale that, whatever strip = "debuginfo" does or does not do, it should not be stripping symbols from anywhere.

Kobzol commented 3 months ago

An alternative to https://github.com/rust-lang/cargo/pull/13630 that could fix the problem could be something like this.

ChrisDenton commented 3 months ago

I think I'd suggest something more like this. This would lead to more consistent behaviour. We could additionally skip including natvis but I'm more "meh" on that, especially if the eventual intent is for strip to have no effect on external files.

michaelwoerister commented 3 months ago

I think it is clear by now that we don't want -Cstrip to touch PDB files or other separate debuginfo. So I think that @ChrisDenton's proposal is a move in the right direction.

However, given that the current documentation of -Cstrip mentions debuginfo not being copied to separate files, I think we need an FCP to change that behavior, which we know can take a long time.

That being the case, I think we should temporarily revert Cargo's defaults when targeting *-windows-msvc as @Kobzol proposes in https://github.com/rust-lang/cargo/pull/13630.

ChrisDenton commented 3 months ago

I created #123031 if you wanted to consider an fcp. In the meantime, backporting @Kobzol's proposal sounds good to me!

weihanglo commented 3 months ago

stable cargo backport is up #123105

VorpalBlade commented 3 months ago

Now that I think about it, that can actually be a bit problematic, since -Cstrip apparently overrides -Cdebug, and since Cargo uses -Cstrip, it cannot be easily overridden by RUSTFLAGS (OTOH, things that can be configured in a profile probably shouldn't be set in the flags...).

@Kobzol This might impact building packages for Linux distros too. I know Arch Linux build system sets RUSTFLAGS expecting it to take effect just like CFLAGS etc does. It is exposed in the user config at /etc/makepkg.conf too. I would expect other distros to be impacted as well.

Kobzol commented 3 months ago

You can still overwrite the defaut by setting RUSTFLAGS="-Cstrip=none".

apiraino commented 3 months ago

I will remove the t-compiler nomination since a decision to fix this in cargo was reached way faster, so I guess there is not a lot to discuss right now :-)

@rustbot label -I-compiler-nominated

ChrisDenton commented 3 months ago

Discussing what rustc should do has moved to https://github.com/rust-lang/rust/pull/115120#issuecomment-2022352769

retep998 commented 3 months ago

Windows MSVC has debuginfo in a separate file, and thus stripping provides no benefits to binary size. Given generating the .pdb file does not take that much time, and how useful they are for backtraces, you should thus always default to generating the .pdb on Windows MSVC regardless of debug vs release.

This is something that was already figured out years before but I guess people just forget after a while.

ChrisDenton commented 3 months ago

The "striping" behaviour of rustc has existed for four years now https://github.com/rust-lang/rust/commit/a6c2f73b6e0d24af5396355886b26bb23885c37e. I don't think Cargo was necessarily wrong to use strip cross-platform but rustc should have treated it as a no-op because, as you say, there's nothing to strip from the binary.

retep998 commented 3 months ago

And I think that PR from four years ago was wrong to do so. Someone brought up the reason that people might want to not emit PDB files on Windows (https://github.com/rust-lang/rust/pull/71825#discussion_r419868766), but if you look into the linked issue (https://github.com/rust-lang/rust/issues/67012) you'll see that they actually just didn't want the full path to the PDB file embedded into the binary. However, that reason was solved by a different PR fairly recently (https://github.com/rust-lang/rust/pull/121297). I even stated in that issue (https://github.com/rust-lang/rust/issues/67012) that Rust always emitting a PDB file was intended behavior, and yet that wasn't considered by the people in the split implementation PR! As in an earlier PR we intentionally made the conscious decision to emit /DEBUG unconditionally (https://github.com/rust-lang/rust/pull/28505), and yet nobody thought to ask why or consult a Windows subject matter expert about it.

Thus we should stop having strip affect PDB generation on Windows MSVC at all. It was wrong to do in the first place, but it's not too late to fix it.

workingjubilee commented 3 months ago

This was fixed in:

Nightly already landed the original cargo fix. Nightly should also be landing #115120 sometime soon (for very expansive definitions of "soon") which would allow us to revert the cargo workaround at our leisure.

apiraino commented 2 months ago

can this be closed now?

TheBlueMatt commented 2 months ago

Our CI started working again, so I assume so :man_shrugging:

lqd commented 2 months ago

Fixes have landed on stable, beta, and nightly. Closing.