bevyengine / bevy

A refreshingly simple data-driven game engine built in Rust
https://bevyengine.org
Apache License 2.0
36.11k stars 3.56k forks source link

bevy_gizmos doc tests hang compute when run locally #12207

Open alice-i-cecile opened 8 months ago

alice-i-cecile commented 8 months ago

Bevy version

https://github.com/bevyengine/bevy/commit/b24ab2e9fba5b81aba15da97b6551a4e99aad40c

[Optional] Relevant system information

Windows

What you did

Run cargo run -p ci --doc.

What went wrong

Doc tests for initial crates pass, and then a dozen or so doc tests within bevy_gizmos get stuck, producing a warning that they've been running for more than 60 seconds.

My computer then runs out of RAM (over 12 GB!), freezes, and I have to hard reboot.

Additional information

Running with fewer threads via e.g. -j 4 improves responsiveness and slows down time-to-hang, but does not resolve the problem.

This works fine in CI, so it may be OS specific.

lynn-lumen commented 6 months ago

This issue is related to 'Microsoft® Incremental Linker' - the linker used by the rust compiler on some (most?) windows machines. The linkers in the following image use 18.8 GB of RAM! gizmos-doc-test-task-manager

The one thing that sets bevy_gizmos apart from other crates like bevy_ecs is the increased "usage" of 'Microsoft® Incremental Linker'. Most other doc-tests also use 'Microsoft® Incremental Linker' but it does not use alot of memory there. Instead 'link.exe' uses more memory but remains within acceptable bounds (<1GB). In addition, bevy_gizmos uses bevy_ecs::system::assert_is_system(system); alot in doc-tests. However, since other crates like bevy_ecs make excessive use of this assertion aswell and do not have this issue, I am not sure this is the cause of the problem.

No single doc-test in particular seems to be responsible for the increased memory usage, as removing any given set of doc-tests results in a decrease in memory usage more or less proportional to the amount of tests removed. Instead, running doc-tests for (very) large projects on windows machines seems to be the issue.

I was unable to find an "in code" fix for this, but there is a discussion on the rust forums that appears to suggest building for a different but compatible target or using rust-lld as the linker.

However, this problem does not seem to be limited to Windows only, as #11034 details a very similar (the same?) issue on a Linux machine.

alice-i-cecile commented 5 months ago

Further discussion on Discord.

For some reason --jobs 1 doesn't seem to actually limit the amount of tests being compiled, but -- --test-threads=1 does. It also seems that the linker is invoked for each test, and for bevy_gizmos tests it takes almost 2GB per invocation. By default cargo runs 16 of them (one per thread), making me quickly run out of RAM. I wonder why it doesn't happen with the earlier tests though

General conclusions seem to be the same as above, although it looks like https://github.com/rust-lang/rust/pull/123974 should resolve this.

SkiFire13 commented 5 months ago

In addition to the previous observation, rustdoc doesn't respect the linker option in Cargo's config.toml, you need to explicitly set it in the rustdocflags:

[target.x86_64-pc-windows-msvc]
rustdocflags = ["-Clinker=rust-lld.exe"]

This helps reduce the memory usage to 1/2 or 1/3, which is a huge improvement if it leads to avoiding swapping (depends on how much RAM your system has).

SkiFire13 commented 2 months ago

https://github.com/rust-lang/rust/pull/126245 has been merged, though unfortunately it will be enabled only with the 2024 edition.

roblesch commented 2 months ago

Same issue on Fedora 40. Running cargo ci creates many instances of ld which consume all available memory and cause a system crash. Setting rustdocflags to use lld fixed the problem. The nightly release also runs successfully since it uses rust-lld.

Rust version info

rustc 1.80.1 (3f5fd8dd4 2024-08-06)
binary: rustc
commit-hash: 3f5fd8dd41153bc5fdca9427e9e05be2c767ba23
commit-date: 2024-08-06
host: x86_64-unknown-linux-gnu
release: 1.80.1
LLVM version: 18.1.7

config.toml

[target.x86_64-unknown-linux-gnu]
linker = "clang"
rustflags = ["-Clink-arg=-fuse-ld=lld"]
rustdocflags = ["-Clink-arg=-fuse-ld=lld"]
BenjaminBrienen commented 1 week ago

12th Gen Intel(R) Core(TM) i7-12700H

cargo test --doc

test result: ok. 45 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 17.96s

With cargo run -p ci -- doc there is an entirely different story. A whole bunch of crates pass and then...

   Doc-tests bevy_gizmos

running 45 tests
test crates\bevy_gizmos\src\arcs.rs - arcs::Gizmos<'w,'s,Config,Clear>::arc_2d (line 34) has been running for over 60 seconds
test crates\bevy_gizmos\src\arcs.rs - arcs::Gizmos<'w,'s,Config,Clear>::arc_3d (line 154) has been running for over 60 seconds
test crates\bevy_gizmos\src\arcs.rs - arcs::Gizmos<'w,'s,Config,Clear>::long_arc_2d_between (line 386) has been running for over 60 seconds
test crates\bevy_gizmos\src\arcs.rs - arcs::Gizmos<'w,'s,Config,Clear>::long_arc_3d_between (line 253) has been running for over 60 seconds
test crates\bevy_gizmos\src\arcs.rs - arcs::Gizmos<'w,'s,Config,Clear>::short_arc_2d_between (line 340) has been running for over 60 seconds
test crates\bevy_gizmos\src\arcs.rs - arcs::Gizmos<'w,'s,Config,Clear>::short_arc_3d_between (line 207) has been running for over 60 seconds
test crates\bevy_gizmos\src\arrows.rs - arrows::ArrowBuilder<'_,'_,'_,Config,Clear>::with_tip_length (line 37) has been running for over 60 seconds
test crates\bevy_gizmos\src\arrows.rs - arrows::Gizmos<'w,'s,Config,Clear>::arrow (line 114) has been running for over 60 seconds
test crates\bevy_gizmos\src\arrows.rs - arrows::Gizmos<'w,'s,Config,Clear>::arrow_2d (line 145) has been running for over 60 seconds
test crates\bevy_gizmos\src\arrows.rs - arrows::Gizmos<'w,'s,Config,Clear>::axes (line 175) has been running for over 60 seconds
test crates\bevy_gizmos\src\arrows.rs - arrows::Gizmos<'w,'s,Config,Clear>::axes_2d (line 208) has been running for over 60 seconds
test crates\bevy_gizmos\src\circles.rs - circles::Gizmos<'w,'s,Config,Clear>::circle (line 116) has been running for over 60 seconds
test crates\bevy_gizmos\src\circles.rs - circles::Gizmos<'w,'s,Config,Clear>::circle_2d (line 157) has been running for over 60 seconds
test crates\bevy_gizmos\src\circles.rs - circles::Gizmos<'w,'s,Config,Clear>::ellipse (line 36) has been running for over 60 seconds
test crates\bevy_gizmos\src\circles.rs - circles::Gizmos<'w,'s,Config,Clear>::ellipse_2d (line 77) has been running for over 60 seconds
test crates\bevy_gizmos\src\circles.rs - circles::Gizmos<'w,'s,Config,Clear>::sphere (line 199) has been running for over 60 seconds
test crates\bevy_gizmos\src\cross.rs - cross::Gizmos<'_,'_,Config>::cross (line 24) has been running for over 60 seconds
test crates\bevy_gizmos\src\cross.rs - cross::Gizmos<'_,'_,Config>::cross_2d (line 59) has been running for over 60 seconds
test crates\bevy_gizmos\src\curves.rs - curves::Gizmos<'w,'s,Config,Clear>::curve_2d (line 30) has been running for over 60 seconds
test crates\bevy_gizmos\src\curves.rs - curves::Gizmos<'w,'s,Config,Clear>::curve_3d (line 64) has been running for over 60 seconds

It uses all 32 GB of RAM and causes the system to freeze up/soft crash after about 5-10 minutes.