rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
99.18k stars 12.81k forks source link

include_dir rustdoc segfault #56900

Closed Mark-Simulacrum closed 3 years ago

Mark-Simulacrum commented 5 years ago

https://crater-reports.s3.amazonaws.com/beta-1.32-1/beta-2018-12-05/reg/include_dir-0.2.1/log.txt

Dec 10 14:30:38.105 INFO [stderr] error: process didn't exit successfully: `rustdoc --test /source/src/lib.rs --crate-name include_dir -L dependency=/target/debug/deps -L native=/target/debug/build/backtrace-sys-53f83b3161bfc7b9/out -L dependency=/target/debug/deps --cfg 'feature="default"' --extern glob=/target/debug/deps/libglob-aa59bbe61ff26f4b.rlib --extern include_dir=/target/debug/deps/libinclude_dir-d38de47afd31d3f5.rlib --extern include_dir_impl=/target/debug/deps/libinclude_dir_impl-7bf5a142001021da.so --extern proc_macro_hack=/target/debug/deps/libproc_macro_hack-5640320867fde5d4.rlib` (signal: 11, SIGSEGV: invalid memory reference)
euclio commented 5 years ago

Looks like unbounded memory growth. And it's not just a problem with rustdoc, trying to build the integration test triggers the bug as well.

alexcrichton commented 5 years ago

Agreed, this doesn't reproduce locally for me and is likely within some sort of C library or something like that which may not handle malloc failure well

euclio commented 5 years ago

It reproduces for me. I don't think there's a C library involved, because it's just a proc-macro that uses proc-macro-hack. The failing test attempts to include the directory of the crate itself, and there was a recent change to how include_bytes! deals with inserting its own source file in #54517, so I wouldn't be surprised if this is legit.

pietroalbini commented 5 years ago

Ping @rust-lang/rustdoc, can someone look at this?

QuietMisdreavus commented 5 years ago

This test managed to take down my server! :laughing: Based on @euclio's comment, i don't think this is a problem specific to rustdoc, since i can at least see logs up to where it tries to compile the sample. cc @michaelwoerister and @mcr431 since https://github.com/rust-lang/rust/pull/54517 seems to be related.

matthew-russo commented 5 years ago

I'll give it a look tonight.

matthew-russo commented 5 years ago

@QuietMisdreavus @euclio Sorry if this is ignorant -- I haven't been helping with the compiler for that long -- but how do I reproduce this? I haven't used crater before and when I try to run the rustdoc test manually it won't compile due to unwrapping an error inside the macro but I can't see exactly where things are breaking and it seems my error is different than the one you're talking about.

QuietMisdreavus commented 5 years ago

@mcr431 This is the procedure i used to get the issue i did:

  1. With cargo clone, download a copy of the crate include_dir. (You can also clone the repo and check out the tag v0.2.1, i'm assuming that will have the same source as the published crate.)
  2. Run cargo test --doc in the downloaded copy, using the latest beta or nightly, or a locally-built compiler/rustdoc using the technique listed in CONTRIBUTING.md. This test hangs for me, consuming more and more memory until i have none left to use my system with.

I'm not totally sure what's going wrong, but i'm assuming it's in the implementation of the macros inside that crate, and an interaction with the new file-inclusion mechanism.

matthew-russo commented 5 years ago

@QuietMisdreavus Thanks, I was trying to manually run rustdoc with the test flags and I don't think i was properly linking all dependencies. I'll look at it today and see if I can find anything

euclio commented 5 years ago

FYI, if you're on Linux or Mac you can use ulimit to limit the memory available to the process so you can see the segfault without bringing down your system. I found that 2GB worked well.

matthew-russo commented 5 years ago

@QuietMisdreavus So I found that it works as expected when cloned from github but not when using cargo clone. The github version includes the include_dir_impl right in the source tree where with cargo clone it is an externally fetched crate. The source of include_dir is the same in both so I'm fairly certain there is a difference in the published include_dir_impl vs the local one. Will look into it a bit more tonight.

matthew-russo commented 5 years ago

I made no progress on this last night. The source is exactly the same for the include_dir_impl as well. I'm not too positive why it would behave differently when the crate is downloaded from crates rather than specified in a local path. I'll try to give it another look this weekend when i have some more time

Michael-F-Bryan commented 5 years ago

Hi all, someone recently mentioned (https://github.com/Michael-F-Bryan/include_dir/issues/34#issuecomment-466670418) that include_dir overflows when compiling a trivial example and I thought it might be related to this.

To see exactly what include_dir was generating I inserted a line to dump the generated tokens to a file. I'm able to compile and run the crate's doc-test in about a dozen seconds (rustc 1.34.0-nightly (00aae71f5 2019-02-25)) and get a 76556 byte file.

Trying to run rustfmt directly on that file causes my entire computer to lock up. Here is the (unformatted) output:

tokens.txt


In the meantime, are there any ways I can tweak the macro's implementation so it won't cause rustc to consume all of the computer's RAM?

At the moment, all the macro does is recursively traverse a directory tree and generate a Dir literal.

static DIR: Dir = include_dir!("stuff");

// expands to

static DIR: Dir = Dir {
    path: "stuff",
    files: &[
        File {
            path: r"stuff/jstree.min.js",
            contents: include_bytes!(r"stuff/jstree.min.js"),
        },
        File {
            path: r"stuff/test.txt",
            contents: include_bytes!(r"stuff/test.txt"),
        },
    ],
    dirs: &[],
};
matthew-russo commented 5 years ago

Sorry for the delay, this completely slipped my mind. Realistically I probably won't get to it this weekend, but I'll try messing around and see if I can get to the root of it.

QuietMisdreavus commented 5 years ago

I poked around for a little bit the other day, and couldn't reproduce the issue without using the full include_dir crate. My hunch (based on what @Michael-F-Bryan said about trying to emit the expanded crate) is that an include_bytes is being emitted by one macro, then its output is being processed by another one, so it has to take the time to process all the tokens of the bytes of a large file. The stack-overflow segfault doesn't seem to be present on 1.33.0 (which was beta when i tried it, but is stable now), but the issue is now that it takes a long time to process.

Xymist commented 5 years ago

I'm encountering something I think may be related to this issue - no segfault, but attempting to include_dir! a 500KB folder containing 17 separate files takes forever to cargo check with up to date nightly (I gave up after 70m22.850s according to time), forever to compile with stable (gave up after 25 minutes and change), and 12 seconds to compile with 1.31.0. Whatever the change in behaviour was from then to 1.32 to now, it seems consistent.

Dylan-DPC-zz commented 4 years ago

Marked p-medium according to t he prioritisation discussion

jyn514 commented 3 years ago

I can't replicate this on master. I tried git clone https://github.com/Michael-F-Bryan/include_dir && cd include_dir && git checkout v0.2.1 && cargo test --doc. Is this still an issue?

jyn514 commented 3 years ago

I'm going to close this, since it seems like people aren't running into it anymore. Feel free to reopen if you're still having trouble.