rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
97.96k stars 12.68k forks source link

Compiler hangs after 'write metadata stage' #37311

Closed mglagla closed 7 years ago

mglagla commented 8 years ago

Hello everyone,

while trying to refactor a small toy project (Link), I've hit a bug where the compiler never finishes compiling:

calcium $ cargo rustc -Ztime-passes -Ztime-llvm-passes
   Compiling calcium v0.5.0 (file:///data/Code/calcium)
warning: the option `Z` is unstable and should only be used on the nightly compiler, but it is currently accepted for backwards compatibility; this will soon change, see issue #31847 for more details

time: 0.002; rss: 43MB  parsing
time: 0.000; rss: 43MB  configuration
time: 0.000; rss: 43MB  recursion limit
time: 0.000; rss: 43MB  crate injection
time: 0.000; rss: 43MB  plugin loading
time: 0.000; rss: 43MB  plugin registration
time: 0.038; rss: 82MB  expansion
time: 0.000; rss: 82MB  maybe building test harness
time: 0.001; rss: 82MB  assigning node ids
time: 0.000; rss: 82MB  checking for inline asm in case the target doesn't support it
time: 0.000; rss: 82MB  complete gated feature checking
time: 0.000; rss: 82MB  collecting defs
time: 0.009; rss: 94MB  external crate/lib resolution
time: 0.000; rss: 94MB  early lint checks
time: 0.000; rss: 94MB  AST validation
time: 0.004; rss: 94MB  name resolution
time: 0.001; rss: 94MB  lowering ast -> hir
time: 0.001; rss: 98MB  indexing hir
time: 0.000; rss: 98MB  attribute checking
time: 0.000; rss: 98MB  language item collection
time: 0.000; rss: 98MB  lifetime resolution
time: 0.000; rss: 98MB  looking for entry point
time: 0.000; rss: 98MB  looking for plugin registrar
time: 0.001; rss: 98MB  region resolution
time: 0.000; rss: 98MB  loop checking
time: 0.000; rss: 98MB  static item recursion checking
time: 0.000; rss: 98MB  load_dep_graph
time: 0.008; rss: 98MB  type collecting
time: 0.000; rss: 98MB  variance inference
time: 0.050; rss: 104MB coherence checking
time: 0.012; rss: 104MB wf checking
time: 0.003; rss: 104MB item-types checking
time: 0.112; rss: 106MB item-bodies checking
time: 0.000; rss: 106MB drop-impl checking
time: 0.007; rss: 106MB const checking
time: 0.001; rss: 106MB privacy checking
time: 0.000; rss: 106MB stability index
time: 0.001; rss: 106MB intrinsic checking
time: 0.000; rss: 106MB effect checking
time: 0.002; rss: 106MB match checking
time: 0.001; rss: 106MB liveness checking
time: 0.003; rss: 108MB rvalue checking
time: 0.009; rss: 111MB MIR dump
time: 0.006; rss: 114MB MIR passes
time: 0.009; rss: 114MB borrow checking
time: 0.000; rss: 114MB reachability checking
time: 0.001; rss: 114MB death checking
time: 0.001; rss: 114MB stability checking
time: 0.000; rss: 114MB unused lib feature checking
warning: variant is never used: `MismatchingParens`, #[warn(dead_code)] on by default
  --> src/parse.rs:33:5
   |
33 |     MismatchingParens,
   |     ^^^^^^^^^^^^^^^^^

time: 0.009; rss: 114MB lint checking
time: 0.000; rss: 114MB resolving dependency formats
time: 0.006; rss: 114MB Prepare MIR codegen passes
  time: 0.019; rss: 114MB   write metadata
^C
calcium $ 

I had to stop because even after a minute it did not finish.

This is the offending commit on branch "no-alloc", the previous commit on branch master finished very quickly.

The only difference to master is this code in parse.rs, starting on line 97:

let mut ref_c = 1;
let mut sub_tokens = Vec::new();

while ref_c > 0 {
    let token = if let Some(t) = it.next() {
        t.clone()
    } else { return Err(ParseError::MismatchingParens) };

        match token {
            Token::Operator(Symbol::OpenParen) => {
                ref_c += 1;
            },
            Token::Operator(Symbol::CloseParen) => {
                ref_c -= 1;
            },
            _ => {},
        }

    if ref_c > 0 {
        sub_tokens.push(token);
    }
}

let mut sub_iter = sub_tokens.iter().peekable();
parse_expr(&mut sub_iter, 0)              

has been replaced with this code:

let mut ref_c = 1;
let mut sub_iter = it.filter_map(|t| {
    match *t {
        Token::Operator(Symbol::OpenParen) => {
            ref_c += 1;
        },
        Token::Operator(Symbol::CloseParen) => {
            ref_c -= 1;
        },
        _ => {},
    };

    if ref_c > 0 {
        Some(t)
    } else {
        None
    }
})
.fuse()
.peekable();

parse_expr(&mut sub_iter, 0)

rustc version is 1.12.0 (3191fbae9 2016-09-23)

eddyb commented 8 years ago
parse_expr(&mut sub_iter, 0)   

This is wrong because it results in wrapping It in an iterator, and when recursing back to the same line, it wraps that type again, etc. ad infinitum. I believe this is usually called "polymorphic recursion". The simplest fix would be to cast the iterator (at the cost of dynamic dispatch):

parse_expr(&mut sub_iter as &mut Peekable<Iterator<Item=&'a Token>>, 0)

A better solution would involve rewriting all of this code, but I'm not sure exactly how that can be done.

However, the compiler is supposed to error here, so the bug is that it goes into a loop instead.

nikomatsakis commented 8 years ago

Discussed in the compiler team meeting. This is likely a cycle in trans (possibly the collector?), quite likely triggered by a fn recursively calling itself with a derived closure, or otherwise growing infinitely. Perhaps we are missing a recursion detector in the cycle collector or elsewhere to detect infinite monomorphization problems?

Ah, @eddyb wrote that. =)

nikomatsakis commented 8 years ago

triage: P-high

arielb1 commented 8 years ago

This looks like another case of "slow exponential recursion", except instead of a function calling 2 different functions, it creates a type of double the size:

parse_prefix::<τ> requires
    parse_prefix::<Fuse<FilterMap<&mut Peekable<τ>, [parse_prefix closure]::<τ>>>>

This type, when interned, has O(n) size, but trait operations fold it, which requires O(2^n) time. That... actually looks like it could be a part of some realistic example. Not sure what to do.

arielb1 commented 8 years ago

Minified:

trait Mirror {
    type Image;
}

impl<T> Mirror for T { type Image = T; }

#[allow(unconditional_recursion)]
fn recurse<T>() { 
    recurse::<<(T, T) as Mirror>::Image>();
}

fn main() {
    recurse::<()>();
}
TimNN commented 8 years ago

This is a regression from 1.8.0 to 1.9.0.

On nightly, this was introduced between nightly-2016-03-18 and nightly-2016-03-20 (Changes).

arielb1 commented 8 years ago

What did 1.8.0 do then?

arielb1 commented 8 years ago

32080 is the probable suspect - it did replace half of the old trans bugs with new ones.

TimNN commented 8 years ago

Reported a recursion limit:

hang.rs:8:1: 10:2 error: reached the recursion limit during monomorphization
hang.rs: 8 fn recurse<T>() {
hang.rs: 9     recurse::<<(T, T) as Mirror>::Image>();
hang.rs:10 }
arielb1 commented 8 years ago

A variant using traits hangs on all versions:

trait Mirror {
    type Image;
}

impl<T> Mirror for T { type Image = T; }

trait Foo {
    fn recurse(&self);
}

impl<T> Foo for T {
    #[allow(unconditional_recursion)]
    fn recurse(&self) { 
        (self, self).recurse();
    }
}

fn main() {
    ().recurse();
}
mglagla commented 8 years ago

@eddyb

A better solution would involve rewriting all of this code, but I'm not sure exactly how that can be done.

Yeah, it was my first attempt at refactoring that part, so i expected the compiler to give me some errors. Instead it hung.

@TimNN

This is a regression from 1.8.0 to 1.9.0.

Did you mean @arielb1 first minified example? I just tried to build my project on 1.8.0 and it still hung (i.e. compiled for half an hour without finishing).

TimNN commented 8 years ago

@mglagla: Yes, that was @arielb1's first example.

arielb1 commented 8 years ago

Sure enough. All of these examples create an exponentially-long type, and if we do anything to it before the recursion limit is reached, we hang (in fact, my "reduced" example hangs while printing the overflow error message).

nikomatsakis commented 7 years ago

I wonder if we need another kind of limit -- something that caps the maximum size a type is allowed to be (measured...somehow). Then we can abort when we hit that limit instead?

brson commented 7 years ago

Retagging as a regression. I don't see a reason it's not. This specific symptom did not used to occur. Now it does.

eddyb commented 7 years ago

@nikomatsakis We can easily track "total number of children" for each Ty, that's one option.

arielb1 commented 7 years ago

This can be fixed by adding a limit to type sizes at the entry to monomorphization. Will do that.

brson commented 7 years ago

Just waiting on https://github.com/rust-lang/rust/pull/37789. Thanks @arielb1 !