serde-rs / serde

Serialization framework for Rust
https://serde.rs/
Apache License 2.0
9.1k stars 769 forks source link

serde_derive takes too long to compile #1421

Closed gnzlbg closed 5 years ago

gnzlbg commented 5 years ago

This is a problem.

I don't have any suggestions about how to make this problem smaller, but my builds often just hang waiting for serde_derive to finish compiling.

Maybe splitting serde_derive into separate crates could make the issue better, but I don't know if this is possible.

dtolnay commented 5 years ago

Could you quantify how long it takes? It builds in 8 seconds on my machine.

Depending on what output you are looking at, any of the other crates being compiled in parallel with serde_derive may be what is hanging, not serde_derive.

gnzlbg commented 5 years ago

cargo build in serde_derive (all crates):

time cargo build
   Compiling proc-macro2 v0.4.21
   Compiling unicode-xid v0.1.0
   Compiling quote v0.6.10
   Compiling syn v0.15.18
   Compiling serde_derive v1.0.80 (/serde/serde_derive)
    Finished dev [unoptimized + debuginfo] target(s) in 41.36s

real    0m41.444s
user    0m55.461s
sys 0m3.064s

removing all serde_derive artifacts from target/ and recompiling with cargo build:

time cargo build
   Compiling serde_derive v1.0.80 (/serde/serde_derive)
    Finished dev [unoptimized + debuginfo] target(s) in 24.63s

real    0m24.726s
user    0m32.663s
sys 0m1.301s

I don't know how reliable this method is (CARGO_PROFILE does not show much), let me know if there is a better way to measure this, but serde_derive taking the same time as syn + quote puts it in the right ballpark on my machine.

matthiaskrgr commented 5 years ago

You can check with -Z time-passes: RUSTC_WRAPPER="" RUSTFLAGS="-Ztime-passes -C codegen-units=1" cargo build --release -j 1

   Compiling serde_derive v1.0.80 (/home/matthias/vcs/github/serde/serde_derive)
  time: 0.022; rss: 57MB    parsing
  time: 0.000; rss: 57MB    attributes injection
  time: 0.000; rss: 57MB    recursion limit
  time: 0.000; rss: 57MB    crate injection
  time: 0.000; rss: 57MB    plugin loading
  time: 0.000; rss: 57MB    plugin registration
  time: 0.003; rss: 57MB    pre ast expansion lint checks
    time: 0.956; rss: 127MB expand crate
    time: 0.000; rss: 127MB check unused macros
  time: 0.956; rss: 127MB   expansion
  time: 0.000; rss: 127MB   maybe building test harness
  time: 0.002; rss: 127MB   maybe creating a macro crate
  time: 0.008; rss: 127MB   creating allocators
  time: 0.006; rss: 127MB   AST validation
  time: 0.101; rss: 140MB   name resolution
  time: 0.020; rss: 141MB   complete gated feature checking
  time: 0.045; rss: 148MB   lowering ast -> hir
  time: 0.018; rss: 148MB   early lint checks
  time: 0.067; rss: 153MB   indexing hir
  time: 0.000; rss: 153MB   load query result cache
  time: 0.000; rss: 153MB   looking for entry point
  time: 0.000; rss: 153MB   looking for plugin registrar
  time: 0.003; rss: 153MB   loop checking
  time: 0.003; rss: 155MB   attribute checking
  time: 0.032; rss: 171MB   stability checking
  time: 0.012; rss: 171MB   type collecting
  time: 0.000; rss: 171MB   outlives testing
  time: 0.000; rss: 171MB   impl wf inference
  time: 0.020; rss: 178MB   coherence checking
  time: 0.000; rss: 178MB   variance testing
  time: 0.038; rss: 179MB   wf checking
  time: 0.003; rss: 179MB   item-types checking
  time: 1.100; rss: 199MB   item-bodies checking
  time: 0.096; rss: 199MB   rvalue promotion
  time: 0.042; rss: 199MB   privacy checking
  time: 0.006; rss: 199MB   intrinsic checking
  time: 0.013; rss: 200MB   match checking
  time: 0.023; rss: 207MB   liveness checking
  time: 0.882; rss: 242MB   borrow checking
  time: 0.000; rss: 242MB   MIR borrow checking
  time: 0.000; rss: 242MB   dumping chalk-like clauses
  time: 0.000; rss: 242MB   MIR effect checking
  time: 0.007; rss: 242MB   death checking
  time: 0.002; rss: 242MB   unused lib feature checking
  time: 0.024; rss: 242MB   lint checking
  time: 0.000; rss: 242MB   resolving dependency formats
    time: 0.060; rss: 244MB write metadata
      time: 0.000; rss: 244MB   collecting roots
      time: 0.565; rss: 268MB   collecting mono items
    time: 0.565; rss: 268MB monomorphization collection
    time: 0.012; rss: 269MB codegen unit partitioning
    time: 0.000; rss: 269MB write allocator module
    time: 0.831; rss: 317MB codegen to LLVM IR
    time: 0.000; rss: 317MB assert dep graph
    time: 0.000; rss: 317MB serialize dep graph
  time: 1.508; rss: 317MB   codegen
    time: 0.513; rss: 331MB llvm function passes [serde_derive.2rvvgow4-cgu.0]
    time: 12.569; rss: 254MB    llvm module passes [serde_derive.2rvvgow4-cgu.0]
    time: 5.055; rss: 292MB codegen passes [serde_derive.2rvvgow4-cgu.0]
  time: 18.189; rss: 289MB  LLVM passes
  time: 0.000; rss: 290MB   serialize work products
    time: 0.053; rss: 290MB running linker
  time: 0.053; rss: 290MB   linking

looks like most time is actually spent by llvm optimizing the code.

gnzlbg commented 5 years ago

looks like most time is actually spent by llvm optimizing the code.

That's often a consequence of the crate generating a lot of LLVM-IR, which can happen is the crate is big, has a lot of "duplicate" code because of macros, etc.

dtolnay commented 5 years ago

I am going to close this issue, but compile time for serde, for serde_derive, and for code generated by serde_derive all continue to be important concerns. I am always happy to consider PRs that improve compile time -- same thing for syn and quote and proc-macro2 as well.