Open bvssvni opened 7 years ago
Perhaps a better idea: Generate meta data in Rust code.
If you don't need the original source, you might as well include the meta data in the binary executable. This will skip the parsing step completely, while being compatible with all meta data converters in Rust.
Added new function for loading directly from meta data in https://github.com/PistonDevelopers/dyon/pull/432. Included an example that generates meta data in Rust code. It works!
Upside is that the binary becomes self contained, with no external code. Loads very fast, because there is no parsing. Skips lifetime and type checks by assuming the source already has been checked. This technique could be used when packaging apps.
Downside is that build times in release mode are scary: 1167.39 seconds for the n-body example. This should be straight forward code for the Rust compiler. I am wondering whether LLVM does some partial evaluation since it knows the input and this could be the reason it takes so long time.
$ time ./target/debug/examples/test
-0.16907514416856514
-0.16908671301860925
real 0m1.378s
user 0m1.368s
sys 0m0.009s
$ time ./target/release/examples/test
-0.16907514416856514
-0.16908671301860925
real 0m0.077s
user 0m0.069s
sys 0m0.006s
$ time ./target/debug/examples/test_compile
-0.16907514416856514
-0.16908671301860925
real 0m0.888s
user 0m0.827s
sys 0m0.014s
$ time ./target/release/examples/test_compile
-0.16907514416856514
-0.16908671301860925
real 0m0.051s
user 0m0.041s
sys 0m0.006s
Try -Z time-passes
flag for rustc to see where time is spent (e.g. export RUSTFLAGS="-Z time-passes"
). Feel free to also open an issue on the rustc. I assume that by “meta data” you mean some generated rust code; it might be hitting some super-linear code path somewhere and ideally none would exist; getting rid of it is difficult sometimes.
@nagisa Thanks! Running it now.
Here is what I got:
$ cargo run --release --example test_compile
Compiling dyon v0.16.0 (file:///Users/sven/rust/dyon)
time: 0.039 parsing
time: 0.000 recursion limit
time: 0.000 crate injection
time: 0.000 plugin loading
time: 0.000 plugin registration
time: 0.308 expansion
time: 0.000 maybe building test harness
time: 0.001 maybe creating a macro crate
time: 0.000 checking for inline asm in case the target doesn't support it
time: 0.001 complete gated feature checking
time: 0.005 early lint checks
time: 0.001 AST validation
time: 0.013 name resolution
time: 0.007 lowering ast -> hir
time: 0.001 indexing hir
time: 0.001 attribute checking
time: 0.000 language item collection
time: 0.001 lifetime resolution
time: 0.000 looking for entry point
time: 0.000 looking for plugin registrar
time: 0.003 region resolution
time: 0.001 loop checking
time: 0.000 static item recursion checking
time: 0.011 compute_incremental_hashes_map
time: 0.000 load_dep_graph
time: 0.001 stability index
time: 0.002 stability checking
time: 0.001 type collecting
time: 0.000 variance inference
time: 0.000 impl wf inference
time: 0.008 coherence checking
time: 0.000 wf checking
time: 0.001 item-types checking
time: 0.871 item-bodies checking
time: 0.000 drop-impl checking
time: 0.023 const checking
time: 0.002 privacy checking
time: 0.001 intrinsic checking
time: 0.001 effect checking
time: 0.001 match checking
time: 0.003 liveness checking
time: 0.020 rvalue checking
time: 0.498 MIR dump
time: 0.015 SimplifyCfg
time: 0.018 QualifyAndPromoteConstants
time: 0.020 TypeckMir
time: 0.000 SimplifyBranches
time: 0.008 SimplifyCfg
time: 0.062 MIR cleanup and validation
time: 0.121 borrow checking
time: 0.000 reachability checking
time: 0.001 death checking
time: 0.000 unused lib feature checking
time: 0.030 lint checking
time: 0.003 resolving dependency formats
time: 0.000 NoLandingPads
time: 0.010 SimplifyCfg
time: 0.010 EraseRegions
time: 0.003 AddCallGuards
time: 0.994 ElaborateDrops
time: 0.000 NoLandingPads
time: 0.023 SimplifyCfg
time: 0.005 InstCombine
time: 0.000 Deaggregator
time: 0.000 CopyPropagation
time: 0.006 SimplifyLocals
time: 0.003 AddCallGuards
time: 0.000 PreTrans
time: 1.054 MIR optimisations
time: 0.000 write metadata
time: 0.208 translation item collection
time: 0.002 codegen unit partitioning
time: 0.002 internalize symbols
time: 0.606 translation
time: 0.000 assert dep graph
time: 0.000 serialize dep graph
time: 34.105 llvm function passes [0]
time: 1215.066 llvm module passes [0]
time: 6.071 codegen passes [0]
time: 0.001 codegen passes [0]
time: 1255.267 LLVM passes
time: 0.000 serialize work products
time: 0.432 running linker
time: 0.439 linking
The "llvm module passes" part takes most of the time.
One intriguing idea I have: Knowing the input could trigger some optimization in LLVM, where it attempts to evaluate things, thereby essentially doing what the Dyon run-time is supposed to be doing: Executing the program, but at a much slower speed.
Opened https://github.com/rust-lang/rust/issues/39082 with updated to latest nightly.
Did some profiling with cmr's hprof library on a 582 loc file:
Over 98% of the time is spent parsing.
lib.rs:
Could figure out what's taking so much time in Piston-Meta.
Develop binary format for Piston-Meta
The meta data is all you need, while the syntax is just to make it readable. Sometimes you reuse the same data over and over, but want to load it as fast as possible.
Could develop an efficient binary format for meta data that loads much quicker. This format can also save space by using a string table. Meta data conversion could happen directly from binary data (changes required in Piston-Meta).
The benefit with this direction is that all meta formats automatically gets a binary format.
Add a
bin
type to DyonAlso been thinking about compression and privacy (simple encryption, for example for game content). Then it hit me that perhaps Dyon could get a built-in
bin
type. Could also use it as a bit vector.