rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
97.97k stars 12.68k forks source link

`-Ztime-passes`: only print passes that take more than 5 milliseconds #95444

Closed jyn514 closed 2 years ago

jyn514 commented 2 years ago

Right now, -Ztime-passes has very long output, most of which is completely useless for determining which part of your crate is slow:

``` time: 0.002; rss: 49MB -> 52MB ( +2MB) parse_crate time: 0.000; rss: 52MB -> 52MB ( +0MB) attributes_injection time: 0.000; rss: 52MB -> 52MB ( +0MB) incr_comp_prepare_session_directory time: 0.000; rss: 52MB -> 52MB ( +0MB) incr_comp_garbage_collect_session_directories time: 0.000; rss: 52MB -> 52MB ( +0MB) plugin_loading time: 0.000; rss: 52MB -> 52MB ( +0MB) plugin_registration time: 0.000; rss: 54MB -> 54MB ( +0MB) crate_injection time: 0.013; rss: 54MB -> 80MB ( +25MB) expand_crate time: 0.000; rss: 80MB -> 80MB ( +0MB) check_unused_macros time: 0.013; rss: 54MB -> 80MB ( +25MB) macro_expand_crate time: 0.000; rss: 80MB -> 80MB ( +0MB) maybe_building_test_harness time: 0.000; rss: 80MB -> 80MB ( +0MB) AST_validation time: 0.000; rss: 80MB -> 80MB ( +0MB) maybe_create_a_macro_crate time: 0.000; rss: 80MB -> 80MB ( +0MB) finalize_imports time: 0.000; rss: 80MB -> 80MB ( +0MB) resolve_access_levels time: 0.000; rss: 80MB -> 80MB ( +0MB) finalize_macro_resolutions time: 0.001; rss: 80MB -> 82MB ( +2MB) late_resolve_crate time: 0.000; rss: 82MB -> 82MB ( +0MB) resolve_main time: 0.000; rss: 82MB -> 82MB ( +0MB) resolve_check_unused time: 0.000; rss: 82MB -> 82MB ( +0MB) resolve_report_errors time: 0.000; rss: 82MB -> 82MB ( +0MB) resolve_postprocess time: 0.001; rss: 80MB -> 82MB ( +2MB) resolve_crate time: 0.000; rss: 82MB -> 82MB ( +0MB) complete_gated_feature_checking time: 0.015; rss: 52MB -> 82MB ( +30MB) configure_and_expand time: 0.000; rss: 82MB -> 82MB ( +0MB) prepare_outputs time: 0.000; rss: 82MB -> 82MB ( +0MB) blocked_on_dep_graph_loading time: 0.002; rss: 82MB -> 84MB ( +2MB) hir_lowering time: 0.000; rss: 84MB -> 84MB ( +0MB) early_lint_checks time: 0.000; rss: 84MB -> 84MB ( +0MB) drop_ast time: 0.000; rss: 84MB -> 84MB ( +0MB) setup_global_ctxt time: 0.000; rss: 87MB -> 87MB ( +0MB) looking_for_entry_point time: 0.000; rss: 87MB -> 87MB ( +0MB) looking_for_derive_registrar time: 0.000; rss: 90MB -> 90MB ( +0MB) unused_lib_feature_checking time: 0.002; rss: 87MB -> 90MB ( +4MB) misc_checking_1 time: 0.002; rss: 90MB -> 94MB ( +4MB) type_collecting time: 0.000; rss: 94MB -> 94MB ( +0MB) impl_wf_inference time: 0.000; rss: 94MB -> 94MB ( +0MB) unsafety_checking time: 0.013; rss: 94MB -> 116MB ( +22MB) coherence_checking time: 0.004; rss: 116MB -> 118MB ( +2MB) wf_checking time: 0.007; rss: 118MB -> 124MB ( +6MB) item_types_checking time: 0.013; rss: 124MB -> 127MB ( +3MB) item_bodies_checking time: 0.041; rss: 90MB -> 127MB ( +37MB) type_check_crate time: 0.000; rss: 127MB -> 127MB ( +0MB) match_checking time: 0.001; rss: 127MB -> 127MB ( +0MB) liveness_and_intrinsic_checking time: 0.001; rss: 127MB -> 127MB ( +0MB) misc_checking_2 time: 0.016; rss: 127MB -> 129MB ( +2MB) MIR_borrow_checking time: 0.000; rss: 129MB -> 129MB ( +0MB) MIR_effect_checking time: 0.000; rss: 129MB -> 129MB ( +0MB) layout_testing time: 0.005; rss: 129MB -> 130MB ( +1MB) crate_lints time: 0.000; rss: 130MB -> 130MB ( +0MB) module_lints time: 0.000; rss: 130MB -> 130MB ( +0MB) check_lint_expectations time: 0.006; rss: 129MB -> 130MB ( +1MB) lint_checking time: 0.001; rss: 130MB -> 130MB ( +0MB) privacy_checking_modules time: 0.008; rss: 129MB -> 130MB ( +1MB) misc_checking_3 time: 0.000; rss: 132MB -> 132MB ( +0MB) monomorphization_collector_root_collections time: 0.024; rss: 132MB -> 140MB ( +8MB) monomorphization_collector_graph_walk time: 0.003; rss: 140MB -> 141MB ( +1MB) partition_and_assert_distinct_symbols time: 0.033; rss: 130MB -> 141MB ( +11MB) generate_crate_metadata time: 0.000; rss: 141MB -> 141MB ( +0MB) find_cgu_reuse time: 0.055; rss: 141MB -> 162MB ( +21MB) codegen_to_LLVM_IR time: 0.060; rss: 141MB -> 162MB ( +21MB) codegen_crate time: 0.000; rss: 161MB -> 161MB ( +0MB) assert_dep_graph time: 0.000; rss: 161MB -> 161MB ( +0MB) check_dirty_clean time: 0.000; rss: 161MB -> 161MB ( +0MB) encode_query_results_for(rustc_query_impl::queries::type_of) time: 0.000; rss: 161MB -> 162MB ( +0MB) encode_query_results_for(rustc_query_impl::queries::generics_of) time: 0.000; rss: 162MB -> 162MB ( +0MB) encode_query_results_for(rustc_query_impl::queries::predicates_of) time: 0.000; rss: 161MB -> 161MB ( +0MB) encode_query_results_for(rustc_query_impl::queries::mir_const_qualif) time: 0.000; rss: 161MB -> 161MB ( +0MB) encode_query_results_for(rustc_query_impl::queries::mir_for_ctfe) time: 0.000; rss: 161MB -> 161MB ( +0MB) encode_query_results_for(rustc_query_impl::queries::optimized_mir) time: 0.000; rss: 161MB -> 160MB ( -1MB) encode_query_results_for(rustc_query_impl::queries::covered_code_regions) time: 0.000; rss: 160MB -> 160MB ( +0MB) encode_query_results_for(rustc_query_impl::queries::promoted_mir) time: 0.000; rss: 160MB -> 160MB ( +0MB) encode_query_results_for(rustc_query_impl::queries::adt_def) time: 0.000; rss: 160MB -> 160MB ( +0MB) encode_query_results_for(rustc_query_impl::queries::unsafety_check_result) time: 0.000; rss: 160MB -> 160MB ( +0MB) encode_query_results_for(rustc_query_impl::queries::thir_check_unsafety) time: 0.000; rss: 160MB -> 160MB ( +0MB) encode_query_results_for(rustc_query_impl::queries::typeck) time: 0.000; rss: 160MB -> 160MB ( +0MB) encode_query_results_for(rustc_query_impl::queries::diagnostic_only_typeck) time: 0.000; rss: 160MB -> 160MB ( +0MB) encode_query_results_for(rustc_query_impl::queries::used_trait_imports) time: 0.000; rss: 160MB -> 160MB ( +0MB) encode_query_results_for(rustc_query_impl::queries::mir_borrowck) time: 0.000; rss: 160MB -> 160MB ( +0MB) encode_query_results_for(rustc_query_impl::queries::eval_to_allocation_raw) time: 0.000; rss: 160MB -> 159MB ( -1MB) encode_query_results_for(rustc_query_impl::queries::eval_to_const_value_raw) time: 0.000; rss: 159MB -> 159MB ( +0MB) encode_query_results_for(rustc_query_impl::queries::check_match) time: 0.000; rss: 159MB -> 159MB ( +0MB) encode_query_results_for(rustc_query_impl::queries::symbol_name) time: 0.000; rss: 159MB -> 159MB ( +0MB) encode_query_results_for(rustc_query_impl::queries::codegen_fn_attrs) time: 0.000; rss: 159MB -> 159MB ( +0MB) encode_query_results_for(rustc_query_impl::queries::codegen_fulfill_obligation) time: 0.001; rss: 159MB -> 159MB ( +0MB) encode_query_results_for(rustc_query_impl::queries::specialization_graph_of) time: 0.000; rss: 159MB -> 159MB ( +0MB) encode_query_results_for(rustc_query_impl::queries::adt_drop_tys) time: 0.000; rss: 159MB -> 159MB ( +0MB) encode_query_results_for(rustc_query_impl::queries::adt_significant_drop_tys) time: 0.000; rss: 159MB -> 159MB ( +0MB) encode_query_results_for(rustc_query_impl::queries::unused_generic_params) time: 0.003; rss: 161MB -> 159MB ( -3MB) encode_query_results time: 0.004; rss: 161MB -> 159MB ( -3MB) incr_comp_serialize_result_cache time: 0.004; rss: 161MB -> 159MB ( -3MB) incr_comp_persist_result_cache time: 0.000; rss: 159MB -> 159MB ( +0MB) incr_comp_persist_dep_graph time: 0.004; rss: 161MB -> 159MB ( -3MB) serialize_dep_graph time: 0.003; rss: 159MB -> 115MB ( -44MB) free_global_ctxt time: 0.061; rss: 146MB -> 115MB ( -30MB) LLVM_passes(crate) time: 0.000; rss: 114MB -> 114MB ( +0MB) join_worker_thread time: 0.001; rss: 114MB -> 114MB ( +0MB) copy_all_cgu_workproducts_to_incr_comp_cache_dir time: 0.002; rss: 115MB -> 114MB ( -1MB) finish_ongoing_codegen time: 0.000; rss: 114MB -> 114MB ( +0MB) llvm_dump_timing_file time: 0.000; rss: 114MB -> 114MB ( +0MB) serialize_work_products time: 0.000; rss: 114MB -> 114MB ( +0MB) incr_comp_finalize_session_directory time: 0.000; rss: 114MB -> 114MB ( +0MB) link_binary_check_files_are_writeable time: 0.002; rss: 114MB -> 114MB ( +0MB) link_rlib time: 0.000; rss: 114MB -> 114MB ( +0MB) link_binary_remove_temps time: 0.003; rss: 114MB -> 114MB ( +0MB) link_binary time: 0.003; rss: 114MB -> 114MB ( +0MB) link_crate time: 0.005; rss: 115MB -> 114MB ( +0MB) link time: 0.196; rss: 39MB -> 95MB ( +56MB) total ```

It would be great to make self-profile smarter and only prints passes that take more than x milliseconds (.005 seconds seems like a reasonable cutoff). Then most lines would be trimmed but long running lints would still show up. It will still be possible to see the full information by running -Zself-profile, which is recommended for heavy-duty profiling anyway since it gives you more tools around the collected data.

Originally posted by @jyn514 in https://github.com/rust-lang/rust/issues/79953#issuecomment-743708396

jyn514 commented 2 years ago

Mentoring instructions: Change https://github.com/rust-lang/rust/blob/05d22212e89588e7c443cc6b9bc0e4e02fdfbc8d/compiler/rustc_data_structures/src/profiling.rs#L652 to only call print_time_passes_entry if start_time.elapsed() is longer than 5 ms.

@rustbot label: +E-easy +E-mentor

randomicon00 commented 2 years ago

@rustbot claim

randomicon00 commented 2 years ago

Hello @jyn514!

This is the first time I contribute to the rust project. I think I understand what is required to do, basically filtering out the passes and only keep the ones that take more than 5 milliseconds. Should I just add an if condition to the print function, and should the condition be hardcoded or configurable?

jyn514 commented 2 years ago

For now I think hardcoded is fine; eventually it would be nice to make it configurable, the same way it is for summarize.

cc @rylev , I think you've been doing a fair amount of performance work - do you have opinions?

rylev commented 2 years ago

I don't have strong opinions here, but I also don't use -Ztime-passes a lot.