mmtk / mmtk-core

Memory Management ToolKit
https://www.mmtk.io
Other
379 stars 69 forks source link

Show logs in release build, and stop disabling logs by level. #1222

Open wks opened 3 weeks ago

wks commented 3 weeks ago

TL;DR: We should do the following:

Status quo

MMTk core uses two crates for logging

log can control the level of logs to include at compile time, using features such as max_level_trace, release_max_level_off. Less important log statements will be eliminated at compile time by dead code elimination.

env_logger controls the printed log at run time. By setting RUST_LOG=insert_log_level_here, it can turn on log messages at least as important as insert_log_level_here, but turn off less important ones. Log statements turned off this will still involve a run-time check. In some cases, the arguments of log statements may still be evaluated even when it should not print.

No logging in release build

Currently, mmtk-core enables the features max_level_trace and release_max_level_off of the log crate. That means enabling all logs in debug build, and disable all logs in release build at compile time.

The purpose is obvious. We would like to minimize the run-time cost of logging in release build. Some code paths in mmtk-core are so hot that even an extra check may greatly reduce the performance. We simply turn logs off at compile time and eliminate them from the machine code.

But this approach has an obvious drawback. Some log messages are important for performance debugging, too. For example, the time it takes to do a GC, the kind of GC (nursery or full-heap), what triggered GC (an allocation in a certain space or manually by System.gc()), the heap size at the beginning and/or the end of a GC, etc. We need these kinds of information in release build, too. Currently, when we need such information, we have to manually modify Cargo.toml to enable the logging.

Enabling some logging in release build

We should have logging statements enabled at run time for at least the INFO level and more important. Currently, log messages of INFO level are printed at most once per GC. They include

Logging of this frequency should not affect the performance of the program. When we enable those log statements and turn them off at run time, the checking cost should be negligible. Yet those log messages provide important information for performance debugging.

More precise control of log scopes

log arguments evaluated even when disabled at run time

If a log level is included at compile time, but turned off at run time, things become a bit subtle. The arguments of a log statement can be evaluated even if it is not to be printed.

The following snippet is an excerpt from the log crate. Note that a log level is "less" if it is less verbose, i.e. more important. OFF < ERROR < WARN < INFO < DEBUG < TRACE.

    // log!(target: "my_target", Level::Info, "a {} event", "log");
    (target: $target:expr, $lvl:expr, $($arg:tt)+) => ({
        let lvl = $lvl;
        if lvl <= $crate::STATIC_MAX_LEVEL && lvl <= $crate::max_level() {
            $crate::__private_api::log(
                $crate::__private_api::format_args!($($arg)+),
                lvl,
                &($target, $crate::__private_api::module_path!(), $crate::__private_api::loc()),
                (),
            );
        }
    });

lvl <= $crate::STATIC_MAX_LEVEL compares against a constant and allows the compiler to do dead code elimination.

lvl <= $crate::max_level() compares against the globally most verbose level. For example, if we set RUST_LOG=error,foo=warn,bar=info,baz=debug, the log::max_level() will be DEBUG.

As long as the test if lvl <= $crate::STATIC_MAX_LEVEL && lvl <= $crate::max_level() passes, the log arguments will be evaluated, regardless of which module the log statement is in. For example, in module foo there is a statement:

mod foo {
  fn a() {
    log::debug!("Hello {}", panic!("You shouldn't see me"));
  }
}

If we set RUST_LOG=error,foo=warn,bar=info,baz=debug, the log::debug! above will still panic because some other modules have DEBUG level enabled.

log crate features affect user crates

The max_level_xxxx features will transitively affect all crates in a binary, conflicting with their log level settings.

If the child crate sets the max_level_info feature

log = { version = "*", features = ["max_level_info"] }

and the parent crate doesn't specify any features:

log = { version = "*" }

Then the max log level statically enabled will be info. log::debug! logs in the parent crate will not be executed.

If the parent crate specifies a different feature:

log = { version = "*", features = ["max_level_trace"] }

It will generate a compilation error while compiling the log module:

   Compiling log v0.4.22
error: multiple max_level_* features set
   --> /path/to/log-0.4.22/src/lib.rs:362:1
    |
362 | compile_error!("multiple max_level_* features set");

Therefore, the log crate explicitly recommend against using compile-time filters in libraries.

Create a special logging macro for performance-sensitive logs.

We may create a special logging macro that can be completely disabled by a Cargo feature. For example,

my_very_verbose_logging_module::trace!("Hello {}", 42);

It is expanded to nothing in normal builds, but can be expanded to log::trace! if we enable a Cargo feature cargo build --features hot_logs.

Then we will manually decide whether we call log::trace! directly or use my_very_verbose_logging_module::trace!. And we will enable all log levels at compile time for the log crate and depend on the hot_logs feature to control those verbose logs.

qinsoon commented 3 weeks ago

The current status is that 1. we provide best performance in release builds wtihout any logging, and 2. enabling logging for a release build requires changing Cargo.toml.

The goal is 1. we still want to provide equally good performance in release builds, and 2. enabling logging for a release build should be easier.

I think we can allow STATIC_MAX_LEVEL for release builds to be INFO, but the default logging level (max_level()) is none/off. So we shouldn't see performance difference, and we can enable logging at runtime.

wks commented 2 weeks ago

The features of the log crate that specify log levels at compile time will be inherited to crates that use mmtk-core, and will affect all crates in the compiled binary. If any user has a different log level configuration using the features of log, it will conflict with mmtk-core's setting. The documentation of the log crate explicitly mentioned not to use those features in libraries. https://docs.rs/log/latest/log/#compile-time-filters So we must not specify any such features.

-log = { version = "0.4", features = ["max_level_trace", "release_max_level_off"] }
+log = { version = "0.4" }

And it is necessary to introduce our own filtering mechanism. A custom wrapper macro should suffice.