mmtk / mmtk-core

Memory Management ToolKit
https://www.mmtk.io
Other
379 stars 69 forks source link

Use wrapper for logging. #1225

Open wks opened 2 weeks ago

wks commented 2 weeks ago

We no longer use the max_level_trace and release_max_level_off features of the log crate. By doing this, mmtk-core, as a library, won't conflict with its users which need to use the log level features of the log crate.

We no longer directly use logging macros in the log crate. Now all logging operations are done through wrapper macros in the crate::util::log module. Logs of level DEBUG and TRACE are disabled at compile time for release builds, and can be enabled using the "hot_log" Cargo feature. By doing this, we can disable logging statements on hot paths in mmtk-core, only, without affecting log statements in other crates. It also allow developers to use logs in release builds in order to debug the performance.

TODO:

k-sareen commented 2 weeks ago

In my opinion, the default for release build should be no logs. But we should add a cargo feature that allows info-level logs (and above) for release builds. Also "hot_log" is not a very descriptive feature name. Maybe it should be "release_hot_path_logging" or "release_max_level_trace" to be similar to the log crate feature.

EDIT:

In my opinion, the default for release build should be no logs.

Reason for this being that we don't want to break expectations of runtimes such as OpenJDK etc.

wks commented 2 weeks ago

In my opinion, the default for release build should be no logs.

Actually the right way to do this is not bundling env_logger, and letting the mmtk-openjdk binding select the logging implementation. Since the binding selects the log implementation, it will be aware of which log level to print. We may still provide env_logger as an optional dependency, and may even be a default dependency so that first-time mmtk-core developer can still see some logs.

Alternatively we can add a section in the tutorial and let the binding depend on env_logger and initialize the logger on start-up. This is the right way to use loggers in Rust anyway.

qinsoon commented 2 weeks ago

Actually the right way to do this is not bundling env_logger, and letting the mmtk-openjdk binding select the logging implementation. Since the binding selects the log implementation, it will be aware of which log level to print.

We made it clear here that if the binding wants to initialize its logger, they should do it before initializing MMTk. In that case, MMTk will fail to set up a logger and use the logger from the binding. Otherwise MMTk will initialize a logger itself: https://github.com/mmtk/mmtk-core/blob/3830168b884e922ee354b6f19845fb6eec89bf26/src/memory_manager.rs#L41

This is a resonable choice. For MMTk, most VMs are not written in Rust, and they do not have their own Rust logger set up. If every binding needs to set up the same logger, why not set it up in mmtk-core? Also we allows the binding to set up a logger if they want to.

qinsoon commented 2 weeks ago

I dont see why we need custom logging functions. I prefer using Rust's standard logging. If the binding wants to control the logging, we should allow it. If they don't, we use the logger we initialize and control the logging. If our current implementation uses features to disable logging in release builds, or to control logging levels which cannot be changed by the binding, we can fix that.

qinsoon commented 2 weeks ago

One potential benefit of using custom logging functions is that later we can forward the logging calls to the binding so if necessary, the binding can use the VM logging facilities and MMTk logs will be part of the VM logs. Other than this, I can't think of why we want to deviate from Rust standard logging.

k-sareen commented 2 weeks ago

This is a resonable choice. For MMTk, most VMs are not written in Rust, and they do not have their own Rust logger set up. If every binding needs to set up the same logger, why not set it up in mmtk-core? Also we allows the binding to set up a logger if they want to.

I think Kunshan's point was that for cases like OpenJDK or ART, they don't print logs to stderr but print to a log file or the logcat buffer in ART's case. So it may be the case that mmtk-openjdk or mmtk-art will want to change the logging implementation (and indeed I already use the android_log crate).

But I agree with your general sentiment. It should be lowest overhead to get it to work imo. env_logger is a good out-of-the-box solution.

wks commented 2 weeks ago

... For MMTk, most VMs are not written in Rust, and they do not have their own Rust logger set up. If every binding needs to set up the same logger, why not set it up in mmtk-core? Also we allows the binding to set up a logger if they want to.

Like @k-sareen said, some VMs do have their own logging framework. OpenJDK, for example, has its own unified logging framework described in https://openjdk.org/jeps/158, and has a category for "gc" which can be enabled by -Xlog:gc or set with more details like -Xlog:gc=trace:file=gctrace.txt. Customizing the logger will allow the mmtk-openjdk binding to integrate our logs into their logging framework.

Anyway, I don't intend to remove the env_logger we provide by default. I opened another PR https://github.com/mmtk/mmtk-core/pull/1226 for making env_logger an optional dependency, but still enabled by default.

I dont see why we need custom logging functions.

There are several reasons.

One is that the de-facto standard logging facade log has its limitations. At compile time, it can enable/disable logs by level, but not by scope. At run time, it can only filter by the max log level. See this issue, section "log arguments evaluated even when disabled at run time". By introducing our own wrappers, we can disable MMTk-related logs at compile time without affecting other crates. For example, by disabling the hot_log feature, the user can enable the trace log level for their own Rust programs while the MMTk GC still runs at full speed without debug! or trace! in the hot paths. One example of Rust programs in VMs is Ruby's YJIT compiler. It currently uses its own Log implementation, but MMTk shouldn't prevent it from using the log crate as a facade.

The other problem is for engineering. There are other more advanced logging interfaces and implementations in Rust, including slog and tracing. If we want to make a project-wide change, and all logging operations already go through our own wrapper, we can only change that wrapper and quickly adapt to a new logging framework.

qinsoon commented 2 weeks ago

Remove (release_)maxlevel*

The log crate states that libraries should not use max_level_* and release_max_level_*. If we/MMTk specifies max_level_*/release_max_level_*, the binding cannot override it. The only way I can find is to set those features as our default features so the binding can disable our default features, which seems awkward anyway.

So it seems obvious that we should not use max_level_* and release_max_level_* in mmtk-core. I think we can agree on this easily.

hot_log and custom logging

This PR introduces hot_log and custom logging functions in order to control trace/debug level logging in a release build. I guess the intention is to avoid introducing overhead for the log level check in a release build. However, this may be unnecessary.

We are uncertain about the performance impact of the log-level check. I tried with a few benchmarks with OpenJDK (where fastpaths are implemented in the binding side with no logging). I compared release_max_level_off and release_max_level_trace with OpenJDK and RUST_LOG=off, and there was no difference. We should run all the benchmarks to verify the point. But it is possible that allowing log-level checks on release builds is fine for an efficient runtime where the fastpaths are all implemented on the VM side (no Rust logging).

Based on the log crate, controlling log levels ((release_)max_level_*) should be in the user/binding side. We probably shouldn't work around the issue by introducing our own feature and logging methods to gain the control of logging levels in mmtk-core (at the library side). Instead, we can document that we recommend users to use release_max_level_info to avoid possbible overhead.