dbt-labs / metricflow

MetricFlow allows you to define, build, and maintain metrics in code.
https://docs.getdbt.com/docs/build/about-metricflow

Add logger for efficient and convenient logging of objects #1343

Open · plypaul opened 1 month ago

plypaul commented 1 month ago

From the docstring:

In MF, there is a need to log complex objects like DAGs and paths for debugging. Previously, the pattern for logging those objects used a pretty-formatting method like:

    logger = logging.getLogger(__name__)
    ...
    logger.debug(mf_pformat_many("Found matching path.", {"matching_path": matching_path, "source_nodes": source_nodes}))

    ->

    Found matching path.
      matching_path: [1, 2, 3]
      source_nodes: ["a", "b", "c"]

However, there were a few issues with that pattern, chief among them that the objects were pretty-formatted eagerly, even when the corresponding logging level was disabled, and that the calls were verbose.

This class supports a new approach: logging calls take keyword arguments that are only converted to strings when the corresponding logging level is enabled. The logger is created via a configurable factory that can be used to implement different ways of logging those objects (e.g. JSON).

logger = mf_get_logger()
...
logger.debug("Found matching path.", matching_path=matching_path, source_nodes=source_nodes)

To reduce verbosity, a signature for the logging calls that allows non-named arguments and uses the variable names as keys in the output was also considered:

logger.debug("Found matching path.", matching_path, source_nodes)

This approach was prototyped using the varname library to resolve the variable name, but it was found to be too slow (~0.7s for 1000 calls).

github-actions[bot] commented 1 month ago

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.

plypaul commented 1 month ago

Why do we need to re-implement Python's logger interface here?

We don't have to. The main issues I was trying to resolve are listed in the PR description. I'm up for any solution that can address those. It sounds like you have something else in mind - see below.

Did you consider, instead, making the formatter an object with overrides for the __str__ and __repr__ methods that would produce the formatted output? That obviates the need for all this error handling (which is not without its costs) and fallback logic; it eliminates the logging factory, and it allows everybody to use the Python logger module as normal and also take advantage of a LazyFormatter or whatever in contexts where they might not, in fact, need it for logging. We might even consider using a subclass of some of Python's formatting classes for this.

Can you elaborate on the mechanism / function interface for this formatter? I considered something like a lazy proxy, but couldn't think of a way to make it work well.

plypaul commented 1 month ago

I'm requesting changes here because I really don't want us to keep building custom things that are better solved with more standard implementations. If there's a compelling reason to do all of this wrapping and log redirection we should discuss it, but I don't see one described in this PR.

To confirm, you read the PR description?

tlento commented 1 month ago

To confirm, you read the PR description?

Yes, I did, and my contention is that there is nothing in there to justify overriding the built-in logger mechanisms with a custom thing that may not, in fact, conform to reasonable and widespread user expectations of how the logger mechanisms work.

There are a number of options but I think the two most reasonable are:

  1. make a lazy formatter class that does the work when str() is called on it
  2. add a custom LogHandler and configure all of our loggers to do something fancy when it makes the LogRecord

I haven't looked too deeply into option 2 but it doesn't seem worth the effort. Instead, I recommend option 1: add a class wrapper for the format operations and defer formatting through __str__ overrides, as recommended in the documentation. You basically just delegate the formatting into __str__, and then any call to str() will trigger the format operation.
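
Concretely, a minimal sketch of what such a wrapper could look like (the name LazyFormatter and the pprint-based formatting are assumptions for this sketch, not a settled API):

from pprint import pformat
from typing import Any

class LazyFormatter:
    """Defers formatting until str() is called, e.g. by the logger's %s substitution."""

    def __init__(self, *args: Any, **kwargs: Any) -> None:
        self._args = args
        self._kwargs = kwargs

    def __str__(self) -> str:
        # Runs only if the log record is actually emitted.
        parts = [obj if isinstance(obj, str) else pformat(obj) for obj in self._args]
        parts.extend(f"  {key}: {pformat(value)}" for key, value in self._kwargs.items())
        return "\n".join(parts)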

It's worth noting that the same rule applies to the format args passed into the log call - the logger uses printf-style argument substitution, which means any %s directive will cast the corresponding argument to a string by calling str() on it. In the context of a runtime log call you could then do this, using the standard Python logger:

logger.info("Dataflow plan is:\n%s", LazyFormatter(dataflow_plan))

Or:

dataflow_plan = LazyFormatter(dataflow_plan)
optimized_plan = LazyFormatter(optimized_plan)
logger.info(
    "Dataflow plan is:\n\n" +
    "%(dataflow_plan)s" + "\n\n" +
    "Optimized plan is:\n\n" +
    "%(optimized_plan)s",
    {'dataflow_plan': dataflow_plan, 'optimized_plan': optimized_plan}
)

This solves all of the problems laid out in the PR description. Instead of overriding the entire logging framework with something that will almost certainly perform worse, we simply improve our formatter to allow for lazy evaluation and swap from f-strings to %s style format strings.

That migration is, I admit, annoying. But it avoids all of the debt associated with taking a common, widely used construct like the Python logger and wrapping it in something that maybe works the same way. I personally have no way of even evaluating how these log calls get handled if someone, say, applies a logging config to filter most of them out. Worth noting, that's something that can happen outside of MetricFlow, because the container application expects to be able to control these behaviors and they might get quite cranky if we're calling all these log methods and doing all this delegation and possibly even doing added exception handling when they thought they'd removed all of that logic from their runtime.

tlento commented 1 month ago

Finally, for cases where we are not currently using format strings, we can go from this:

        logger.info(
            mf_pformat_many(
                description="Input conversion result for query():",
                obj_dict={
                    "metrics": metrics,
                    "group_by": group_by,
                    "order_by": order_by,
                },
            )
        )

To this:

        logger.info(
            mf_pformat_lazy(
                description="Input conversion result for query():",
                metrics=metrics,
                group_by=group_by,
                order_by=order_by,
            )
        )

No logger framework overrides are needed; all we require is for mf_pformat_lazy to return the appropriately configured lazy formatter object.
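
Assuming a LazyFormatter along the lines sketched earlier in the thread, mf_pformat_lazy could be as thin as this (hypothetical):

from typing import Any

def mf_pformat_lazy(description: str, **obj_dict: Any) -> LazyFormatter:
    # Nothing is stringified here; LazyFormatter.__str__ does the work
    # only when the record is actually emitted.
    return LazyFormatter(description, **obj_dict)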

plypaul commented 1 month ago

I haven't looked too deeply into option 2 but it doesn't seem worth the effort. Instead, I recommend option 1: add a class wrapper for the format operations and defer formatting through __str__ overrides, as recommended in the documentation. You basically just delegate the formatting into __str__, and then any call to str() will trigger the format operation.

Do you mean something like logger.info(LazyFormatter(...))?

Instead of overriding the entire logging framework with something that will almost certainly perform worse

It sounds like you see something that will impact performance. What is it?

I personally have no way of even evaluating how these log calls get handled if someone, say, applies a logging config to filter most of them out.

It would make no difference. The class implemented here is a wrapper that just formats strings before passing them to the standard logger.

logger.info("Dataflow plan is:\n%s", LazyFormatter(dataflow_plan))

This solves all of the problems laid out in the PR description.

How does this resolve the ability to output the object as JSON?

Regarding your approach, you're saying that your preference is to migrate to:

dataflow_plan = LazyFormatter(dataflow_plan)
optimized_plan = LazyFormatter(optimized_plan)
logger.info(
    "Dataflow plan is:\n\n" +
    "%(dataflow_plan)s" + "\n\n" +
    "Optimized plan is:\n\n" +
    "%(optimized_plan)s",
    {'dataflow_plan': dataflow_plan, 'optimized_plan': optimized_plan}
)

Instead of the proposed interface:

logger.info(
    "Finished optimizing plan.", 
    original_dataflow_plan=dataflow_plan, 
    optimized_plan=optimized_plan
)

due to the desire to stick with the standard interface of the Logger class?

tlento commented 1 month ago

It sounds like you see something that will impact performance. What is it?

I mean more on the end user (i.e., developer) side, not runtime performance, but there is an issue with us adding yet another thread safety lock call in here. Wouldn't we be better off not doing that and delegating thread safety for logging to the object built and designed to manage it?

Overall, my desire is to avoid putting object shims and custom implementations of totally standard things in the way of every developer who has to use these things on a daily basis. You've created a wrapper shim around the entire logging interface that is a glorified message formatter, and we now have to contend with it.... when? Always? Only when we call mf_pformat()? I'm not sure.

Consequently, I'm proposing we make that distinction clear instead of clobbering most of the direct access functionality Python's logger affords us and hiding it behind a shim. In practice, I'd expect an interface like this:

logger.info(
    LazyFormatter(
        "Finished optimizing plan.", 
        original_dataflow_plan=dataflow_plan, 
        optimized_plan=optimized_plan,
    )
)

With the user having the option of doing something like this:

logger.info("Dataflow plan is:\n%s", LazyFormatter(dataflow_plan))

What would I do when I want to emit a log as JSON (which, I'll note, the PR description says we don't want to do here, and frankly I'm not sure how this change matters for that), or alter the formatting in some other way, or add extras to a particular log call, or anything else the Python logger interface supports? I would do what one does when trying to manipulate a given Python LogRecord object - use Filter and Formatter and possibly LogHandler objects to do it, if necessary, or just pass in the supported configuration parameters and arguments to various methods to make it happen.

The interface you've added doesn't give me the access I expect for those things, even though the callsites look the same. I have to wade through this wrapper, or maybe write a custom class to do handling with a different log factory. It is in fact unclear to me what I'm expected to do in these cases.

plypaul commented 1 month ago

What would I do when I want to emit a log as JSON (which, I'll note, the PR description says we don't want to do here, and frankly I'm not sure how this change matters for that), or alter the formatting in some other way, or add extras to a particular log call, or anything else the Python logger interface supports?

To make sure that we're talking about the same thing - when I say output the object as JSON, the use case (not mine) that this is trying to cater to is that whenever an object is logged, a JSON representation of the object is included via the extra field in the logging call. To use your example:

logger.info(
    LazyFormatter(
        "Finished optimizing plan.", 
        original_dataflow_plan=dataflow_plan, 
        optimized_plan=optimized_plan,
    ),
    extra={
        "original_dataflow_plan": dataflow_plan.json(), 
        "optimized_plan"=optimized_plan.json()
    }
)

When I hear "emit a log as JSON", I take that to mean writing the log record as a JSON object. That would be handled by a JSON formatter attached to the standard Python logger, and it would be no different in the proposed case.
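
For reference, a minimal sketch of such a formatter (illustrative only; a real setup might use a library like python-json-logger instead):

import json
import logging

class JsonLogFormatter(logging.Formatter):
    """Renders each LogRecord as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps(
            {
                "level": record.levelname,
                "name": record.name,
                "message": record.getMessage(),
            }
        )

handler = logging.StreamHandler()
handler.setFormatter(JsonLogFormatter())
logging.getLogger().addHandler(handler)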

The interface you've added doesn't give me the access I expect for those things, even though the callsites look the same.

If the similarity is the issue, then the interface can be changed / renamed (e.g. ObjectLoggingHelper) but I suspect that's not your concern. Your concern is more to do with a layer between the .info line in MF code and the standard-library Python logger. Is that right?

plypaul commented 1 month ago

I mean more on the end user (i.e., developer) side, not runtime performance, but there is an issue with us adding yet another thread safety lock call in here. Wouldn't we be better off not doing that and delegating thread safety for logging to the object built and designed to manage it?

That lock was added as a defensive measure against bugs in custom implementations and can be removed.

Overall, my desire is to avoid putting object shims and custom implementations of totally standard things in the way of every developer who has to use these things on a daily basis.

I am a developer who works with this code base heavily on a daily basis, and I am tired of writing verbose log lines. My desire is to simplify that with some convenience measures.

You've created a wrapper shim around the entire logging interface that is a glorified message formatter, and we now have to contend with it.... when? Always? Only when we call mf_pformat()? I'm not sure.

If you're not sure about that, I don't think you understand what's in here. I don't see how mf_pformat() is related.

tlento commented 1 month ago

Let's focus on the problem I have with this PR:

We appear to be reinventing a bunch of stuff in the name of small conveniences. When we've done this in the past it has burned us over and over and over again. Maybe I'm over-rotating on this because I'm sick of it, but I also don't see much benefit here. The small convenience, in this case, is a single call to a LazyFormatter object instead of it being hidden behind our wrapped_logger.info() call. So this:

# metricflow
logger.info(LazyFormatter("I will be formatted!", hello=hello_obj))

# server side
logger.info(ExtrasLazyFormatter("I will be formatted with extras!", hello=hello_obj))

vs this:

logger.info("I will be formatted!", hello=hello_obj)

Something to point out about the more streamlined interface - some of those kwargs are reserved. For example, you can't log an extra kwarg as part of the pretty formatter call, even if you really want to. So we need to make sure to take that into account if we stick with the approach in this PR (and if we haven't already - I read through the first commit and stopped).

Now, getting to the server-side call with just the formatter object takes some doing, and it's a bit hacky, but a much safer and more limited-scope form of hacky than what we have in this PR. What we need to do depends quite a bit on what's required on the server side as well. So, which of these do we need?

  1. We need to call logger.info() with an extra parameter on the server side only for server side logging
  2. We need to magically make all of the logger calls in MetricFlow conform to the format expected over there
  3. We need both of these (in which case we don't need an ExtrasLazyFormatter, we can just use one LazyFormatter)

plypaul commented 1 month ago

I have a few issues with the comments / points, and in the interest of reducing the back and forth, let's chat about it in our sync today.

tlento commented 1 month ago

First of all, I want to make a public apology to @plypaul for the tone and tenor of my comments. I was not in a good mood, but that's no excuse and I try to be better than that.

Per our discussion, two other approaches we're considering are:

  1. a single input to allow for flexible formatting
  2. a make_extras method to allow for unified formatting of extras across open source and the server side

Option 1 might be something like:

from typing import Any, Dict, TypedDict

class MFLogComponents(TypedDict):
    """Container for log message components. This could be formattable itself to allow direct logging."""

    msg: Any
    extra: Dict[str, Any]  # values are JSON-serializable inputs

def mf_log_msg(msg: Any, **extras: Any) -> MFLogComponents:
    # do stuff here to make these all formattable
    return MFLogComponents(msg=msg, extra=extras)

# To log with extras:
logger.info(**mf_log_msg("I'm logging an object!", my_object=some_object))

# To log as a formatted string without extras:
logger.info(mf_log_msg("I'm logging an object inlined in the message!", my_object=some_object))

Option 2 is simpler, but it's most useful if we decide to always follow this pattern of logging a simple string with all objects in extras:

def make_extras(**kwargs) -> Dict[str, Any]:
    """Makes an extras dict for logging. All values will be JSON serializable."""
    # Do stuff to make a dict with json serializable values, as per whatever spec you like.
    return serializable_extras

# To log with extras:
logger.info("I'm logging an object!", extra=make_extras(my_obj=some_obj)

Option 2 does not directly support logging the extras inline in the message as well as in extras only - that would have to be done via a Formatter class. But it means we can just use all of the standard stuff.
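
For completeness, a rough sketch of what that Formatter could look like (this assumes the values passed via extra= land on the LogRecord as attributes, which is standard logging behavior; the class name is illustrative):

import logging
from pprint import pformat

# Attributes present on every LogRecord; anything beyond these arrived via extra=.
_STANDARD_ATTRS = set(logging.LogRecord("", 0, "", 0, "", (), None).__dict__)

class InlineExtrasFormatter(logging.Formatter):
    """Appends extra= values found on the record to the rendered message."""

    def format(self, record: logging.LogRecord) -> str:
        # Capture extras before super().format(), which adds attributes like `message`.
        extras = {k: v for k, v in record.__dict__.items() if k not in _STANDARD_ATTRS}
        base = super().format(record)
        return base + "".join(f"\n  {key}: {pformat(value)}" for key, value in extras.items())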

tlento commented 1 month ago

@plypaul there are two other options I just landed on while writing that comment. These only really make sense if we're going to always use extra in the log call.

  1. We can create a custom LoggerAdapter to do the relevant wrangling of the input kwargs inside of an overridden process method (see the sketch at the end of this comment). You basically get the interface from this PR with a layer of indirection for the cases where you want this object redirection. This isn't really what LoggerAdapter is for, though, so I'm not super thrilled about it, but it has the advantage of letting you log object-formatted stuff when you want, on a callsite-by-callsite basis, and not fuss with it when you don't.
  2. Subclass the base Logger class and override the **kwargs handling on the main input interface methods to build extras first and then pass everything else along. By default Python's Logger does not accept kwargs that aren't exc_info, stack_info, stacklevel, and extra, so we can just conform the inputs to that expectation as appropriate.

I'm a bit skittish about overrides on the Logger base class because I don't fully understand everything it's doing under the covers, but if they're just re-organizing keyword arguments and then passing those through to the superclass method it's probably safe enough.
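
For what it's worth, a rough sketch of the LoggerAdapter idea from option 1 (the adapter name and the exact kwarg wrangling are assumptions about what the process override would need to do):

import logging
from typing import Any, MutableMapping, Tuple

class ObjectLoggerAdapter(logging.LoggerAdapter):  # hypothetical name
    """Moves unrecognized kwargs into extra= inside the documented process() hook."""

    _RESERVED = {"exc_info", "stack_info", "stacklevel", "extra"}

    def process(self, msg: Any, kwargs: MutableMapping[str, Any]) -> Tuple[Any, MutableMapping[str, Any]]:
        extra = dict(kwargs.pop("extra", None) or {})
        for key in [k for k in kwargs if k not in self._RESERVED]:
            extra[key] = kwargs.pop(key)
        kwargs["extra"] = extra
        return msg, kwargs

logger = ObjectLoggerAdapter(logging.getLogger(__name__), {})
logger.info("Finished optimizing plan.", optimized_plan={"nodes": 3})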