Decorator that allows logging function arguments

itamarst commented 5 years ago

@log_call
def f(x, y):
    return x * y

Should log action module.f with arguments x and y and success field result.

Extended form:

@log_call("action_name", omit_params=['z'], omit_return=True)
def f(x, y, z):
    pass

As a first pass self will not be logged.

jtrakk commented 5 years ago

I've been playing around with this idea, here's what I came up with.

import functools
import inspect
import os
import json
import sys

import attr
import eliot

def json_to_file(file=None):

    if file is None:
        file = sys.stdout

    def _json_to_file(x):
        print(json.dumps(x, default=to_serializable), file=file)

    return _json_to_file

eliot.add_destinations(json_to_file(sys.stdout))

def log_call(
    _func=None, omit_params=(), bind=None, omit_return=False, log_level="DEBUG"
):
    if bind is None:
        bind = {}

    if _func is None:
        return functools.partial(
            log_call,
            omit_params=omit_params,
            bind=bind,
            omit_return=omit_return,
            log_level=log_level,
        )

    def wrapper(func):

        function_data = dict(
            action_type=func.__name__,
            module=func.__module__,
            package=func.__module__.split(".", maxsplit=1)[0],
            file=inspect.getfile(func),
            argv=sys.argv,
            pid=os.getpid(),
        )

        @functools.wraps(func)
        def _log_call(*func_args, **func_kwargs):
            call_data = dict(args=func_args, kwargs=func_kwargs, **function_data)
            call_data = {k: v for k, v in call_data.items() if k not in omit_params}
            call_data.update(bind)

            with eliot.start_action(**call_data):
                result = func(*func_args, **func_kwargs)
                if not omit_return:
                    eliot.Message.log(message_type="return", log_level=log_level)
                return result

        return _log_call

    return wrapper(_func)

@functools.singledispatch
def to_serializable(obj):
    try:
        return attr.asdict(obj)
    except attr.NotAnAttrsClassError:
        return obj

# User code below

@log_call
def add(a, b):
    return a + b

@log_call
def product(*args):
    return functools.reduce(lambda x, y: x * y, args)

@log_call
def calculate(a, b):
    x = add(a, b)
    return product(a, b, x)

if __name__ == "__main__":
    answer = calculate(1, 2)
    try:
        answer = calculate(1.0, "a")
    except TypeError as e:
        pass

# $ python loggit.py | eliot-tree
# a494a2c1-d1e0-429b-a9cb-829e637f7979
# └── calculate/1 ⇒ started 2018-12-07 05:07:26 ⧖ 0.001s
#     ├── args:
#     │   ├── 0: 1
#     │   └── 1: 2
#     ├── argv:
#     │   └── 0: loggit.py
#     ├── file: loggit.py
#     ├── kwargs:
#     ├── module: __main__
#     ├── package: __main__
#     ├── pid: 20923
#     ├── add/2/1 ⇒ started 2018-12-07 05:07:26 ⧖ 0.000s
#     │   ├── args:
#     │   │   ├── 0: 1
#     │   │   └── 1: 2
#     │   ├── argv:
#     │   │   └── 0: loggit.py
#     │   ├── file: loggit.py
#     │   ├── kwargs:
#     │   ├── module: __main__
#     │   ├── package: __main__
#     │   ├── pid: 20923
#     │   ├── return/2/2 2018-12-07 05:07:26
#     │   │   └── log_level: DEBUG
#     │   └── add/2/3 ⇒ succeeded 2018-12-07 05:07:26
#     ├── product/3/1 ⇒ started 2018-12-07 05:07:26 ⧖ 0.000s
#     │   ├── args:
#     │   │   ├── 0: 1
#     │   │   ├── 1: 2
#     │   │   └── 2: 3
#     │   ├── argv:
#     │   │   └── 0: loggit.py
#     │   ├── file: loggit.py
#     │   ├── kwargs:
#     │   ├── module: __main__
#     │   ├── package: __main__
#     │   ├── pid: 20923
#     │   ├── return/3/2 2018-12-07 05:07:26
#     │   │   └── log_level: DEBUG
#     │   └── product/3/3 ⇒ succeeded 2018-12-07 05:07:26
#     ├── return/4 2018-12-07 05:07:26
#     │   └── log_level: DEBUG
#     └── calculate/5 ⇒ succeeded 2018-12-07 05:07:26

# 28890b81-6083-4f52-95ff-325e1db7e148
# └── calculate/1 ⇒ started 2018-12-07 05:07:26 ⧖ 0.000s
#     ├── args:
#     │   ├── 0: 1.0
#     │   └── 1: a
#     ├── argv:
#     │   └── 0: loggit.py
#     ├── file: loggit.py
#     ├── kwargs:
#     ├── module: __main__
#     ├── package: __main__
#     ├── pid: 20923
#     ├── add/2/1 ⇒ started 2018-12-07 05:07:26 ⧖ 0.000s
#     │   ├── args:
#     │   │   ├── 0: 1.0
#     │   │   └── 1: a
#     │   ├── argv:
#     │   │   └── 0: loggit.py
#     │   ├── file: loggit.py
#     │   ├── kwargs:
#     │   ├── module: __main__
#     │   ├── package: __main__
#     │   ├── pid: 20923
#     │   └── add/2/2 ⇒ failed 2018-12-07 05:07:26
#     │       ├── exception: builtins.TypeError
#     │       └── reason: unsupported operand type(s) for +: 'float' and 'str'
#     └── calculate/3 ⇒ failed 2018-12-07 05:07:26
#         ├── exception: builtins.TypeError
#         └── reason: unsupported operand type(s) for +: 'float' and 'str'

itamarst commented 5 years ago

Ah, neat, that's a nice trick for the decorator-that-is-also-a-decorator factory. Thank you, I might use some of this. (For results, BTW, action.add_success_field(result=result) is nicer since it's tied to the action instead of an independent message.)

itamarst commented 5 years ago

@jtrakk What are you using Eliot for, BTW?

jtrakk commented 5 years ago

add_success_field() is much nicer, thanks.

What are you using Eliot for, BTW?

So far, only fooling around to get a feel for it. I've got a few questions about usage which I'm still wondering about.

Is Eliot a good replacement for all logging?
Structlog's logging best practices recommends logging to stdout. Does that apply only to server-run applications, or also CLI and GUI applications? In a pipeable cli tool, does that mean switching to stderr instead? It seems like just logging to stderr by default means I could use the same code on a server as in a CLI pipeline.
How should I generate log messages that should only be produced during debugging, such as if the logging message is expensive to generate?
Should log levels be defined as a class LogLevel(enum.Enum)? Does Eliot have a recommended/included enum of levels to use?
How does the application programmer know whether someone should be awoken for a given error state? The application might change its significance in the business, such that an error in subroutine S of application A might seem critical at the time it's written, but become unimportant later, simply because that application becomes a less important part of the business. Inversely, the application might seem unimportant at first, but later become critical to the business. Should those changes in the importance of the application require changes to the application code? Or is there some way to specify in the environment that what the application considers a critical state should in fact be handled as a WakeMeInTheMiddleOfTheNight. How should I think about this?
Some places in the docs have message_type="my_module:my_function". I'm wondering how that compares to (package="my_package", module="my_module", function="my_function"). The latter seems like a more "structured" option which might be easier to query without manually string parsing, but I'm not sure. When should I prefer one of these over the other?
Should I prefer using prespecified MessageTypes over Message.log() when possible?
It seems many messages could be thought of as having multiple types, or inheritance structure, or tag system. I could query for items in the "user_registration" sequence, or function_calls, or function_calls in the "user_registration" sequence. Would it be reasonable to use message_types=[...] or tags=, instead of message_type?

Sorry, I had more questions than I'd realized! Obviously no need to respond to any/all of them, I'm happy just to have them typed up :-)

itamarst commented 5 years ago

There are situations when Eliot isn't the best logging library, but I think it's pretty good in many situations :) Occasionally there are features that logging has that it doesn't, but they can be added if necessary.
Structlog's best practices are for servers. For command line tools, I would say write to some known location (maybe piped through eliot-tree for easy reading) and then on crashes tell the user the path so they can upload that file (or better yet, give them option to automatically upload it, e.g. open a github issue for them if relevant).
Maybe if DEBUG:, where DEBUG is set by a command-line option?
Eliot doesn't have particular log levels by default. Note that enums won't get logged correctly at the moment, so you'll need to log the string level. I'm personally not particularly enthused about log levels for simple applications, so tend to omit them. For CLI in particular, it's not like you expect monitoring alerts to go off, so not sure it adds much.
It's hard to decide what constitutes "wake me up!". Thus for servers the approach is usually monitoring-oriented, which separates the decision of what to do from the decision of what to generate. You can still monitor logs.
message_type is required for Message.log(). If you omit it you'll get empty string as the message_type. (Similarly, action_type is required for actions). If you want you can also add package etc. I tend to not give package/module/function-specific names, but more high level "this is what this does", since refactoring moves stuff around occasionally.
MessageType is probably overkill for initial use.
Again, message_type is required. You may wish to have more sophisticated retrieval of messages someday, but quite possibly not. So I suggest just starting with message types and see how far that gets you.

I'll try to update documentation based on all the above, so should at least file new issues to that effect before closing this ticket.

jtrakk commented 5 years ago

Thanks for the comprehensive response!

One more, related to (6).

Is including the function_data section in my example a good idea? Setting the package, module, function name etc. automatically seems like it would provide the benefits of being maximally informative while avoiding the problems of refactoring that you mentioned. Since that part can be executed at import time, it doesn't seem like much of a performance issue either. I'm inclined to try this approach -- do you have any concerns about it?

jtrakk commented 5 years ago

On (2),

Structlog's best practices are for servers. For command line tools, I would say write to some known location (maybe piped through eliot-tree for easy reading) and then on crashes tell the user the path so they can upload that file (or better yet, give them option to automatically upload it, e.g. open a github issue for them if relevant).

Perhaps then the solution for a user-interactive CLI or GUI app is to write to either stderr or into the appdirs.user_log_dir() directory (via appdirs) depending on a switch or config.

itamarst commented 5 years ago

Writing logs to file by default, with flag to show them on stderr sounds good, yes.

For this particular decorator, yes, since it's tied to function including those automatically seems reasonable. For "where was this logged from", see #324.

itamarst commented 5 years ago

A failure mode I've seen in my prototype at work: change argument name, and now we're logging giant arrays. So maybe should be using whitelisting instead of blacklisting. Still a failure mode (missing log data), but logging should do no harm, and logging massive arrays to JSON has significant performance impact.

jtrakk commented 5 years ago

The omit_params list includes a string that is missing from the argspec? If that's the issue, perhaps a warning or pylint plugin might detect the problem early.

itamarst commented 5 years ago

Might be detectable that way, I suppose. Another alternative is "don't log overly large arrays".

jlumbroso commented 3 years ago

@jtrakk Your decorator is very nice code, thank you for sharing. Only comment is that omit_params is filtering the call_data dictionary, not the kwargs dictionary.

Edit: Here's my version, https://gist.github.com/jlumbroso/1b3736b13cbf55a6e7aa52eb741eced5#file-loguru_call_decorator-py

itamarst / eliot

Decorator that allows logging function arguments #338