istresearch / scrapy-cluster

This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.
http://scrapy-cluster.readthedocs.io/
MIT License
1.18k stars 324 forks

LogFactory Callbacks #91

Closed madisonb closed 7 years ago

madisonb commented 7 years ago

It would be nice if we were able to set callbacks to provide additional custom functionality within an application that uses the LogFactory logger. I may want to integrate with other monitoring/hosted solutions or add additional logic within my application when an error is raised elsewhere.

I think this has two parts:

a) Add additional debugback, infoback, warnback, errorback, and criticalback parameters to the LogFactory, each called when the corresponding log method (e.g. .info("message")) is used.

b) Add an override within each traditional log level function, so a single call can opt out of the callback.

For example, I might want to use .error() whenever the code has a problem, but only fire the callback in one specific place in my code. Taking that further, it may be better to pass the callback as a function parameter, so that one .error() call can use a different callback than another.

That becomes annoying, however, if you have to specify the callback at every call site, so the solution may also require a third option:

c) Add a function parameter that specifies a special callback, overriding the default one. This conflicts somewhat with (b), though, since having both new parameters may be confusing.

If we used only (c), our methods would look like:

def info(self, message, extra=None, callback=None):
    # avoid a mutable default argument
    extra = extra or {}
    # stuff ...
    if callback is not None:
        callback(message, extra, 'INFO')

I actually like this idea best, perhaps with a try/except surrounding the callback as well.
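A minimal sketch of option (c) with that guard in place; the class name, `_write` helper, and output format are illustrative, not the actual LogFactory internals:

```python
class SketchLogger:
    """Illustrative logger showing option (c): a per-call callback override."""

    def info(self, message, extra=None, callback=None):
        extra = extra or {}
        self._write(message, extra, 'INFO')
        if callback is not None:
            try:
                callback(message, extra, 'INFO')
            except Exception as e:
                # never let a broken callback take down logging itself
                self._write("callback raised: {}".format(e), {}, 'ERROR')

    def _write(self, message, extra, level):
        # stand-in for the real formatting/output logic
        print("[{}] {} {}".format(level, message, extra))
```

A call site that does not pass `callback` behaves exactly as before, which keeps the change backwards compatible.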

madisonb commented 7 years ago

Some other options include the following:

d) Have a 'godback' callback function and use the extras/message to determine what you would like to do

def errback(message, extra):
    if extra.get('something') == 'val':
        # This is a kafka error; we handle this special case
        # somewhere else, so don't do it here
        return
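A runnable sketch of option (d): one catch-all errback that dispatches on the extras dict. The `error_type` values and the `ALERTS` sink are hypothetical, chosen only to show the dispatch pattern:

```python
# Collected alerts; a real integration would call a monitoring API instead.
ALERTS = []

def errback(message, extra):
    """Single catch-all error callback that dispatches on the extras dict."""
    error_type = extra.get('error_type')
    if error_type == 'KafkaUnavailableError':
        # Kafka errors are handled elsewhere in this sketch; skip them here
        return
    ALERTS.append((error_type, message))
```

The upside is a single place to wire up monitoring; the downside is that every special case ends up as a branch inside one function.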

e) Have a callback registration system where portions of the code can register callbacks that fire when extra_match is a subset of the extra field, like so:

logger = LogFactory.get_instance(json=True,
                                 stdout=False,
                                 name=traptor_name,
                                 level=log_level,
                                 dir=log_dir,
                                 file=log_file_name)

def kafka_error(msg):
    print("There was a kafka error! ", msg)

logger.register_errback(kafka_error, extra_match={
    'error_type': 'KafkaUnavailableError',
})

I don't know how this plays into the singleton factory design, or what happens when different threads or portions of the same codebase register functions potentially in different scopes, but it is an interesting idea regardless.
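The subset-matching part of option (e) is straightforward to sketch on its own. The class and method names here are hypothetical, not the LogFactory API, and this ignores the singleton/threading questions (a real version inside the shared factory would at least need a lock around the registry):

```python
class CallbackRegistry:
    """Sketch of option (e): a callback fires when its extra_match dict
    is a subset of the log call's extra dict."""

    def __init__(self):
        self._errbacks = []

    def register_errback(self, func, extra_match=None):
        self._errbacks.append((func, extra_match or {}))

    def error(self, message, extra=None):
        extra = extra or {}
        for func, match in self._errbacks:
            # subset test: every key/value in extra_match appears in extra
            if all(extra.get(k) == v for k, v in match.items()):
                func(message)
```

An empty `extra_match` acts as a catch-all, which recovers option (d) as a special case of this design.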

madisonb commented 7 years ago

PR addresses the issue raised here. Closing.