exasol / integration-test-docker-environment

A docker-based environment for integration tests with the EXASOL DB.
https://exasol.github.io/integration-test-docker-environment/
MIT License

Add support of file logging for Luigi tasks #205

Closed: tomuben closed this issue 2 years ago

tomuben commented 2 years ago

Background

By default, Luigi logs to the console only. This is a problem when we want to store the logs of CI builds, or when customers need to share logs.

Luigi already accepts Python logging configuration files. We need to create this file on the fly and pass it to luigi.build, e.g. luigi.build([task], workers=5, local_scheduler=True, logging_conf_file="luigi.conf").
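A minimal sketch of generating such a file on the fly, assuming the standard `logging.config.fileConfig` INI format. The helper name `write_log_config`, the handler and formatter names, and the log format are illustrative assumptions, not part of this project.

```python
from pathlib import Path

# fileConfig-style template; the handler/formatter names are illustrative.
CONFIG_TEMPLATE = """\
[loggers]
keys=root

[handlers]
keys=file

[formatters]
keys=plain

[logger_root]
level=INFO
handlers=file

[handler_file]
class=FileHandler
level=INFO
formatter=plain
args=({log_file!r}, "a")

[formatter_plain]
format=%(asctime)s %(levelname)s %(name)s: %(message)s
"""


def write_log_config(config_path: Path, log_file: Path) -> Path:
    """Write a logging configuration that appends all records to log_file.

    Hypothetical helper. Note that a '%' character in log_file would need
    escaping, because configparser interpolates the 'args' value.
    """
    config_path.write_text(CONFIG_TEMPLATE.format(log_file=str(log_file)))
    return config_path
```

The returned path could then be handed to Luigi as in the call above, e.g. `logging_conf_file=str(config_path)`.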

However, logging can be configured only once. All subsequent log configurations are ignored.

As a workaround, we propose to reference a symlink in the log configuration. The symlink points to the actual log file, and its location is given as a parameter to common.run_task().

Acceptance Criteria

  1. Create a log configuration file on the fly for Luigi and pass it to luigi.build()
  2. Create a symlink whose location is given by a new parameter of common.run_task()
  3. Set the symlink in the Luigi log configuration
  4. The symlink points to the actual log file
  5. The log file for the task is always at {task.get_log_path()}/exaslct.log
  6. The client is responsible for always passing the same log file when invoking common.run_task() multiple times. Otherwise common.run_task() shall raise an exception!
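The last criterion could be enforced with a small module-level guard. This is only a sketch: `register_log_file` and `LogFileMismatchError` are hypothetical names, not existing functions of this project.

```python
from pathlib import Path
from typing import Optional

# Remembers the log file used by the first invocation in this process.
_configured_log_file: Optional[Path] = None


class LogFileMismatchError(Exception):
    """Raised when a later invocation passes a different log file."""


def register_log_file(log_file: Path) -> Path:
    """Accept the first log file; reject any different one afterwards."""
    global _configured_log_file
    resolved = Path(log_file).resolve()
    if _configured_log_file is None:
        _configured_log_file = resolved
    elif _configured_log_file != resolved:
        raise LogFileMismatchError(
            f"Logging is already configured for {_configured_log_file}, "
            f"but {resolved} was requested."
        )
    return resolved
```

A function like common.run_task() would call the guard before building the log configuration, so a mismatch fails early instead of being silently ignored.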
tkilias commented 2 years ago

Idea regarding interface


global_log_symlink = None
LOG_FILE_NAME = "main.log"

class DynamicSymlink:
    def __init__(self):
        self.temporary_directory = create_temporary_directory()
        self.symlink_path = Path(self.temporary_directory, "log_symlink")

    def point_to(self, target_path):
        ...

    def get_symlink_path(self):
        return self.symlink_path

def ci():
    cli.invoke(command1, ...)
    cli.invoke(command2, ...)

def command1(...):
    # Create the shared symlink lazily, once per process
    global global_log_symlink
    if global_log_symlink is None:
        global_log_symlink = DynamicSymlink()

    run_task(task_creator, global_log_symlink, options)

def command2(...):
    ...

def run_task(task_creator, log_symlink, options):
    task = task_creator()
    # LOG_FILE_NAME already carries the ".log" suffix
    log_path = Path(task.get_log_path(), LOG_FILE_NAME)
    log_symlink.point_to(log_path)
    log_symlink_path = log_symlink.get_symlink_path()
    log_config_file = generate_log_config_file(log_symlink_path)
    no_scheduling_errors = luigi.build([task], logging_conf_file=log_config_file, **options)
tkilias commented 2 years ago

point_to with a context manager, a lock, and an exception on serialization to prevent misuse:

class DynamicSymlinkContextManager:

    def __init__(self, symlink_path, target_path, lock):
        ...

    def __getstate__(self):
        # Block pickling, e.g. when handing the object to another process
        raise Exception("Serialization is not intended")

    def __enter__(self):
        self.lock.acquire()
        create_symlink()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        remove_symlink()
        self.lock.release()

    def get_symlink_path(self):
        return self.symlink_path

class DynamicSymlink:
    def __init__(self):
        self.lock = Lock()
        self.temporary_directory = create_temporary_directory()
        self.symlink_path = Path(self.temporary_directory, "log_symlink")

    def point_to(self, target_path) -> DynamicSymlinkContextManager:
        return DynamicSymlinkContextManager(self.symlink_path, target_path, self.lock)
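For reference, a compact runnable variant of the same idea. It is a sketch under a few assumptions: it uses `contextlib.contextmanager` instead of a separate context manager class, `tempfile.TemporaryDirectory` stands in for `create_temporary_directory()`, and it needs a platform where unprivileged symlinks work (e.g. Linux or macOS).

```python
import os
import tempfile
import threading
from contextlib import contextmanager
from pathlib import Path


class DynamicSymlink:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        # Keep a reference so the directory lives as long as this object.
        self._tmp = tempfile.TemporaryDirectory()
        self.symlink_path = Path(self._tmp.name, "log_symlink")

    def __getstate__(self):
        # Pickling or copying would duplicate the lock and the symlink owner.
        raise TypeError("DynamicSymlink is not meant to be serialized")

    @contextmanager
    def point_to(self, target_path: Path):
        """Point the symlink at target_path for the duration of the block."""
        with self._lock:
            os.symlink(target_path, self.symlink_path)
            try:
                yield self.symlink_path
            finally:
                self.symlink_path.unlink()
```

Usage would look like `with symlink.point_to(log_path) as path: ...`; the lock ensures that only one target is active at a time.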
tomuben commented 2 years ago

The implementation is not feasible. The file handle opened by the first log configuration points to the log target itself, not to the symlink, so re-pointing the symlink later has no effect on where log records are written. New approach in #207.
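The effect can be reproduced with the standard library alone. The sketch below uses a plain `logging.FileHandler` rather than Luigi; the function name and the file names are illustrative, and it assumes a platform where unprivileged symlinks work.

```python
import logging
import os
import tempfile
from pathlib import Path
from typing import Tuple


def log_through_repointed_symlink(workdir: Path) -> Tuple[str, str]:
    """Log once, re-point the symlink, log again; return both file contents."""
    first, second = workdir / "first.log", workdir / "second.log"
    first.touch()
    second.touch()
    link = workdir / "log_symlink"
    os.symlink(first, link)

    logger = logging.getLogger("symlink_demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    # The handler opens the file right away; the symlink is resolved here.
    handler = logging.FileHandler(link)
    logger.addHandler(handler)
    try:
        logger.info("task one")
        # Re-point the symlink; the already open handle is not affected.
        link.unlink()
        os.symlink(second, link)
        logger.info("task two")
    finally:
        logger.removeHandler(handler)
        handler.close()
    return first.read_text(), second.read_text()
```

Both records end up in first.log and second.log stays empty, which is why a symlink alone cannot redirect an already configured log.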