ESPRI-Mod / synda

ESGF Downloader (this is a deprecated repository, the tool has now moved to https://github.com/ESGF/esgf-download)
https://espri-mod.github.io/synda/
21 stars 11 forks source link

transfer log is opened on multiple file descriptors #191

Closed alaniwi closed 2 years ago

alaniwi commented 3 years ago

Below is the output of lsof examining the synda daemon process (with details of local user/group/path names edited out). The transfer.log is opened on 5 different file descriptors simultaneously. Could this be a problem?

username   27626 groupname    1u   REG   0,48 11164062785 2051578570 /path/to/sdt_3.10a/log/transfer.log
username   27626 groupname    2u   REG   0,48 11164062785 2051578570 /path/to/sdt_3.10a/log/transfer.log
username   27626 groupname    4u   REG   0,48 11164062785 2051578570 /path/to/sdt_3.10a/log/transfer.log
username   27626 groupname    5u   REG   0,48 11164062785 2051578570 /path/to/sdt_3.10a/log/transfer.log
username   27626 groupname    6w   REG   0,48   787915487 4199609270 /path/to/sdt_3.10a/log/discovery.log
username   27626 groupname    7w   REG   0,48 11164062785 2051578570 /path/to/sdt_3.10a/log/transfer.log
username   27626 groupname    8w   REG   0,48           0 2741302080 /path/to/sdt_3.10a/log/domain.log
username   27626 groupname   11ur  REG   0,48  5264580608 1319595084 /path/to/sdt_3.10a/db/sdt.db
username   27626 groupname   12u   REG   0,48        3072 2424717422 /path/to/sdt_3.10a/db/caches.db
pjournou-ipsl commented 2 years ago

Thanks for your message. Your observation is correct.

If we execute the code linked below, which only creates a logger, we can observe the same kind of result with the following lsof Linux command. $ lsof | grep "x.log" python3.9 8613 username 6w REG 253,3 0 1077431087 path_to/x.log python3.9 8613 8616 username 6w REG 253,3 0 1077431087 path_to/x.log python3.9 8613 8617 username 6w REG 253,3 0 1077431087 path_to/x.log python3.9 8613 8621 username 6w REG 253,3 0 1077431087 path_to/x.log python3.9 8613 8622 username 6w REG 253,3 0 1077431087 path_to/x.log python3.9 8613 8623 username 6w REG 253,3 0 1077431087 path_to/x.log python3.9 8613 8624 username 6w REG 253,3 0 1077431087 path_to/x.log python3.9 8613 8625 username 6w REG 253,3 0 1077431087 path_to/x.log python3.9 8613 8626 username 6w REG 253,3 0 1077431087 path_to/x.log python3.9 8613 8627 username 6w REG 253,3 0 1077431087 path_to/x.log python3.9 8613 8628 username 6w REG 253,3 0 1077431087 path_to/x.log

We can see a parent process (PID=8613) an a set of sub-processes (PID=8616, 8617, 8621, ...).
In fact, it's the logging package implementation that creates these threads.

In the documentation, we can read : "The logging module is intended to be thread-safe without any special work needing to be done by its clients. It achieves this though using threading locks; there is one lock to serialize access to the module’s shared data, and each handler also creates a lock to serialize access to its underlying I/O." (source : https://docs.python.org/3/library/logging.html)

If you execute the lsof command with the file name : lsof /path/to/sdt_3.10a/log/transfer.log, you will see a single line as a response (sub-processes are hidden)

In conclusion, we think that your use case is nominal and could not be really a problem. However, the way that synda implements the logs at the moment will change. We want to enhance this part as soon as possible.

Patrice

The code used for testing :

import logging

_log_format = "%(asctime)s - [%(levelname)s] - %(name)s - (%(filename)s).%(funcName)s(%(lineno)d) - %(message)s"

def get_file_handler(): file_handler = logging.FileHandler("x.log") file_handler.setLevel(logging.WARNING) file_handler.setFormatter(logging.Formatter(_log_format)) return file_handler

def get_logger(name): logger = logging.getLogger(name) logger.setLevel(logging.INFO) logger.addHandler(get_file_handler()) return logger

if name == 'main': logger = get_logger(name)