Delgan / loguru

Python logging made (stupidly) simple

multiprocessing / multithreading: all logs appear on sys.stdout sink, but some logs are trimmed in logfile sink #820


jose-atlin commented 1 year ago

model.py

import random
import time

from loguru import logger

class FakeModel:
    def predict(self, thread_num: int, trace_id: str):
        sleep = random.randint(0, 4)
        logger.info(f"[{trace_id}-{thread_num}-3] Sleeping - start sleep={sleep} secs")
        time.sleep(sleep)
        logger.info(f"[{trace_id}-{thread_num}-4] Sleeping - end sleep={sleep} secs")
        return {}

fastapi_app.py

import random
import sys
import time
from threading import Thread
from typing import List

from fastapi import FastAPI

from app.model import FakeModel
from loguru import logger

app = FastAPI()
model = FakeModel()

# Set up the logger sinks; first remove the default stderr handler (id 0)
logger.remove(0)

logger.add(
    sink=sys.stdout,
    level="INFO",
    format="{message} | {extra}",
    enqueue=True,
)

logger.add(
    sink=open("logfile.log", "w"),
    level="INFO",
    format="{message} | {extra}",
    enqueue=True,
)

def thread_predicts(thread_num: int, trace_id: str):
    with logger.contextualize(trace_id=trace_id, thread_num=thread_num):
        sleep = random.randint(0, 4)
        logger.info(f"[{trace_id}-{thread_num}-1] Sleeping - start sleep={sleep} secs")
        time.sleep(sleep)
        logger.info(f"[{trace_id}-{thread_num}-2] Sleeping - end sleep={sleep} secs")

        result = model.predict(thread_num, trace_id)

        sleep = random.randint(0, 4)
        logger.info(f"[{trace_id}-{thread_num}-5] Sleeping - start sleep={sleep} secs")
        time.sleep(sleep)
        logger.info(f"[{trace_id}-{thread_num}-6] Sleeping - end sleep={sleep} secs")
        return result

@app.get("/trace/{trace_id}")
def predict(trace_id: str):
    threads_: List[Thread] = []
    for thread_num in range(2):
        thread_ = Thread(target=thread_predicts, args=(thread_num, trace_id))
        threads_.append(thread_)

    for thread_ in threads_:
        thread_.start()

    for thread_ in threads_:
        thread_.join()

locustfile.py

import time
import uuid

from locust import HttpUser, between, task

class FakePerformanceTest(HttpUser):
    wait_time = between(1, 5)

    @task
    def test_fake_predict(self):
        for _ in range(5):
            trace_id = str(uuid.uuid4())
            self.client.get(f"/trace/{trace_id}", name="/trace")
            time.sleep(1)

There are times when the logs for a thread are completely missing from logfile.log but are present on sys.stdout. I don't know if the logs below will be of any use.

This is how the logs appeared in logfile.log for a particular trace_id, 51173d6d-0b43-4076-9ef0-daee310e21f6:

[51173d6d-0b43-4076-9ef0-daee310e21f6-0-2] Sleeping - end sleep=3 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 0}
[51173d6d-0b43-4076-9ef0-daee310e21f6-0-3] Sleeping - start sleep=0 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 0}

These were all generated on sys.stdout, but most of them were not available in logfile.log:

[51173d6d-0b43-4076-9ef0-daee310e21f6-0-1] Sleeping - start sleep=3 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 0}
[51173d6d-0b43-4076-9ef0-daee310e21f6-1-1] Sleeping - start sleep=4 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 1}
[51173d6d-0b43-4076-9ef0-daee310e21f6-0-2] Sleeping - end sleep=3 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 0}
[51173d6d-0b43-4076-9ef0-daee310e21f6-0-3] Sleeping - start sleep=0 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 0}
[51173d6d-0b43-4076-9ef0-daee310e21f6-0-4] Sleeping - end sleep=0 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 0}
[51173d6d-0b43-4076-9ef0-daee310e21f6-0-5] Sleeping - start sleep=2 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 0}
[51173d6d-0b43-4076-9ef0-daee310e21f6-1-2] Sleeping - end sleep=4 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 1}
[51173d6d-0b43-4076-9ef0-daee310e21f6-1-3] Sleeping - start sleep=0 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 1}
[51173d6d-0b43-4076-9ef0-daee310e21f6-1-4] Sleeping - end sleep=0 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 1}
[51173d6d-0b43-4076-9ef0-daee310e21f6-1-5] Sleeping - start sleep=4 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 1}
[51173d6d-0b43-4076-9ef0-daee310e21f6-0-6] Sleeping - end sleep=2 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 0}
[51173d6d-0b43-4076-9ef0-daee310e21f6-1-6] Sleeping - end sleep=4 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 1}

Is there something I am missing? Thank you.

jose-atlin commented 1 year ago

In both these cases

Delgan commented 1 year ago

Hi @jose-atlin. Sorry for the late reply.

This is a recurring problem caused by the use of several workers, and therefore several processes. We need a way to synchronize the logger between processes, and I don't know if FastAPI allows that.

I definitely need to take a look at possible workarounds, but I'm not very familiar with FastAPI and Gunicorn.
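One thing in your snippet worth double-checking, though (just a guess on my part, I haven't reproduced it): open("logfile.log", "w") runs at module import time, so if Uvicorn or Gunicorn starts several workers, each worker process re-opens the file in "w" mode and truncates whatever the other workers already wrote. Passing the path to logger.add() instead lets Loguru open the file itself in append mode, which avoids that particular failure mode:

import sys

from loguru import logger

logger.remove()

logger.add(
    sink=sys.stdout,
    level="INFO",
    format="{message} | {extra}",
    enqueue=True,
)

# Pass the path rather than an already-open file object: Loguru opens
# it with mode="a" (append), so a newly started worker process does not
# truncate what another worker has already written.
logger.add(
    sink="logfile.log",
    mode="a",
    level="INFO",
    format="{message} | {extra}",
    enqueue=True,
)

Note that enqueue=True only serializes writes between the threads of a single process; independent worker processes each get their own queue, so it does not synchronize them with each other.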

In the meantime, maybe this blog post can help you: Unify Python logging for a Gunicorn/Uvicorn/FastAPI application.
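The gist of that post (and of a snippet in Loguru's own documentation) is to replace the standard logging module's handlers with one that forwards every record to Loguru, so that Gunicorn's and Uvicorn's own logs end up in the same sinks as yours. Roughly:

import logging

from loguru import logger

class InterceptHandler(logging.Handler):
    def emit(self, record):
        # Map the standard level name to a Loguru level when possible.
        try:
            level = logger.level(record.levelname).name
        except ValueError:
            level = record.levelno

        # Walk back up the stack so Loguru attributes the message to the
        # original caller rather than to the logging module itself.
        frame, depth = logging.currentframe(), 2
        while frame and frame.f_code.co_filename == logging.__file__:
            frame = frame.f_back
            depth += 1

        logger.opt(depth=depth, exception=record.exc_info).log(level, record.getMessage())

# Route every record from the standard logging module through Loguru.
logging.basicConfig(handlers=[InterceptHandler()], level=0, force=True)

Each worker process still has its own Loguru instance, so this setup has to run in every worker; the blog post walks through where to hook it into Gunicorn and Uvicorn.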