agronholm / apscheduler

Task scheduling library for Python
MIT License

Objects inside the job are instantiated multiple times #884

Closed JahnaviBhide closed 3 months ago

JahnaviBhide commented 3 months ago

Version

3.10.4

What happened?

BlockingScheduler appears to create multiple instances of the objects defined inside a job.

I created a simple scheduler:

pyscheduler.py

from apscheduler.schedulers.blocking import BlockingScheduler
import logging
import test

# Send APScheduler's own log output to scheduler.log
logger = logging.getLogger('apscheduler')
logger.setLevel(logging.INFO)
file_handler = logging.FileHandler('scheduler.log', mode='a', encoding='utf-8')
logger.addHandler(file_handler)

scheduler = BlockingScheduler()

# Run test.fun once per minute
scheduler.add_job(test.fun, 'interval', minutes=1)

try:
    scheduler.start()  # blocks until interrupted
except (KeyboardInterrupt, SystemExit):
    pass

test.py

import logging

def fun():
    logger = logging.getLogger('test')
    logger.setLevel(logging.INFO)
    file_handler = logging.FileHandler('test.log', mode='a', encoding='utf-8')
    logger.addHandler(file_handler)
    logger.info('From test')

At every execution (once per minute), this is what gets logged:

execution 1: From test

execution 2 (2nd minute): From test From test

execution 3: From test From test From test

The loggers appear to be instantiated again and called multiple times. If I pass the logger object from pyscheduler.py to test.py, it works fine.

The same issue happens if I initialise the Oracle client directory in test.py using

cx_Oracle.init_oracle_client(config_dir="/home/your_username/oracle/your_config_dir")

I get an error saying the client library is already initialised.

peterschutt commented 3 months ago

test.fun() is called every minute, as you've configured it. Every time the function is called, you stack another handler onto the logger that writes to the file (loggers can have multiple handlers), so each message is written once per attached handler:

def fun():
    logger = logging.getLogger('test')  # gets ref to same logger every execution - not a new logger every time
    logger.setLevel(logging.INFO)
    file_handler = logging.FileHandler('test.log',mode = 'a',encoding='utf-8')  # creates a new file handler every execution that points to same file
    logger.addHandler(file_handler)  # adds another handler to the logger
    logger.info('From test')
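
The stacking can be reproduced without APScheduler at all. The following is only a minimal sketch, not the reporter's exact setup: it swaps the file handler for a hypothetical in-memory `ListHandler` and uses a hypothetical logger name, `stacking_demo`, so the effect can be counted directly.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical in-memory handler: stores each message in a shared list."""

    def __init__(self, records):
        super().__init__()
        self.records = records

    def emit(self, record):
        self.records.append(record.getMessage())


def fun(records):
    # Same shape as the job in the report: configure, then log.
    logger = logging.getLogger("stacking_demo")  # same logger object on every call
    logger.setLevel(logging.INFO)
    logger.addHandler(ListHandler(records))  # one MORE handler on every call
    logger.info("From test")
    return logger


records = []
counts = []  # messages written by each individual call
for _ in range(3):
    before = len(records)
    logger = fun(records)
    counts.append(len(records) - before)

print(counts)                # [1, 2, 3]
print(len(logger.handlers))  # 3
```

Three plain function calls are enough to show the growth from one line to three, which suggests the scheduler is only the thing invoking the function, not the cause of the duplication.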

Configure your logging outside the test function:

import logging

# Module-level setup runs once, when test.py is first imported
logger = logging.getLogger('test')
logger.setLevel(logging.INFO)
file_handler = logging.FileHandler('test.log', mode='a', encoding='utf-8')
logger.addHandler(file_handler)

def fun():
    logger.info('From test')
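
If the setup has to stay inside the job function for some reason, one possible alternative is to guard it so it only runs on the first call. This is a sketch under assumptions, not something suggested elsewhere in this thread: the logger name `guard_demo` and the temporary log path are made up for illustration.

```python
import logging
import os
import tempfile

# Hypothetical log location in a fresh temporary directory, for illustration only.
LOG_PATH = os.path.join(tempfile.mkdtemp(), "guard_demo.log")


def fun():
    logger = logging.getLogger("guard_demo")
    logger.setLevel(logging.INFO)
    if not logger.handlers:
        # Only the first call reaches this branch; later calls reuse the handler.
        logger.addHandler(
            logging.FileHandler(LOG_PATH, mode="a", encoding="utf-8")
        )
    logger.info("From test")
    return logger


for _ in range(3):
    logger = fun()

handler_count = len(logger.handlers)
with open(LOG_PATH, encoding="utf-8") as fh:
    lines = fh.read().splitlines()

print(handler_count)  # 1
print(lines)          # ['From test', 'From test', 'From test']
```

The same "do it once" idea presumably applies to other one-time setup performed inside a job, such as the Oracle client initialisation mentioned in the report, though that has not been tested here.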

agronholm commented 3 months ago

As he said above. Closing.