agronholm / apscheduler

Task scheduling library for Python
MIT License
Add .get_task method to Schedulers - APS v4 #953

Open HK-Mattew opened 3 months ago

HK-Mattew commented 3 months ago

Feature description

Hello,

My suggestion is to add a .get_task(task_id=...) method to the schedulers.

Use case

I ran into a situation where I needed to pass a Task instance directly to the .add_job method so that the job would use an existing task's configuration.

I could fetch all tasks with .get_tasks, but then I would have to filter the list every time I need one specific task. That approach is not a good fit for my use case, and I believe this suggestion would be useful to others as well.
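For reference, the filtering workaround described above might look roughly like the sketch below. It assumes only that .get_tasks() returns objects with an `id` attribute, as the Task repr later in this thread suggests. `find_task` is a hypothetical helper, not part of APScheduler, and `FakeTask` / `FakeScheduler` are stand-ins so the snippet runs without a data store.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List, Optional


@dataclass
class FakeTask:
    """Stand-in for APScheduler's Task; only the fields used here."""

    id: str
    job_executor: str
    max_running_jobs: int


class FakeScheduler:
    """Stand-in that exposes get_tasks() like the scheduler in this thread."""

    def __init__(self, tasks: Iterable[FakeTask]) -> None:
        self._tasks: List[FakeTask] = list(tasks)

    def get_tasks(self) -> List[FakeTask]:
        return list(self._tasks)


def find_task(scheduler: Any, task_id: str) -> Optional[Any]:
    """Hypothetical helper: scan get_tasks() and return the first match.

    Returns None when no task has the given ID.
    """
    return next((t for t in scheduler.get_tasks() if t.id == task_id), None)
```

Every lookup walks the full task list, which is the overhead a dedicated .get_task(task_id=...) method would avoid.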

agronholm commented 3 months ago

I'll consider this, but I'm curious as to why you would need to pass a Task instance to add_job(). Could you explain that?

HK-Mattew commented 3 months ago

> I'll consider this, but I'm curious as to why you would need to pass a Task instance to add_job(). Could you explain that?

Because whenever I call .add_job, it uses the .configure_task method internally. If I pass the task ID to .add_job, it overwrites some of the configuration I previously set on the task.

However, passing the Task instance does not overwrite my configuration.

I did not report this as a bug, because I am not sure if this is a bug or if it is actually expected behavior.

agronholm commented 3 months ago

Passing the task ID to add_job() should never overwrite any task configuration. Can you give me a reproducible example where you demonstrate such behavior?

HK-Mattew commented 3 months ago

> Passing the task ID to add_job() should never overwrite any task configuration. Can you give me a reproducible example where you demonstrate such behavior?

Good to know,

I'll reproduce this now

HK-Mattew commented 3 months ago

> Passing the task ID to add_job() should never overwrite any task configuration. Can you give me a reproducible example where you demonstrate such behavior?

Here is the sample code:

from apscheduler import Scheduler, SchedulerRole
from apscheduler.executors.async_ import AsyncJobExecutor
from apscheduler.executors.thread import ThreadPoolJobExecutor
from apscheduler.executors.subprocess import ProcessPoolJobExecutor
from apscheduler.datastores.mongodb import MongoDBDataStore
import config

scheduler_web_configs = dict(
    data_store=MongoDBDataStore(
        client_or_uri=config.MONGO_DB_URI,
        database=config.MONGO_DB_NAME
    ),
    role=SchedulerRole.scheduler,
    max_concurrent_jobs=100,
    job_executors={
        'async': AsyncJobExecutor(),
        'threadpool': ThreadPoolJobExecutor(),
        'processpool': ProcessPoolJobExecutor(),
    }
)

def func_to_task_1():
    ...

with Scheduler(**scheduler_web_configs) as scheduler:

    scheduler.configure_task(
        func_or_task_id='task1',
        func=func_to_task_1,
        job_executor='async',
        max_running_jobs=5
    )

    print(scheduler.get_tasks())

    """
    [print result]

    [Task(id='task1', func='__main__:func_to_task_1', job_executor='async',
    max_running_jobs=5, misfire_grace_time=None, metadata={}, running_jobs=0)]
    """

    scheduler.add_job(
        func_or_task_id='task1'
    )

    print(scheduler.get_tasks())

    """
    [print result]
    [Task(id='task1', func='__main__:func_to_task_1', job_executor='threadpool',
    max_running_jobs=1, misfire_grace_time=None, metadata={}, running_jobs=0)]
    """

In the output you can see that the .add_job call overrode some of my task settings, namely the max_running_jobs and job_executor fields.
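To make the difference between the two printed results explicit, here is a small, self-contained sketch that compares them field by field. `TaskSnapshot` and `changed_fields` are hypothetical names used only for illustration; the field values are copied from the two outputs above, and nothing here calls APScheduler itself.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional, Tuple


@dataclass
class TaskSnapshot:
    """Plain copy of the fields shown in the printed Task repr."""

    id: str
    func: str
    job_executor: str
    max_running_jobs: Optional[int]
    misfire_grace_time: Optional[float] = None
    metadata: Dict[str, Any] = field(default_factory=dict)
    running_jobs: int = 0


def changed_fields(
    before: TaskSnapshot, after: TaskSnapshot
) -> Dict[str, Tuple[Any, Any]]:
    """Return {field_name: (old, new)} for every field that differs."""
    old, new = asdict(before), asdict(after)
    return {name: (old[name], new[name]) for name in old if old[name] != new[name]}


# Values taken from the output before and after the add_job() call.
before = TaskSnapshot(
    id="task1",
    func="__main__:func_to_task_1",
    job_executor="async",
    max_running_jobs=5,
)
after = TaskSnapshot(
    id="task1",
    func="__main__:func_to_task_1",
    job_executor="threadpool",
    max_running_jobs=1,
)

print(changed_fields(before, after))
# {'job_executor': ('async', 'threadpool'), 'max_running_jobs': (5, 1)}
```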

agronholm commented 3 months ago

Ok, I understand the problem now, and it's a design issue. I'll have to refactor the add_task() data store method.

mmmcorpsvit commented 3 months ago

@agronholm, sorry to ask, but are there any updates on the fix?

agronholm commented 3 months ago

I'm making some progress once in a while, but it seems that every time I fix something, I uncover another problem. The rabbit hole is deep ☹️ I'll get it done Soon(tm). But I have people in other projects constantly asking for updates, not just APScheduler...

agronholm commented 2 months ago

The hard work on AnyIO's next release is done, so I can focus on this now. Getting incremental updates to task configuration is the crux of the problem here. I'm still working on a solution to that.

agronholm commented 2 months ago

Sorry for the delay. I'm having a bit of trouble fixing AsyncScheduler.configure_task() to work with the data stores in a sane way, so I'm currently experimenting with different ways of implementing full and partial task updates. This might take some time, unfortunately.
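As a rough illustration of the full versus partial update distinction mentioned here, the sketch below uses a sentinel value to tell "argument not passed" apart from "argument explicitly set". This is not the maintainer's implementation and does not reflect APScheduler's internals: `TaskConfig`, `UNSET`, `apply_partial_update`, and `apply_full_update` are all hypothetical, and the defaults ('threadpool', 1) are assumed from the output shown earlier in this thread.

```python
from dataclasses import dataclass, replace
from typing import Any


class _Unset:
    """Sentinel type meaning 'the caller did not pass this argument'."""

    def __repr__(self) -> str:
        return "UNSET"


UNSET: Any = _Unset()


@dataclass(frozen=True)
class TaskConfig:
    """Hypothetical stand-in for a stored task configuration."""

    id: str
    job_executor: str = "threadpool"  # assumed default
    max_running_jobs: int = 1  # assumed default


def apply_partial_update(
    existing: TaskConfig,
    *,
    job_executor: Any = UNSET,
    max_running_jobs: Any = UNSET,
) -> TaskConfig:
    """Change only the fields the caller passed; keep everything else."""
    candidates = {
        "job_executor": job_executor,
        "max_running_jobs": max_running_jobs,
    }
    changes = {name: value for name, value in candidates.items() if value is not UNSET}
    return replace(existing, **changes)


def apply_full_update(
    existing: TaskConfig,
    *,
    job_executor: str = "threadpool",
    max_running_jobs: int = 1,
) -> TaskConfig:
    """Overwrite every field; omitted arguments fall back to defaults."""
    return replace(
        existing, job_executor=job_executor, max_running_jobs=max_running_jobs
    )
```

With a configuration of job_executor='async' and max_running_jobs=5, calling `apply_full_update` with no arguments resets both fields to the defaults, which mirrors the behavior reported in this issue, while `apply_partial_update` with no arguments leaves the configuration unchanged.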