agronholm / apscheduler

Task scheduling library for Python
MIT License

Support for AsyncEngine in APScheduler #966

Closed brunolnetto closed 2 months ago

brunolnetto commented 2 months ago

Feature description

I am trying to use an asynchronous engine with SQLAlchemyJobStore. When I define the job stores and start the scheduler, I get the following error: 'AsyncEngine' object has no attribute '_run_ddl_visitor'. When will APScheduler support async engines?

Use case

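from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore
from apscheduler.schedulers.asyncio import AsyncIOScheduler
from apscheduler.schedulers.background import BackgroundScheduler


def create_scheduler(schedule_type, engine):
    # Passing an AsyncEngine here is what triggers the error described above.
    jobstores = {"default": SQLAlchemyJobStore(engine=engine)}

    executors = {
        "default": {"type": "threadpool", "max_workers": 20},
        "process": {"type": "processpool", "max_workers": 5},
    }

    # Allow jobs to start up to 15 minutes late before they count as misfired.
    job_defaults = {"misfire_grace_time": 15 * 60}

    if schedule_type == "background":
        return BackgroundScheduler(
            jobstores=jobstores, executors=executors, job_defaults=job_defaults
        )
    elif schedule_type == "asyncio":
        return AsyncIOScheduler(
            jobstores=jobstores, executors=executors, job_defaults=job_defaults
        )
    else:
        raise ValueError(f"Invalid schedule type: {schedule_type}")

agronholm commented 2 months ago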

It already does in master and the 4.0 pre-releases.

brunolnetto commented 2 months ago

Could you help me understand why I am running into this error when using create_async_engine?

agronholm commented 2 months ago

It's because you're using APScheduler 3.x which does not support async engines. Only the 4.x series (and the master branch) does. APScheduler 3.x will never support async engines.
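For readers who stay on 3.x, one possible workaround is to give the job store its own synchronous connection while the rest of the application keeps using the async engine. This is an assumption on our part, not something suggested in this thread. The sketch below is minimal: `to_sync_url` and `make_jobstore` are hypothetical helper names, and the approach assumes that the dialect's default driver (for example psycopg2 for PostgreSQL) is synchronous and installed. `SQLAlchemyJobStore` in 3.x accepts a `url` argument in place of `engine`.

```python
import re

# Matches a "dialect+driver" scheme prefix such as "postgresql+asyncpg".
_DRIVER_SUFFIX = re.compile(r"^(?P<dialect>[a-zA-Z0-9_]+)\+[a-zA-Z0-9_]+(?=://)")


def to_sync_url(async_url: str) -> str:
    """Drop the "+driver" part so SQLAlchemy falls back to the default driver.

    Hypothetical helper: it assumes the default driver is synchronous.
    URLs without an explicit driver are returned unchanged.
    """
    return _DRIVER_SUFFIX.sub(lambda m: m.group("dialect"), async_url, count=1)


def make_jobstore(async_url: str):
    """Build an APScheduler 3.x job store with its own synchronous engine."""
    # Imported lazily so the URL helper works without APScheduler installed.
    from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore

    return SQLAlchemyJobStore(url=to_sync_url(async_url))
```

With this, `make_jobstore("postgresql+asyncpg://user:secret@db/app")` would connect through the plain `postgresql://` URL, and the result can be passed as the "default" job store in place of one built from an `AsyncEngine`.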

brunolnetto commented 2 months ago

Cool. So the imports below will not work in 4.x, right?

from apscheduler.schedulers.base import BaseScheduler
from apscheduler.schedulers.asyncio import AsyncIOScheduler
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.triggers.interval import IntervalTrigger
from apscheduler.triggers.cron import CronTrigger
from apscheduler.triggers.date import DateTrigger
from apscheduler.jobstores.base import JobLookupError
from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore

agronholm commented 2 months ago

Yes, those are imports for the 3.x series.

brunolnetto commented 2 months ago

Could you help me transition smoothly to 4.x?

agronholm commented 2 months ago

You could just read the documentation on that.

agronholm commented 2 months ago

Note that 4.x is in an alpha stage, and not generally recommended for production, as there are some existing unresolved issues. If you want to try your luck, you should probably use PostgreSQL as it is the best tested storage back-end and event broker (which is a new concept in 4.x).

brunolnetto commented 2 months ago

I want to refactor this file. In your experience, would that be hard work?

https://github.com/brunolnetto/cnpj-api/blob/main/backend/app/scheduler/base.py

agronholm commented 2 months ago

That's very subjective. I only took a cursory look at it but it's not that much code in my opinion. If you're using the async scheduler, you may have to learn some new paradigms (see the examples in master for that), and that will probably be the hardest part.

brunolnetto commented 2 months ago

Hi, it's me again. The new library is broad and serves multiple purposes, congratulations! I installed it from PyPI, but what I got is not the same as the master branch (for example, TaskDefaults is missing). Do you know why?

agronholm commented 2 months ago

Because there is no final release in the 4.x series, and it would be a pretty bad idea for pip to install a pre-release version by default, wouldn't you agree?
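By default pip only considers final releases; a pre-release is picked up when `--pre` is passed or when the requirement pins a pre-release version explicitly. To see which version actually ended up in an environment, something like the following sketch may help. The helper names are hypothetical, and the regular expression is a simplified check for PEP 440 pre-release and development markers, not a full version parser.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Simplified PEP 440 markers: a1, b2, rc1 and .dev3 (an approximation).
_PRE_RELEASE = re.compile(r"(a|b|rc)\d+|\.dev\d+")


def is_pre_release(version_string: str) -> bool:
    """Return True if the version string looks like a pre-release."""
    return _PRE_RELEASE.search(version_string) is not None


def installed_version(package: str):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("apscheduler")
    if found is None:
        print("apscheduler is not installed")
    else:
        kind = "pre-release" if is_pre_release(found) else "final release"
        print(f"apscheduler {found} ({kind})")
```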

brunolnetto commented 2 months ago

Agreed. I am not that experienced, but I am learning a lot from this discussion. Do you see a workaround for this async issue on the previous version? Is there a pre-release you would recommend most?

For my use case, a subset of the current functionality covers these steps:

  1. Define tasks on an array of TaskConfig objects;
  2. Create an object TaskOrchestrator with 2 schedulers: Scheduler (task_type equal to background) and AsyncScheduler (task_type equal to asyncio);
  3. Build the triggers for each based on these configs and add_schedule;
  4. Add schedule to task for each created trigger.

Based on the steps above, should I also check whether task_type is 'background' or 'asyncio' and call run_in_background or run_until_stopped, respectively? Thanks once more!

agronholm commented 2 months ago

I'm not giving out free consultation for your project. But if you find a problem with APScheduler itself, let me know.

One thing I'm wondering about though: why do you have two schedulers in the same process?

brunolnetto commented 2 months ago

Well, a wise decision. The specific use case is to check whether certain database tables have reached a row-count threshold OR contain rows older than a given age. I am not sure how to use the asyncio scheduler, other than to run an async task. In my case, the Scheduler seems to be enough, don't you think?
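The check described here could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the linked repository: it uses the standard library's sqlite3 module, a hypothetical function name `needs_attention`, and a hypothetical `created_at` column that stores ISO 8601 timestamps in UTC.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def needs_attention(conn, table, max_rows, max_age, now=None):
    """Return True if the table is too large or its oldest row is too old.

    Assumes a hypothetical `created_at` column with ISO 8601 UTC timestamps.
    """
    now = now or datetime.now(timezone.utc)
    # Table names cannot be bound as SQL parameters, so validate before use.
    if not table.isidentifier():
        raise ValueError(f"Invalid table name: {table!r}")
    count, oldest = conn.execute(
        f'SELECT COUNT(*), MIN(created_at) FROM "{table}"'
    ).fetchone()
    if count >= max_rows:
        return True  # row-count threshold reached
    if oldest is None:
        return False  # empty table, nothing can be too old
    return now - datetime.fromisoformat(oldest) > max_age
```

A function like this is synchronous, so it could be registered as a job on a single scheduler without any asyncio-specific code.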

agronholm commented 2 months ago

You can schedule and run both synchronous and asynchronous tasks with either AsyncScheduler or Scheduler. It's probably not necessary for you to use both at the same time.
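The general idea of running both kinds of task from one place can be illustrated with plain asyncio. The sketch below is not APScheduler code and does not claim to show how APScheduler does it internally; all function names are hypothetical. Coroutine functions are awaited on the event loop, while ordinary callables are handed to a worker thread so they do not block it.

```python
import asyncio
import inspect
import time


def sync_task() -> str:
    time.sleep(0.01)  # stands in for blocking work such as a database query
    return "sync done"


async def async_task() -> str:
    await asyncio.sleep(0.01)  # stands in for non-blocking I/O
    return "async done"


async def run_task(func):
    """Await coroutine functions directly; run plain callables in a thread."""
    if inspect.iscoroutinefunction(func):
        return await func()
    return await asyncio.to_thread(func)  # requires Python 3.9+


async def run_all(funcs):
    """Run all tasks concurrently and return results in input order."""
    return await asyncio.gather(*(run_task(f) for f in funcs))


if __name__ == "__main__":
    print(asyncio.run(run_all([sync_task, async_task])))
```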

brunolnetto commented 2 months ago

Would you publish a pre-release of the master version on PyPI, pleeease? :)

WillDaSilva commented 2 months ago

@brunolnetto it's possible to use pip (or basically any other Python package management tool) to install directly from a git repository. If you want to use the pre-release version, I suggest you go that route.