mvniekerk / tokio-cron-scheduler

Schedule tasks on Tokio using cron-like annotation
Apache License 2.0

Scheduler state or data shared across the jobs #59

Open vsavovski opened 8 months ago

vsavovski commented 8 months ago

It would be nice to add a feature for sharing app state across the jobs, something similar to https://docs.rs/actix-web/4.4.1/actix_web/web/struct.Data.html


// This struct represents state
struct SchedulerState {
    config: AppConfig,
    email: EmailClient,
    http: HttpClient,
}

#[tokio::main]
async fn main() {
    let subscriber = FmtSubscriber::builder()
        .with_max_level(Level::TRACE)
        .finish();
    tracing::subscriber::set_global_default(subscriber).expect("Setting default subscriber failed");
    let sched = JobScheduler::new().await.unwrap();

    // Proposed API: attach shared state to the scheduler (add_data does not exist yet).
    let state = Arc::new(SchedulerState::new());
    sched.add_data(state.clone());

    run_example(sched).await;
}
pub fn get_ip_job() -> Result<Job, JobSchedulerError> {
    Job::new_one_shot_async(Duration::from_secs(0), move |_uuid, l| {
        // Proposed: pull the shared state off the job's context.
        let http = l.data.http.clone();

        Box::pin(async move {
            let resp = http
                .get("https://httpbin.org/ip")
                .send()
                .await
                .unwrap()
                .json::<HashMap<String, String>>()
                .await
                .unwrap();
            info!("{:?}", resp);
        })
    })
}
burkematthew commented 5 months ago

Totally agree. I'd love the ability to build a single database pool that is shared across jobs. I can't seem to do that today without creating a unique database pool for each Job, which I don't really want to do.
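Until something lands in the library, one workaround is to build the pool once, wrap it in an `Arc`, and move a clone of that `Arc` into each job's closure; every clone points at the same pool. A minimal std-only sketch of the pattern (here `DbPool` is a hypothetical stand-in for a real pool such as `sqlx::PgPool`, and plain closures stand in for `Job` callbacks):

```rust
use std::sync::{Arc, Mutex};

// Hypothetical stand-in for a real database pool.
struct DbPool {
    connections_handed_out: Mutex<u32>,
}

impl DbPool {
    fn new() -> Self {
        DbPool { connections_handed_out: Mutex::new(0) }
    }

    // Simulate checking out a connection by bumping a counter.
    fn acquire(&self) -> u32 {
        let mut n = self.connections_handed_out.lock().unwrap();
        *n += 1;
        *n
    }
}

fn main() {
    // Build the pool once.
    let pool = Arc::new(DbPool::new());

    // Each "job" closure captures its own Arc clone; all clones
    // share the one underlying pool, so no per-job pool is needed.
    let jobs: Vec<Box<dyn Fn() -> u32>> = (0..3)
        .map(|_| {
            let pool = Arc::clone(&pool);
            Box::new(move || pool.acquire()) as Box<dyn Fn() -> u32>
        })
        .collect();

    for job in &jobs {
        job();
    }

    // All three jobs hit the same pool instance.
    assert_eq!(*pool.connections_handed_out.lock().unwrap(), 3);
}
```

With the real scheduler the same `let pool = Arc::clone(&pool);` dance goes just before each `Job::new_async` call, and the clone is moved into the async closure.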

mvniekerk commented 1 month ago

Hi @vsavovski and @burkematthew, thanks for the issue, I appreciate it.

So on keeping global state. axum and leptos keep a map from struct type to state. That lets you call .get() (or whichever method) and the inferred type determines which piece of state you get back. I like that, it is helpful and damn elegant.
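The type-keyed map behind that pattern can be sketched in a few lines of std Rust: store values in a `HashMap` keyed by `TypeId`, and downcast on the way out so the call site's type annotation picks the state. This is a simplified illustration of the idea, not axum's or leptos's actual implementation:

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

// A minimal "type -> state" map: one value per concrete type.
#[derive(Default)]
struct Extensions {
    map: HashMap<TypeId, Box<dyn Any + Send + Sync>>,
}

impl Extensions {
    fn insert<T: Any + Send + Sync>(&mut self, value: T) {
        self.map.insert(TypeId::of::<T>(), Box::new(value));
    }

    // The requested type T decides which entry is looked up
    // and what comes back out.
    fn get<T: Any + Send + Sync>(&self) -> Option<&T> {
        self.map
            .get(&TypeId::of::<T>())
            .and_then(|boxed| boxed.downcast_ref::<T>())
    }
}

#[derive(Debug, PartialEq)]
struct AppConfig {
    retries: u32,
}

fn main() {
    let mut ext = Extensions::default();
    ext.insert(AppConfig { retries: 3 });

    // The annotated type selects the state to return.
    let cfg: &AppConfig = ext.get().unwrap();
    assert_eq!(cfg.retries, 3);
}
```

A scheduler-level version of this would let each job pull out exactly the state type it needs without a shared god-struct.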

Now, this project already lets you bring your own Job storage, which means tasks and their state can be persisted to Postgres or NATS. Technically that lets you write job schedulers that run across multiple nodes, with their data kept in "global state", i.e. the NATS K/V store or Postgres itself. Your compute nodes are then stateless: if one gets killed, a new one can be spun up in its place.

I think this is a cool feature to have, nonetheless.