christopheblin opened 2 years ago
I've been thinking about what the best way would be to implement this. Here are my initial thoughts:

```typescript
scheduledTasks: ScheduledTask[]
```

so that plugins can add their own tasks as part of the `configure` metadata function.

OK, so that's the easy part. Now the question is how to actually implement this.
Key challenge: We need to keep in mind the fact that Vendure might be running multiple servers & workers, e.g. a Kubernetes setup. So we can't just naively install node-cron
and start a job on the server or worker process. That will lead to the job being run multiple times, once for each running instance.
This suggests that we need some external store to coordinate the running of tasks. For example, a new table in the database like:
| task            | lastRun              | lastRunProcessId |
|-----------------|----------------------|------------------|
| clean-job-queue | 2022-02-24T00:00:00Z | 1234             |
Then when the cron mechanism triggers on a given process, we first look up this table to see whether this particular task has run within some time delta (e.g. 5 seconds), and if not then we trigger the task.
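A rough sketch of that check, with an in-memory `Map` standing in for the database table (the names `TaskRunRecord` and `shouldRunTask` are just for illustration):

```typescript
// Sketch of the coordination check. An in-memory Map stands in for the
// database table here; a real implementation would query the DB.
interface TaskRunRecord {
    task: string;
    lastRun: Date;
    lastRunProcessId: number;
}

const taskRuns = new Map<string, TaskRunRecord>();

/**
 * Returns true if this process should run the task now, and records the run.
 * In a real implementation the read + write would need to be atomic
 * (e.g. a conditional UPDATE or SELECT ... FOR UPDATE) to avoid races
 * between instances.
 */
function shouldRunTask(task: string, deltaMs: number, now = new Date()): boolean {
    const record = taskRuns.get(task);
    if (record && now.getTime() - record.lastRun.getTime() < deltaMs) {
        // Another instance ran this task within the delta - skip.
        return false;
    }
    taskRuns.set(task, { task, lastRun: now, lastRunProcessId: process.pid });
    return true;
}
```

The atomicity caveat in the comment is the crux: without it, two instances whose crons fire at the same millisecond could both pass the check.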
I like this a lot! Right now, every scheduled job I have needs to be configured outside my codebase. I want to stay in my IDE env! :wink: (or store it in DB)
An aspect to keep in mind is that a lot of cloud services don't allow any background processes outside a request context. Everything needs to happen between the incoming request and `res.send()`: any action needs to be invoked by an HTTP request from outside the application.

Maybe we can call an endpoint on Vendure every X minutes, and Vendure handles any scheduled jobs internally? Preferably with X being configurable, as these serverless environments bill per usage.
Haha, I actually want the opposite of @martijnvdbrug! I'm working on implementing a multi-vendor marketplace, and I have various vendors who will be interacting with plugins that have jobs in them, and those jobs need to run on a vendor-defined schedule. So for my use case it would be ideal if we were not only storing a log of running tasks, but also storing the schedule itself. Currently I have to externalize this to an external service that invokes these jobs and is controlled via a separate plugin.
@xrev87 Schedules/crons in DB are also fine! I just meant not in any external platforms like I do now with Google Cloud Scheduler :smile:
@martijnvdbrug this is a great point. So the triggering mechanism must also be configurable somehow. So we can trigger either by an internal cron job, or trigger via a HTTP call to an endpoint. How would you control the HTTP call? Does Google Cloud have some built-in way of doing that? Or would you need to use some other system to make the calls?
@xrev87 Can you go into more detail on this part?
> So for my use case it would be ideal if not only were we storing a log of running tasks, but we were also storing the schedule itself. Currently having to externalize this to an external service that invokes these jobs and is controlled via a separate plugin.
What does your current setup look like? And what would your ideal solution look like?
Like @martijnvdbrug, we are also running a setup on Google Cloud, and I have a few comments.
First, tasks have to be invoked from an external source, since Cloud Run will kill any process that doesn't directly serve a request. This is why, for example, we don't use a worker process: we only have the main app, which handles worker tasks, and we let Cloud Run handle the scaling when needed. This has proven to work quite well so far.
We use @martijnvdbrug's Google Cloud Tasks plugin. A side note here: Google Cloud Tasks is great since I can pause a queue, retry individual tasks, or even purge a queue, which has been invaluable when importing large data sets and dealing with 3rd-party integrations (dearsystems).
Finally, we also use Google Cloud Scheduler for cron jobs. It just pings a REST endpoint to trigger a task with an optional payload. For now we define scheduled tasks in Google Cloud directly, but there is also a REST API for managing them: https://cloud.google.com/scheduler/docs/reference/rest
When designing a scheduler, I would suggest that we make it support cloud native services. This can be done by passing some basic async functions for creating/deleting/listing scheduled tasks, as well as a configurable REST route that will trigger a task execution. Using this interface we can easily develop a plugin for google cloud scheduler and other cloud services.
Finally, this would be a great addition to Vendure, and much needed.
> @martijnvdbrug this is a great point. So the triggering mechanism must also be configurable somehow. So we can trigger either by an internal cron job, or trigger via a HTTP call to an endpoint. How would you control the HTTP call? Does Google Cloud have some built-in way of doing that? Or would you need to use some other system to make the calls?
@michaelbromley as @skid said, Google has Google Scheduler, which is basically an external CRON service with a UI over it that triggers HTTP endpoints.
Two possible solutions I can think of:

1. Google Scheduler calls a generic ticker endpoint like `<vendure-server>/some-ticker-endpoint` every X minutes, and Vendure decides which job has to run based on some schedule config.
2. Each job gets its own endpoint, and its schedule is configured directly in Google's platform.

I would prefer option 1, as it's more flexible: you are not dependent on how jobs can be configured in Google's platform.
These are just initial ideas though!
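Option 1 could be sketched roughly like this: a handler behind the ticker endpoint checks each configured job's schedule and runs only the due ones. Everything here (the `TickerJob` shape, the interval-based schedule config) is hypothetical, just to illustrate the idea:

```typescript
// Hypothetical sketch for option 1: an external cron service hits the ticker
// endpoint every X minutes, and this handler decides which jobs are actually
// due according to their own schedule config.
interface TickerJob {
    name: string;
    intervalMinutes: number;   // simplest possible schedule config
    lastRun?: Date;
    run: () => void | Promise<void>;
}

// Returns the names of the jobs that were triggered on this tick.
async function handleTick(jobs: TickerJob[], now = new Date()): Promise<string[]> {
    const triggered: string[] = [];
    for (const job of jobs) {
        const dueAt = job.lastRun
            ? job.lastRun.getTime() + job.intervalMinutes * 60_000
            : 0; // never ran: due immediately
        if (now.getTime() >= dueAt) {
            await job.run();
            job.lastRun = now;
            triggered.push(job.name);
        }
    }
    return triggered;
}
```

Note that the external scheduler's X only sets the *resolution* of scheduling; each job's own config decides whether it actually runs on a given tick.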
> @xrev87 Can you go into more detail on this part?
>
> > So for my use case it would be ideal if not only were we storing a log of running tasks, but we were also storing the schedule itself. Currently having to externalize this to an external service that invokes these jobs and is controlled via a separate plugin.
>
> What does your current setup look like? And what would your ideal solution look like?
I'm running a multi-vendor marketplace where we have various EDI solutions in place that are facilitated by vendor specific plugins. Due to the nature of these plugins there is a requirement for data ingress/egress at different cadences.
At the moment these plugins do this through jobs, but given the lack of scheduling inside Vendure, we have a scheduler plugin that is simply a thin wrapper around our orchestrator, which kicks these jobs off via a mutation.
Because we need vendors to be able to set the schedule for a particular job, it's not sufficient to set the schedule globally for a queue or for a particular type of job.
Thank you all for the feedback on this. I think that, to do this right, it is a bit of a bigger task than I first imagined. From all of your feedback it looks like there are different mechanisms that could be used to actually trigger the scheduled task: an in-process cron-like mechanism, or an external HTTP call from a service like Google Cloud Scheduler.
Additionally, the solution needs to work correctly in multi-instance deployments.
Because of these different use-cases, the design needs to elegantly handle all of these. Let's reach for our friend the Strategy pattern once again.
So here's a very rough, high-level sketch of a possible implementation:
First some types:
```typescript
// This represents what a task actually does, e.g. hit a HTTP endpoint
// or run a command, depending on the strategy
type TaskAction = string;

interface Task {
    name: string;
    action: TaskAction;
}

interface ScheduledTask {
    id: string;
    description: string;
    schedule: Schedule; // ?? not sure what this would look like
    task: Task;
}
```
Then the strategy:
```typescript
interface CreateTaskInput {
    description: string;
    task: Task;
    schedule: Schedule;
}

export interface TaskSchedulerStrategy extends InjectableStrategy {
    getTasks(): Promise<ScheduledTask[]>;
    createTask(input: CreateTaskInput): Promise<ScheduledTask | undefined>;
    updateTask(input: UpdateTaskInput): Promise<ScheduledTask>;
    deleteTask(id: ID): Promise<DeletionResult>;
}
```
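For a feel of the contract, here is roughly what an in-memory implementation of such a strategy could look like. This is purely illustrative, not part of any actual Vendure API: the `Schedule`, `ID`, `DeletionResult` and `UpdateTaskInput` stand-ins below are hypothetical simplifications, since those types are still undefined in the sketch above.

```typescript
// Hypothetical stand-ins for the as-yet-undefined types in the sketch above.
type Schedule = string;                      // e.g. a crontab expression
type ID = string;
type DeletionResult = 'DELETED' | 'NOT_DELETED';

interface Task {
    name: string;
    action: string;
}

interface ScheduledTask {
    id: ID;
    description: string;
    schedule: Schedule;
    task: Task;
}

interface CreateTaskInput {
    description: string;
    task: Task;
    schedule: Schedule;
}

interface UpdateTaskInput extends Partial<CreateTaskInput> {
    id: ID;
}

// A minimal in-memory strategy, e.g. for dev mode or tests. A real strategy
// would persist to the database or call a cloud scheduler API instead.
class InMemoryTaskSchedulerStrategy {
    private tasks = new Map<ID, ScheduledTask>();
    private nextId = 1;

    async getTasks(): Promise<ScheduledTask[]> {
        return [...this.tasks.values()];
    }

    async createTask(input: CreateTaskInput): Promise<ScheduledTask> {
        const task: ScheduledTask = { id: String(this.nextId++), ...input };
        this.tasks.set(task.id, task);
        return task;
    }

    async updateTask(input: UpdateTaskInput): Promise<ScheduledTask> {
        const existing = this.tasks.get(input.id);
        if (!existing) {
            throw new Error(`No task with id ${input.id}`);
        }
        const updated = { ...existing, ...input };
        this.tasks.set(updated.id, updated);
        return updated;
    }

    async deleteTask(id: ID): Promise<DeletionResult> {
        return this.tasks.delete(id) ? 'DELETED' : 'NOT_DELETED';
    }
}
```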
Then in the VendureConfig we would add a new config object:
```typescript
const config = {
    // ...
    taskSchedulerOptions: {
        taskSchedulerStrategy: TaskSchedulerStrategy;
        tasks: Task[];
    }
}
```
In the Admin UI we'd add a new page in the "system" section which allows the admin to perform CRUD on tasks.
Feedback welcome!
The schedule can look like a normal crontab string. I reckon there are robust libraries to parse it already and everyone's familiar with it.
I would also add an optional `params` string to be used by the strategy implementation. For example, some HTTP task schedulers can POST data to their endpoint.
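As an illustration of how little machinery a crontab match needs (a real implementation should use an established parser library; this toy version only supports `*`, plain numbers, comma lists and `*/n` steps, no ranges):

```typescript
// Toy crontab matcher, for illustration only. Supports "*", numbers,
// comma lists and "*/n" steps; real libraries also handle ranges, names etc.
function cronFieldMatches(field: string, value: number): boolean {
    if (field === '*') return true;
    if (field.startsWith('*/')) return value % Number(field.slice(2)) === 0;
    return field.split(',').some(part => Number(part) === value);
}

/** Checks a "minute hour dayOfMonth month dayOfWeek" expression against a Date. */
function cronMatches(expression: string, date: Date): boolean {
    const [min, hour, dom, month, dow] = expression.split(/\s+/);
    return (
        cronFieldMatches(min, date.getMinutes()) &&
        cronFieldMatches(hour, date.getHours()) &&
        cronFieldMatches(dom, date.getDate()) &&
        cronFieldMatches(month, date.getMonth() + 1) &&
        cronFieldMatches(dow, date.getDay())
    );
}
```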
Did you have a look at the nest-schedule module if this could do all the scheduling? https://www.npmjs.com/package/nest-schedule
@marcus-friedrich Nest's Schedule module is OK for situations where you are certain you're not going to run multiple instances. But as soon as you run multiple instances (e.g. k8s or serverless scenarios) you will run into double-scheduling of tasks.
That's why we're looking at a more high-level abstraction of scheduling, of which a simple cron-like in-process implementation could be one version.
From a Slack thread:
> How do you make sure to only run the cronjob once when using multiple server instances?
We do that with locks via redis
We have this in a service
```typescript
public async runWithLock(
    lockKey: string,
    runFunction: () => void | Promise<void>,
    lockTimeMinutes = 15,
): Promise<boolean> {
    const client = await this.configService.getRedisClient();
    if (!client) {
        throw new Error(`Couldn't get redis client`);
    }
    try {
        // SETNX only succeeds if the key does not yet exist,
        // so exactly one process acquires the lock
        const acquired = await client.setNX(lockKey, 'a');
        if (!acquired) {
            this.logger.debug(`Locked key ${lockKey}. May run somewhere else.`);
            return false;
        }
        // Expire the lock so it gets released even if this process dies
        await client.expire(lockKey, 60 * lockTimeMinutes); // lock for 15 minutes by default
        try {
            await runFunction();
        } finally {
            // keep the lock a little to prevent another start by a late worker
            await sleep(2000);
            this.logger.debug(`Delete lock key ${lockKey}`);
            await client.del(lockKey);
        }
        return true;
    } catch (err) {
        this.logger.error(`Error running function with lock ${lockKey}.`, undefined, err);
        throw err;
    }
}
```
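The key primitive here is `SETNX` ("set if not exists"): only the first caller for a given key wins, and everyone else skips. A minimal in-memory stand-in for the Redis client (hypothetical, just to show the semantics the lock relies on):

```typescript
// Hypothetical in-memory stand-in for a Redis client, to illustrate the
// setNX-based "first caller wins" semantics that the lock above relies on.
class FakeRedis {
    private store = new Map<string, string>();

    // Set the key only if it does not already exist; returns whether the
    // caller acquired it. Real Redis SETNX behaves the same way.
    async setNX(key: string, value: string): Promise<boolean> {
        if (this.store.has(key)) {
            return false;
        }
        this.store.set(key, value);
        return true;
    }

    async del(key: string): Promise<number> {
        return this.store.delete(key) ? 1 : 0;
    }
}
```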
**Is your feature request related to a problem? Please describe.**
The database is "unexpectedly" big due to the size of the `job_record` and `session` tables. These two tables contain mostly stale data (old jobs and expired sessions), and nothing cleans them up periodically.
**Describe the solution you'd like**
For `job_record`, it seems that periodically calling `removeSettledJobs` internally would do the trick.
For sessions, something similar should be done.
See https://vendure-ecommerce.slack.com/archives/CKYMF0ZTJ/p1644844460660589?thread_ts=1644830594.467479&cid=CKYMF0ZTJ
**Describe alternatives you've considered**
For jobs: calling `removeSettledJobs` externally, plus setting `config.jobQueueOptions.enableWorkerHealthCheck = false` to reduce the number of jobs.
For sessions: a manual DB query.
**Additional context**
For sessions, it seems there is an additional bug for anonymous sessions: they are always created with a duration of 1 year no matter the setting of `config.authOptions.sessionDuration`, see https://github.com/vendure-ecommerce/vendure/blob/6bc01aae230646118fac3c103793896b326d68c6/packages/core/src/service/services/session.service.ts#L108-L110