Open pegasas opened 11 months ago
Good idea, we may need to design the TriggerAPI and TriggerEvent structure first.
In the current architecture, the Quartz scheduler uses the DB for pessimistic concurrency control; when a job executes, we insert a command into the DB.
We should dig into whether we can leverage the Quartz interface or need to implement this ourselves in TriggerService. If so, we should leverage the current distributed lock and master failover mechanism.
We have the t_ds_command table; all trigger events should be transformed into commands.
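To make the "every trigger event becomes a command" idea concrete, here is a minimal sketch. The names TriggerEvent, CommandRow and TriggerEventConverter, as well as the "TRIGGER" command type, are assumptions made for illustration only; they are not existing DolphinScheduler classes, and CommandRow is a heavily simplified stand-in for a t_ds_command row.

```java
import java.time.Instant;
import java.util.Map;

// Hypothetical event emitted by any trigger plugin (illustrative, not DolphinScheduler API).
record TriggerEvent(long triggerId, long workflowCode, Instant firedAt, Map<String, String> params) {}

// Hypothetical, simplified stand-in for one row of t_ds_command.
record CommandRow(String commandType, long workflowCode, Instant scheduleTime, Map<String, String> commandParams) {}

final class TriggerEventConverter {
    private TriggerEventConverter() {}

    // Whatever the trigger source is, the event is normalized into a single command row.
    static CommandRow toCommand(TriggerEvent event) {
        return new CommandRow("TRIGGER", event.workflowCode(), event.firedAt(), Map.copyOf(event.params()));
    }
}
```

With this shape, the master only ever consumes commands, so a new trigger type would not need changes on the command-consuming side.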
Quartz is a plugin of ScheduleTrigger; we may have other schedule plugins in the future.
Thanks wenjun for the continuous guidance in this thread and the very solid architecture design in DolphinScheduler!
These days I've been thinking about how responsibilities are split between the trigger and scheduler services in the new architecture. My current high-level design is as follows.
Pull mode is more common and needs resources, so I am thinking of integrating the pull trigger into a special task type that can be dispatched by master/worker, leveraging the current task architecture, which already supports master/server failover.
Scheduler: pull trigger instance on this master host -> command
Good idea, we may need to design the TriggerAPI and TriggerEvent structure first.
As you mentioned, I've been trying to design the TriggerAPI at the code level. Right now I am learning the previous task/registry/scheduler design and code patterns.
Quartz is a plugin of ScheduleTrigger, we may have other schedule plugin in the future.
I got your point from your scheduler API refactor, but it may be hard to do in multiple steps. In the first stage maybe we can extend the Quartz scheduler within the current architecture design. For the long-term fix I still need to think through the relationship and boundary between trigger, schedule and command.
My current design is as below:
┌─────────────────────┬────────────────────┬────────────────────┬────────────────────────┬───────────────────────────────┐
│                     │                    │                    │                        │                               │
│         UI          │        API         │         DB         │        Registry        │            Master             │
│                     │                    │                    │                        │                               │
├─────────────────────┼────────────────────┼────────────────────┼────────────────────────┼───────────────────────────────┤
│                     │                    │                    │                        │                               │
│   ┌─────────────┐   │   ┌─────────────┐  │   ┌─────────────┐  │                        │  ┌─────────────────────────┐  │
│   │             │   │   │             │  │   │             │  │                        │  │                         │  │
│   │    User     ├───┼──►│   Create    ├──┼──►│   Trigger   ├──┼──────────Pull──────────┼──► TriggerTaskThreadPool   │  │
│   │             │   │   │   Trigger   │  │   │             │  │                        │  │                         │  │
│   └─────┬───────┘   │   └─────────────┘  │   └─────────────┘  │                        │  └───────────┬─────────────┘  │
│         │           │                    │                    │                        │              │                │
│         │           │   ┌─────────────┐  │   ┌─────────────┐  │                        │              │                │
│         │           │   │             │  │   │             │  │                        │              │                │
│         └─Push──────┼──►│   Request   ├──┼──►│  Schedule   ◄──┼────────────────────────┼──────────────┘                │
│                     │   │             │  │   │             │  │                        │                               │
│                     │   └─────────────┘  │   └─────┬───────┘  │                        │                               │
│                     │                    │         │          │                        │                               │
│                     │                    │         │          │                        │   ┌────────────────────────┐  │
│                     │                    │         │          │                        │   │                        │  │
│                     │                    │         └──────────┼────────────────────────┼───►      SchedulerApi      │  │
│                     │                    │                    │                        │   │                        │  │
│                     │                    │                    │                        │   └───────────┬────────────┘  │
│                     │                    │                    │                        │               │               │
│                     │                    │   ┌─────────────┐  │                        │               │               │
│                     │                    │   │             │  │                        │               │               │
│                     │                    │   │   Command   ◄──┼────────────────────────┼───────────────┘               │
│                     │                    │   │             │  │                        │                               │
│                     │                    │   └─────┬───────┘  │                        │                               │
│                     │                    │         │          │                        │                               │
│                     │                    │         │          │                        │                               │
│                     │                    │         │          │                        │ ┌─────────────────────────┐   │
│                     │                    │         │          │                        │ │                         │   │
│                     │                    │         └──────────┼────────────────────────┼─►MasterSchedulerBootStrap │   │
│                     │                    │                    │                        │ └─────────────────────────┘   │
│                     │                    │                    │                        │                               │
└─────────────────────┴────────────────────┴────────────────────┴────────────────────────┴───────────────────────────────┘
TriggerTaskExecutorThreadPool in master, which should align with the logic task executor
We need to define how to assign triggers to Masters. Since we use pull mode, all Masters will pull triggers from the DB, so we need to make sure each trigger is consumed by only one Master.
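One possible way to guarantee "one trigger, one Master" is an atomic claim step before a Master executes a trigger. The sketch below models that claim in memory with a ConcurrentHashMap. TriggerClaimTable and its methods are hypothetical names; a real version would presumably use a conditional UPDATE on the trigger row or the existing registry-based distributed lock, neither of which is shown here.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// In-memory stand-in for an atomic "owner" column on the trigger table.
// Assumption: a real implementation would do this in the DB or via the registry lock.
final class TriggerClaimTable {
    private final ConcurrentMap<Long, String> ownerByTriggerId = new ConcurrentHashMap<>();

    // Returns true only for the first master that claims the trigger; later callers get false.
    boolean tryClaim(long triggerId, String masterHost) {
        return ownerByTriggerId.putIfAbsent(triggerId, masterHost) == null;
    }

    // Releases the claim only if this master still owns it (e.g. after execution or on failover).
    boolean release(long triggerId, String masterHost) {
        return ownerByTriggerId.remove(triggerId, masterHost);
    }

    Optional<String> ownerOf(long triggerId) {
        return Optional.ofNullable(ownerByTriggerId.get(triggerId));
    }
}
```

Every Master can still pull the same rows; only the one whose tryClaim succeeds goes on to execute the trigger.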
Thanks for Wenjun's support.
Here's the current Quartz architecture. DolphinScheduler uses Quartz, driven by a timer trigger, to insert commands into the DB. The Quartz scheduler takes an exclusive lock while acquiring triggers and firing them to get the JobDetails, which it then executes in its hosted thread pool.
In effect it uses the DB as its distributed lock.
Currently Quartz does not expose a public interface that would let us implement our own trigger mechanism.
Here are the high-level steps to implement this feature:
RegistryClient
I am still considering a better solution that stays compatible with the Quartz scheduler in the first version, but it seems a bit hard. Luckily DS has only a few steps that depend on it.
Step #1: create related table schema for review
Step #2: create trigger load plugin manager
Step #3: backend development, add a test api for local testing, draft api review for testing
Step #4: frontend development
Is there any progress? ^_^
- Master fetches the t_ds_trigger_definition table for triggers.
- Execute the trigger; if it meets the user-defined condition, insert a command into t_ds_command. Note that this operation needs to be transactional.

Here's the question: our command fetching is currently id-based, and the key point is that when a ProcessInstance is generated, we delete the related command record.
https://github.com/apache/dolphinscheduler/blob/dev/dolphinscheduler-service/src/main/java/org/apache/dolphinscheduler/service/process/ProcessServiceImpl.java#L299
But we should not delete t_ds_trigger_definition rows in this scenario.

One solution is to add one more table that records the fetch offset. Once a master is alive, each time it first acquires the lock, then fetches the t_ds_trigger_definition table for triggers and executes them, and finally updates the offset in the DB together with the trigger_instance within one transaction.

More gracefully, we could do it in a consistent-hash way, just like memcached and brpc. https://github.com/apache/brpc/blob/master/src/brpc/policy/dynpart_load_balancer.cpp#L83
Considering its complexity, I will use method #1 for the implementation.
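A rough sketch of method #1 (the offset table) follows. Everything here is an in-memory stand-in and an assumption for illustration: OffsetTriggerFetcher is a hypothetical class, the ReentrantLock plays the role of the distributed lock, the id list plays the role of t_ds_trigger_definition, and the single critical section plays the role of the transaction that writes the command and the new offset together. How the scan restarts once the offset reaches the end of the table is left open.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.LongPredicate;

final class OffsetTriggerFetcher {
    private final ReentrantLock lock = new ReentrantLock();  // stand-in for the distributed lock
    private final List<Long> triggerDefinitionIds;           // stand-in for t_ds_trigger_definition, sorted by id
    private final List<Long> commands = new ArrayList<>();   // stand-in for t_ds_command
    private long offset = 0L;                                // stand-in for the proposed offset table

    OffsetTriggerFetcher(List<Long> sortedTriggerDefinitionIds) {
        this.triggerDefinitionIds = List.copyOf(sortedTriggerDefinitionIds);
    }

    // Evaluates up to batchSize definitions with id > offset and returns how many were evaluated.
    // Definitions are never deleted; only the offset moves forward.
    int fetchAndFire(int batchSize, LongPredicate conditionMet) {
        lock.lock();
        try {
            int processed = 0;
            for (long id : triggerDefinitionIds) {
                if (id <= offset) continue;          // already handled in an earlier fetch
                if (processed == batchSize) break;   // batch is full
                if (conditionMet.test(id)) {
                    commands.add(id);                // "insert command" step
                }
                offset = id;                         // offset advances together with the command write
                processed++;
            }
            return processed;
        } finally {
            lock.unlock();
        }
    }

    List<Long> commands() { return List.copyOf(commands); }
    long offset() { return offset; }
}
```

For example, with definitions 1 to 4, a batch size of 2 and a condition that only even ids satisfy, two calls leave the offset at 4 and produce commands for ids 2 and 4, while the definition list itself is untouched.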
@ruanwenjun Can you help guide this design?
Description
After looking at local and large distributed job-scheduling frameworks, I found that we could leverage our excellent plugin framework design and extend a key process in job scheduling, the Trigger, allowing users to define their own custom triggers in a distributed runtime environment.
Quartz has apparently considered this situation as part of the master scheduler, but it does not extend to distributed execution semantics. See https://www.quartz-scheduler.org/documentation/quartz-2.1.7/tutorials/tutorial-lesson-04.html
In a distributed job-scheduling environment, we could align all triggers into an event trigger.
Use case
This benefits from the excellent plugin design in DolphinScheduler. We should consider adding a trigger SPI plugin that is loaded by the master server.
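As a rough idea of what a trigger SPI loaded by the master could look like, here is a minimal sketch built on the standard java.util.ServiceLoader. TriggerPluginFactory and TriggerPluginManager are hypothetical names invented for this example and do not reflect DolphinScheduler's actual plugin loading code; each plugin jar is assumed to register its implementation under META-INF/services.

```java
import java.util.Map;
import java.util.Optional;
import java.util.ServiceLoader;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical SPI contract that a trigger plugin would implement.
interface TriggerPluginFactory {
    String name();                                   // unique plugin name, e.g. "quartz" or "http-poll"
    boolean shouldFire(Map<String, String> params);  // user-defined condition check
}

final class TriggerPluginManager {
    private final Map<String, TriggerPluginFactory> factories = new ConcurrentHashMap<>();

    // Discovers implementations on the classpath through the standard Java SPI mechanism.
    void loadFromClasspath() {
        for (TriggerPluginFactory factory : ServiceLoader.load(TriggerPluginFactory.class)) {
            register(factory);
        }
    }

    // Also allows explicit registration, which is convenient for local testing.
    void register(TriggerPluginFactory factory) {
        factories.put(factory.name(), factory);
    }

    Optional<TriggerPluginFactory> find(String name) {
        return Optional.ofNullable(factories.get(name));
    }
}
```

The master would call loadFromClasspath() once at startup and then look plugins up by the name stored with each trigger definition.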
Overview:
Detail Design
Controller Layer
Compatibility
DB Layer
t_ds_trigger
t_ds_schedules
Related issues
[DSIP-16][Task] Support stream task