Closed tusharbhasme closed 7 years ago
Great idea indeed!
In addition to job chaining, branching and parallel execution are also common requirements, but implementing all this stuff would lead to developing a complete workflow engine to orchestrate easy batch jobs.
Another solution is to try to write plugins for existing open source workflow engines in order to not reinvent the wheel :-)
Totally agreed, but I am afraid adding everything can render it as NotSo-EasyBatch. :-)
IMO, chaining does not need much work and will keep it "Easy". It could be as simple as just decorating Engine, to an Engine containing an Engine.
As far as branching/parallel execution is concerned, right now I cannot think of any easy way to add it.
an Engine containing an Engine
Yeah, we can speak about easy batch inception :-)
More seriously, as you said, developing the whole stuff would lead to a complex solution and this is not the DRY KISS philosophy behind easy batch ;-)
Chaining is not that difficult to implement, we could imagine a simple DSL to build job pipelines with the ability to start a job only if the previous one has finished/aborted under certain conditions. I think about something like:
JobPipeline jp = JobPipelineBuilder.aNewJobPipeline()
    .startWith(job1)
    .when(job1HasFinished).then(job2)
    .when(job2HasProcessed80PercentOfRecords).then(job3)
    .build();
Report finalReport = jp.run();
We can imagine a callback to be implemented by the user to specify in which condition start (or not) the next job in the pipeline:
public interface JobExecutionPredicate {
    boolean apply(final Report previousJobReport);
}
In the previous example, job1HasFinished and job2HasProcessed80PercentOfRecords implement this interface, and job1, job2, job3 are easy batch instances.
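To make the callback idea more concrete, here is a minimal sketch of what two such predicates might look like. Only the JobExecutionPredicate interface comes from the proposal above; the Report class is a hypothetical stand-in with made-up fields (it is not the Easy Batch report API), and the Predicates helper and its method names are assumptions for illustration.

```java
// Hypothetical stand-in for a job report; NOT the Easy Batch Report class.
final class Report {
    final boolean finished;
    final long totalRecords;
    final long processedRecords;

    Report(boolean finished, long totalRecords, long processedRecords) {
        this.finished = finished;
        this.totalRecords = totalRecords;
        this.processedRecords = processedRecords;
    }
}

// The callback proposed above.
interface JobExecutionPredicate {
    boolean apply(final Report previousJobReport);
}

// Hypothetical factory for predicates like the ones named in the DSL example.
final class Predicates {
    private Predicates() { }

    // True when the previous job ran to completion.
    static JobExecutionPredicate hasFinished() {
        return report -> report.finished;
    }

    // True when at least the given percentage of records was processed.
    static JobExecutionPredicate hasProcessedAtLeast(int percent) {
        return report -> report.totalRecords > 0
                && report.processedRecords * 100 >= report.totalRecords * percent;
    }
}
```

With these helpers, job2HasProcessed80PercentOfRecords from the example would simply be Predicates.hasProcessedAtLeast(80).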
It's just a first idea. Feel free to share your thoughts!
Very nice idea! But I still want to keep the inception idea and make
class JobPipeline implements Engine {
so that we can
JobPipelineBuilder.aNewJobPipeline()
    .startWith(JobPipelineBuilder.aNewJobPipeline().startWith(job1).then(job2).build())
    .when(predicate1).then(job2)
    .when(predicate2).then(job3)
    .build();
The information contained in the Report of a JobPipeline engine may be different from the information of an Engine.
To pass the job information through predicate, we could even pass the whole job to the predicate:
JobPipelineBuilder.aNewJobPipeline()
    .when(EnginePredicate.Builder(JobPipelineBuilder.aNewJobPipeline().startWith(job1).then(job2).build()).build()).proceed()
    .thenStart(EnginePredicate.Builder(job3).build())
    .when(EnginePredicate.Builder(job4).build()).proceed()
    .thenStart(EnginePredicate.Builder(job5).build())
Just curious: is this something like mixing easy-rules with easy batch, where when is nothing but evaluating a condition, and then is nothing but executing an action when the condition returns true?
There is no mixing of easy-rules with easy-batch. when/then are just logical method names for executing the next steps when a condition is met.
Hey guys, we've got a new concept: Meta-inception! An easy batch engine running inside another easy batch engine which in turn is running inside an easy rules engine which in turn is running inside a ... Ok, I stop, Just kidding :-)
@gs-spadmanabhan In fact, we can implement the JobPipeline idea using Easy Rules. Since Easy Rules triggers rules in sequence, it can be seen as a conditional pipeline. Just for the fun, I've implemented it here, and it works! What do you think?
@tusharbhasme Even though making JobPipeline implement the Engine interface is technically possible, I do believe they represent different concepts at different levels of abstraction. So mixing both concepts would be a bit confusing. Do you agree?
The most important part is to design a simple to read DSL to create the pipeline. The implementation itself is not that difficult (be it based on Easy Rules or not)
@benas I would still suggest to bring JobPipeline under Engine since JobPipeline is nothing but an engine running engines. The main reason behind this is that we can then easily create a chain job of chain jobs.
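As an illustration of the "engine running engines" argument, here is a minimal sketch of the composite idea. The Engine interface below is a hypothetical stand-in (it is not the Easy Batch Engine API), and this JobPipeline is an assumption about how such a class could look, not the implementation discussed in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the Easy Batch engine: run() returns true on success.
interface Engine {
    boolean run();
}

// A pipeline that is itself an Engine, so pipelines can be nested in pipelines.
final class JobPipeline implements Engine {
    private final List<Engine> steps;

    JobPipeline(Engine... steps) {
        this.steps = new ArrayList<>(Arrays.asList(steps));
    }

    // Runs the steps in order and stops at the first one that fails.
    @Override
    public boolean run() {
        for (Engine step : steps) {
            if (!step.run()) {
                return false;
            }
        }
        return true;
    }
}

final class NestedPipelineDemo {
    // Builds a "chain of chains" and returns the order in which the jobs ran.
    static List<String> runNested() {
        List<String> log = new ArrayList<>();
        Engine job1 = () -> { log.add("job1"); return true; };
        Engine job2 = () -> { log.add("job2"); return true; };
        Engine job3 = () -> { log.add("job3"); return true; };

        Engine inner = new JobPipeline(job1, job2);   // a chain
        Engine outer = new JobPipeline(inner, job3);  // a chain of chains
        outer.run();
        return log;
    }
}
```

Because the outer pipeline only sees the Engine interface, it cannot tell a single job from a nested chain, which is what makes the nesting cheap. The trade-off raised above still applies: a report for a pipeline would likely carry different information than a report for a single job.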
@benas, I think the implementation makes sense, validates my theory ;-). As you said keeping DSL simple is the key whether it gets implemented in easy rules or not.
@tusharbhasme, I get that building under EasyBatch engine will result in less confusion, agreed. But I just proposed an alternative which is already existing.
What's running through my mind:
To make this possible, merging the 2 frameworks seems to be a good idea. Even though the 2 frameworks represent different concepts, I see value in combining them into a single framework. That's all folks, my rambling is over. Thoughts?
Hey @benas, any updates over this requirement? It would be great if we could schedule the chain too.
Hey,
I like this, since I had nearly the same Request.
@tusharbhasme I wouldn't mix in scheduling in here since there are many frameworks which provide easy scheduling. We achieved this by using Spring Scheduling.
@benas I really like the easyRule approach! This is definitely what I'm looking for.
BR
Hello guys,
Now that version 4 is out (and what a release! Easy Batch has never been easier to understand and use), we can forget about the term engine and all the confusion it brought. The Easy Batch engine has been renamed to Job (issue #141); this is a less confusing and more natural name.
There are 4 key concepts around jobs:
JobBuilder: main entry point to build jobs
JobExecutor: execute jobs
Job Monitor: monitor jobs
Job scheduler: schedule jobs
I was thinking about a new concept, JobOperator or JobOrchestrator, inspired by JSR 352, section 7.3, which would be responsible for orchestrating jobs: chaining, branching, etc. In my opinion, as discussed in #128, it would be a kind of "super" ExecutorService that can start/stop/cancel jobs as well as orchestrate them (conditional execution, chaining, etc.). What are your thoughts on that?
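To give a feel for the "super" ExecutorService idea, here is a rough sketch that layers conditional chaining on top of a plain java.util.concurrent.ExecutorService. The JobOrchestrator class and its chain method are hypothetical names invented for this illustration; nothing here is an existing Easy Batch API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Predicate;

// Hypothetical orchestrator: a thin layer over a plain ExecutorService.
final class JobOrchestrator {
    private final ExecutorService executor;

    JobOrchestrator(ExecutorService executor) {
        this.executor = executor;
    }

    // Runs `first`; if `condition` accepts its result, runs `next` and returns
    // that result. Otherwise `next` is skipped and the first result is returned.
    <T> T chain(Callable<T> first, Predicate<T> condition, Callable<T> next) throws Exception {
        T firstResult = executor.submit(first).get();
        if (condition.test(firstResult)) {
            return executor.submit(next).get();
        }
        return firstResult;
    }
}

final class JobOrchestratorDemo {
    // Returns the result of a chain that proceeds and of one that is cut short.
    static List<String> demo() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            JobOrchestrator orchestrator = new JobOrchestrator(pool);
            Callable<String> job1 = () -> "COMPLETED";
            Callable<String> job2 = () -> "job2 ran";
            String taken = orchestrator.chain(job1, "COMPLETED"::equals, job2);
            String skipped = orchestrator.chain(job1, "FAILED"::equals, job2);
            return Arrays.asList(taken, skipped);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The jobs stay plain Callable objects, so start/stop/cancel remain the executor's business and the orchestrator only decides what to submit next.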
I didn't find an open source workflow engine that can orchestrate plain java.util.concurrent.Callable objects. Do you know of such a workflow engine?
Regards Mahmoud
Hi there,
First of all, congratulations! I am excited to see the changes in version 4. I haven't gone through the code base yet, but I will find time to do so.
I don't know much about workflow engines; the ones I have heard of are jBPM and activiti. I don't know the internals of those frameworks since they are not easy :) and they all pretty much deal with a lot of XML configuration.
Hi, Congratulations for the new Release.
Can you explain why you want to integrate a workflow engine?
To stop and start a Callable you can just use plain Java methods. To have the Preconditions I'd go for the easy rule solution.
The hard part would possibly be the branching and chaining.
Hi,
Thank you!
Can you explain why you want to integrate a workflow engine?
The goal is to orchestrate jobs to create complex workflows (branching & chaining). Easy Batch is designed to create and execute jobs but not to orchestrate them.
I was reading an interesting discussion on the Spring Batch forum where a user asked how to deal with complex workflows of branching/chaining with Spring Batch. The project lead recommended not using Spring Batch for job orchestration because it was not designed for that. I totally agree with him and this is also the case for Easy Batch.
To stop and start a Callable you can just use plain Java methods.
Sure! This is what we discussed together in issue #128:
"A job is a unit of work that can be submitted to an executor service which is responsible for its life cycle (start, stop, cancel, calculate progress, handle timeout, etc)." So yes we can do it in plain Java.
But the idea is to have a "super enhanced" ExecutorService that provides a DSL to do more than just start/stop/cancel jobs, namely to orchestrate them, hence the proposed name JobOrchestrator.
A good example is Flo for Spring XD (demo). Do you see the idea?
@gs-spadmanabhan Thx! I was aware of jBPM and activiti. I agree with you, not so easy ..
Yeah sure, got that @benas, but what I mean is that it's basically about creating a kind of workflow. I'd go for implementing it with workflow design patterns rather than integrating another framework.
Therefore I would start by identifying the use cases and limit it to them in the first run. For example chaining and branching.
Yeah sure, integrating with an existing workflow engine is one possibility among others, in this issue we are trying to find the best way to implement Job orchestrating with a KISSable approach :wink:
Currently, here are the options:
I think we can stick and start with a very basic approach for chaining as requested at first by @tusharbhasme and provide something like:
JobPipeline jp = JobPipelineBuilder.aNewJobPipeline()
    .startWith(job1)
    .when(job1HasFinished).then(job2)
    .when(job2HasProcessed80PercentOfRecords).then(job3)
    .run();
After all, Easy Batch jobs are simple pipelines of record processors, so why not take the idea to the upper abstraction level (Job) and introduce JobPipeline? This will exactly implement the requested feature JobChaining (title of this issue).
Do you agree?
Any updates?
Nope, I didn't have time to work on this feature.
Why don't we create an EasyWorkflow project ourselves?
Hi,
Really sorry for not giving any update on this, I'm a bit in this situation ..
Good idea indeed! This would be a great community-driven effort to add flows support to Easy Batch. I would name it EasyFlow: a simple, stupid workflow engine for Easy Batch :smile: Don't hesitate to suggest ideas or a working prototype, you are very welcome.
Thank you Sunand! Best regards Mahmoud
Sunand,
I've published an unfinished working prototype (that I stashed a few months ago) in branch feature-66. You can already build job pipelines. See example here. There are some built-in predicates already.
The API is under the org/easybatch/core/flow package. It can be a good starting point for you.
Looking forward to your feedback.
Kind regards Mahmoud
@tusharbhasme @MALPI
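I've published version 4.1.0-SNAPSHOT with a working prototype of the JobPipeline API. Currently, it supports job chaining as requested. See example here: https://github.com/EasyBatch/easybatch-framework/blob/feature-66/easybatch-core/src/test/java/org/easybatch/core/flow/JobPipelineTest.java#L27
Don't hesitate to give it a try. Looking forward to your feedback and suggestions.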
Kind regards Mahmoud
Awesome!!!
@benas Apologies for the delay in my response, this version of what you have put up is really good. But I still feel a job should have 2 states onSuccess and onFailure.
Solution
{
"name" : "WF1",
"description" : "Test Workflow Spec",
"type" : "DIRECT",
"jobs" : [ {
"jobName" : "Job1",
"jobDescription" : "Do First Job",
"parameters" : null,
"onSuccess" : [ "Job2" ],
"onFailure" : [ "Job3" ],
"onCompleted" : null,
"position" : 0
}, {
"jobName" : "Job2",
"jobDescription" : "Do Second job if the first job succeeds",
"parameters" : null,
"onSuccess" : null,
"onFailure" : [ "Job3" ],
"onCompleted" : null,
"position" : 1
}, {
"jobName" : "Job3",
"jobDescription" : "If this executes then first job has failed",
"parameters" : null,
"onSuccess" : null,
"onFailure" : null,
"onCompleted" : null,
"position" : 2
} ]
}
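To show how such a spec could drive execution, here is a small sketch that walks the onSuccess/onFailure links. It skips JSON parsing and builds the WF1 graph directly in code; the JobNode and WorkflowRunner classes are hypothetical names, null lists from the spec are represented as empty lists, and the "type", "parameters", "onCompleted" and "position" fields are ignored for brevity.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;

// Hypothetical in-memory form of one entry of the "jobs" array above.
final class JobNode {
    final String name;
    final Callable<Boolean> work;   // returns true on success
    final List<String> onSuccess;   // jobs to run next on success
    final List<String> onFailure;   // jobs to run next on failure

    JobNode(String name, Callable<Boolean> work, List<String> onSuccess, List<String> onFailure) {
        this.name = name;
        this.work = work;
        this.onSuccess = onSuccess;
        this.onFailure = onFailure;
    }
}

final class WorkflowRunner {
    private final Map<String, JobNode> nodes = new HashMap<>();

    void add(JobNode node) {
        nodes.put(node.name, node);
    }

    // Runs the workflow from the given job and returns job names in execution order.
    List<String> run(String start) {
        List<String> executed = new ArrayList<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.add(start);
        while (!pending.isEmpty()) {
            String name = pending.poll();
            if (executed.contains(name)) {
                continue; // each job runs at most once
            }
            JobNode node = nodes.get(name);
            if (node == null) {
                throw new IllegalArgumentException("Unknown job: " + name);
            }
            boolean success;
            try {
                success = node.work.call();
            } catch (Exception e) {
                success = false; // an exception counts as a failure
            }
            executed.add(name);
            pending.addAll(success ? node.onSuccess : node.onFailure);
        }
        return executed;
    }

    // Builds the WF1 workflow from the spec, with configurable job outcomes.
    static WorkflowRunner wf1(boolean job1Succeeds, boolean job2Succeeds) {
        List<String> none = Collections.emptyList();
        WorkflowRunner runner = new WorkflowRunner();
        runner.add(new JobNode("Job1", () -> job1Succeeds, Arrays.asList("Job2"), Arrays.asList("Job3")));
        runner.add(new JobNode("Job2", () -> job2Succeeds, none, Arrays.asList("Job3")));
        runner.add(new JobNode("Job3", () -> true, none, none));
        return runner;
    }
}
```

For example, WorkflowRunner.wf1(true, false).run("Job1") runs Job1, then Job2, then falls back to Job3 because Job2 failed.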
I am trying to put up some sample code but it's taking time. Thoughts and suggestions welcome!
Hi Sunand,
This is a GREAT idea. I was also thinking of such a kind of DSL (inspired by the Jenkins pipeline DSL). But it is not that easy to implement, as you may agree. As you said, the JobPipelineBuilder API I provided should add an "else" method to handle both states of the job. I'll see how to add this.
Another way to design job pipelines is something like: job1 && job2 || job3. This is inspired by Unix job control syntax. What an elegant syntax! My idea was to use an expression language (MVEL, Spring EL, or whatever). If only I had time to work on this issue all day long :rage:
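As a rough illustration of the job1 && job2 || job3 idea, the sketch below leans on Java's own short-circuit operators instead of an expression language such as MVEL or Spring EL, so it only approximates what such a DSL could do. Each job is modelled as a BooleanSupplier that returns true on success; the ShellStyleChaining class and its helpers are hypothetical names for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

final class ShellStyleChaining {
    private ShellStyleChaining() { }

    // Wraps a job so that running it records its name and yields a fixed
    // outcome (for illustration only; a real job would do actual work).
    static BooleanSupplier job(String name, boolean succeeds, List<String> log) {
        return () -> {
            log.add(name);
            return succeeds;
        };
    }

    // Evaluates "job1 && job2 || job3" and returns the jobs that actually ran.
    // job2 runs only if job1 succeeded; job3 runs only if "job1 && job2" failed.
    static List<String> run(boolean job1Succeeds, boolean job2Succeeds) {
        List<String> log = new ArrayList<>();
        BooleanSupplier job1 = job("job1", job1Succeeds, log);
        BooleanSupplier job2 = job("job2", job2Succeeds, log);
        BooleanSupplier job3 = job("job3", true, log);

        boolean overall = job1.getAsBoolean() && job2.getAsBoolean() || job3.getAsBoolean();
        log.add("overall=" + overall);
        return log;
    }
}
```

In Java, && binds tighter than ||, so the expression groups as (job1 && job2) || job3, which matches the left-to-right reading of this particular shell expression. Longer mixed chains can group differently in a shell, which is one reason a dedicated expression language might still be the better fit.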
Regards Mahmoud
Hi Sunand,
I've tried to add an else in the DSL to be able to write something like:
JobPipeline jobPipeline = aNewJobPipelineBuilder()
    .startWith(job1)
    .when(predicate1).then(job2).else(job3)
    .when(predicate2).then(job4).else(job5)
    .when(predicate3).then(job6).else(job7)
    .build();
Even with this, it is NOT possible to achieve a comprehensive flow like I was expecting:
The reason is that the predicate defined by the user is applied to the last executed job, which is unknown at runtime with this new branching model. In the example above:
So the syntax is not correct and does not lead to the expected graph.
I do believe the best approach is to have a real graph of jobs like you did in your PR. But this is a lot of work and I really appreciate your effort. As you said, this is actually the scope of another project. I saw you already prepared a repo for that :wink: So I propose you lead the development of the solution (in PR #189) as a separate project and I will do my best to contribute. What do you think?
My attempt to provide a JobPipeline API works only for sequential job execution (hence the name pipeline in the API, or else it would be JobFlow):
The predicate is the condition to continue to next job in the pipeline, otherwise, next jobs are skipped. Pretty basic, but as discussed, it is a first step toward implementing job chaining like requested first by @tusharbhasme and @MALPI .
Cheers Mahmoud
Hi Mahmoud,
Thanks for taking the time to evaluate the proposal. I did create the repo, but then I thought it really required all the Job, JobReport, JobResult and JobParameters classes (with the job class being able to take a certain definition), on top of which this graph can sit and dictate the workflow. If I write it separately, will I be able to use it with EasyBatch?
I got really confused. But I will try to write it using these classes; then again, it will have slow progress due to my current job.
Thanks again for validating the same.
Sunand
Hi Sunand,
Just add the easybatch-core module as a dependency in your project and you can use these APIs; they are public. This is how I developed all the extensions.
The example of DAG in my last comment is easy to create with your approach and not even possible with the JobPipeline API I've introduced. So as I said, your approach is the way to go (taking into account the couple of notes we've discussed in #189).
it will have slow progress due to my current job.
Same here. I'm really really afraid to not be able to work full time on this. But don't worry, take your time and keep me informed when you are ready, I'll be very happy to give you credits on that effort!
Best regards Mahmoud
@tusharbhasme @MALPI Have you got a chance to test this feature?
Would love to get your feedback
Hey Mahmoud,
I am now not a part of the project that needed it but I am glad this feature has been added. I will still try to test this feature with code I have and pass this info to the team working on it!
Thanks, Tushar Bhasme
Hi @tusharbhasme @gs-spadmanabhan @MALPI
Finally I was able to release easy-flows. It provides everything we discussed here (chaining, branching, etc.) easily 😉
I didn't find an open source workflow engine that can orchestrate plain java.util.concurrent.Callable objects. Do you know of such a workflow engine?
Easy Flows is what I couldn't find after a lot of searching on the net. I really don't understand why every single workflow engine out there is trying to implement BPMN. There is nothing wrong with this notation, but it is not simple (538 pages spec??) and getting started is not easy with the current de facto engines.
Anyway, Easy Batch jobs are callable objects and can be orchestrated with Easy Flows. All projects of jeasy are designed to work well together.
Let me go back to first comment of this issue.
Each job would define a set of requirements for it to execute eg, status of previous job in the chain.
In Easy Flows, a WorkReportPredicate is what you are looking for.
This concept could easily be extended to create a chain of chain jobs.
A WorkFlow in Easy Flows extends the Work concept, so flows are composable by design.
I hope this new library helps the community.
I'm closing this issue for now.
Kind regards Mahmoud
This is a very common job scenario where we want to run jobs in sequence/chain. Each job would define a set of requirements for it to execute eg, status of previous job in the chain. The report of the chain would contain the status of each job, no of jobs processed successfully, overall status of the chain, etc. This concept could easily be extended to create a chain of chain jobs.