gaia-pipeline / gaia

Build powerful pipelines in any programming language.
Apache License 2.0

Job outputs? #225

Open prologic opened 5 years ago

prologic commented 5 years ago

Going through the documentation, the examples, and the Go SDK (I assume the other language SDKs behave the same), it doesn't seem possible to have a job compute and return some output that you can use as input to the next job(s) in your pipeline.

The only thing a Job can do is take some input Arguments and return an error. I assume the design calls for inputs to be known up front.

What if a Job in my pipeline needs inputs whose values depend on a previous job? This doesn't seem possible right now. Is this by design? Would it add considerable complexity to something like Gaia?

In a "normal" gRPC/Protobuf service-oriented architecture (which Gaia is loosely based on, though it behaves more like serverless/FaaS) you would expect to be able to return some "output" from your service's endpoints/functions/etc.

Thoughts?

michelvocks commented 5 years ago

Hi @prologic!

You are absolutely right. This is currently not possible because we haven't implemented this feature yet. It has been on the list for a long time, but other features were prioritized over it. I will label this as a feature request so we can track its progress.

Cheers, Michel

prologic commented 5 years ago

Awesome! Thanks for the confirmation. Without this I wouldn't be able to replace all the gnarly Jenkins spaghetti I have here with a Gaia workflow and an appropriate implementation of "Jobs", so I'm looking forward to having this feature implemented!

I think the challenge here would be to define a "protocol" / "format" in which Jobs can "return output" in a sane and consistent way. My recommendation, based on the code I'm seeing and the architecture/design, is to have a "Context" object (similar to the Arguments object) into which the author of a job can insert arbitrary key/value pairs. To make use of this in a workflow/pipeline, the "context" would have to be persisted along the DAG.
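To make the "Context" idea concrete, here is a minimal Go sketch. It is not Gaia's SDK or scheduler: `Context`, `Job`, and `RunPipeline` are hypothetical names invented for illustration, and the in-memory map stands in for whatever persistence Gaia would need between jobs. It only shows the shape: each job receives the shared context, may add key/value pairs, and jobs later in the DAG can read them.

```go
package main

import (
	"errors"
	"fmt"
)

// Context is a hypothetical key/value store shared by the jobs of one pipeline run.
type Context map[string]string

// Job is a hypothetical job definition; it is NOT the Gaia SDK's Job type.
type Job struct {
	Title     string
	DependsOn []string
	Handler   func(ctx Context) error
}

// RunPipeline runs each job once all of its dependencies have finished
// and hands the same Context to every handler.
func RunPipeline(jobs []Job) (Context, error) {
	ctx := Context{}
	done := map[string]bool{}
	for len(done) < len(jobs) {
		progressed := false
		for _, j := range jobs {
			if done[j.Title] || !allDone(j.DependsOn, done) {
				continue
			}
			if err := j.Handler(ctx); err != nil {
				return ctx, fmt.Errorf("job %q: %w", j.Title, err)
			}
			done[j.Title] = true
			progressed = true
		}
		if !progressed {
			// No job could run in this pass: a dependency is missing or cyclic.
			return ctx, errors.New("unresolvable or cyclic dependencies")
		}
	}
	return ctx, nil
}

// allDone reports whether every named dependency has already finished.
func allDone(deps []string, done map[string]bool) bool {
	for _, d := range deps {
		if !done[d] {
			return false
		}
	}
	return true
}

func main() {
	jobs := []Job{
		{Title: "deploy", DependsOn: []string{"provision"}, Handler: func(ctx Context) error {
			host, ok := ctx["host"] // value written by the "provision" job
			if !ok {
				return errors.New(`missing key "host"`)
			}
			ctx["url"] = "https://" + host
			return nil
		}},
		{Title: "provision", Handler: func(ctx Context) error {
			ctx["host"] = "example.internal" // output made available to later jobs
			return nil
		}},
	}
	ctx, err := RunPipeline(jobs)
	if err != nil {
		fmt.Println("pipeline failed:", err)
		return
	}
	fmt.Println(ctx["url"])
}
```

In this sketch the context lives only in memory for a single run. Since Gaia runs jobs through gRPC plugins, a real implementation would presumably have to serialize and store these values between job executions.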

prologic commented 4 years ago

Any updates on this?

Skarlso commented 4 years ago

Hi @prologic. Unfortunately both of us have been very busy with life lately and some other commitments. If I recall correctly, @michelvocks first would like the docker executor in because that's a massive change. That will go in sometime this week and then we can start working on something else. :)

Also, I'm hoping my schedule will get better in a few weeks or so, then I can start concentrating on Gaia a little bit more again. :) That would be nice as I have a few things on my list that I would like to work on. :D Cheers for your patience.

prologic commented 4 years ago

> Hi @prologic. Unfortunately both of us have been very busy with life lately and some other commitments. If I recall correctly, @michelvocks first would like the docker executor in because that's a massive change. That will go in sometime this week and then we can start working on something else. :)

Are there future plans for a Kubernetes/Nomad executor too at some point?

prologic commented 4 years ago

> Also, I'm hoping my schedule will get better in a few weeks or so, then I can start concentrating on Gaia a little bit more again. :) That would be nice as I have a few things on my list that I would like to work on. :D Cheers for your patience.

Sounds good. Let me know if I can help in any way, docs, design testing, etc.

Skarlso commented 4 years ago

> > Hi @prologic. Unfortunately both of us have been very busy with life lately and some other commitments. If I recall correctly, @michelvocks first would like the docker executor in because that's a massive change. That will go in sometime this week and then we can start working on something else. :)
>
> Are there future plans for a Kubernetes/Nomad executor too at some point?

Not that I know of.

Could you elaborate on what you mean by a Kubernetes executor? Do you mean a CRD + Operator?

prologic commented 4 years ago

> Could you elaborate on what you mean by a Kubernetes executor? Do you mean a CRD + Operator?

Maybe it would help if you described what this docker executor is? Or point me to a PR or Issue?

Skarlso commented 4 years ago

Sorry @prologic I missed your reply! Here is the PR: https://github.com/gaia-pipeline/gaia/pull/201 :) It has been merged. So we can now move on with other things. My other project is also done-ish so I'm going to focus on Gaia some more. ;)

prologic commented 4 years ago

That's awesome! Great job!

Skarlso commented 4 years ago

Alright. Let's take a look at this. :)

Skarlso commented 4 years ago

So the way I see it, it's possible that a job can have a return value, but that value would have to be very generic as jobs could have multiple types of outputs...

I propose a list of key value pairs. Something like, "DNS": "whatever.com". And your job which is waiting for something knows what it wants so it can look for a key like "DNS".

@prologic @michelvocks What do you think?
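As a rough illustration of the key/value proposal, the Go sketch below has one job produce a list of pairs and a later job look up the key it needs. All names here (`OutputPair`, `Lookup`, `createDNSRecord`, `configureProxy`) are made up for the example and are not part of Gaia or its SDKs. Letting the last matching pair win on duplicate keys is likewise an assumption.

```go
package main

import "fmt"

// OutputPair is one key/value entry produced by a job (hypothetical type).
type OutputPair struct {
	Key   string
	Value string
}

// Lookup returns the value of the last pair with the given key,
// so a later entry overrides an earlier one with the same key.
func Lookup(pairs []OutputPair, key string) (string, bool) {
	for i := len(pairs) - 1; i >= 0; i-- {
		if pairs[i].Key == key {
			return pairs[i].Value, true
		}
	}
	return "", false
}

// createDNSRecord stands in for a job that produces output.
func createDNSRecord() ([]OutputPair, error) {
	return []OutputPair{{Key: "DNS", Value: "whatever.com"}}, nil
}

// configureProxy stands in for a downstream job that knows which key it needs.
func configureProxy(inputs []OutputPair) (string, error) {
	dns, ok := Lookup(inputs, "DNS")
	if !ok {
		return "", fmt.Errorf("required key %q not found", "DNS")
	}
	return "proxying to " + dns, nil
}

func main() {
	out, err := createDNSRecord()
	if err != nil {
		fmt.Println("producer failed:", err)
		return
	}
	msg, err := configureProxy(out)
	if err != nil {
		fmt.Println("consumer failed:", err)
		return
	}
	fmt.Println(msg)
}
```

Keeping values as plain strings keeps the format generic, at the cost of each consuming job having to parse anything more structured itself.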

Skarlso commented 4 years ago

Something like...

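// All key/value pairs returned by a job.
message Output {
   repeated OutputValue items = 1;
}

// A single key/value pair, e.g. key "DNS", value "whatever.com".
message OutputValue {
   string key = 1;
   string value = 2;
}

prologic commented 4 years ago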

> I propose a list of key value pairs. Something like, "DNS": "whatever.com". And your job which is waiting for something knows what it wants so it can look for a key like "DNS".

This sounds perfect!

prologic commented 4 years ago

Yup just a simple KV map would work very nicely here. I would not support anything beyond this.

Skarlso commented 4 years ago

@prologic Almost done. :) Now need to test this thing and write some unit tests and have a review from Michel. :)

prologic commented 4 years ago

> @prologic Almost done. :) Now need to test this thing and write some unit tests and have a review from Michel. :)

I'm more than happy to spin up a new Gaia instance to test your PR too :) if that helps!

Skarlso commented 4 years ago

Cool. :) You'll have to build it though, because I had to edit the import paths and such. If you're okay with that, that would be very helpful. :)

prologic commented 4 years ago

> Cool. :) You'll have to build it though, because I had to edit the import paths and such. If you're okay with that, that would be very helpful. :)

Sure no problems! Just make sure your PR has a "Test Plan" I can follow and I'll find some time to test your stuff this week :)

Skarlso commented 4 years ago

> > Cool. :) You'll have to build it though, because I had to edit the import paths and such. If you're okay with that, that would be very helpful. :)
>
> Sure no problems! Just make sure your PR has a "Test Plan" I can follow and I'll find some time to test your stuff this week :)

Absolutely. I'll update the PR with detailed instructions. :)