linkedin / linkedin-gradle-plugin-for-apache-hadoop

Apache License 2.0
117 stars 76 forks source link

First PR for Yaml output for Flow 2.0 #196

Closed reallocf closed 6 years ago

reallocf commented 6 years ago

This PR introduces the YamlCompiler, YamlWorkflow, YamlJob, and YamlProject objects and a suite of unit tests to cover them.

More details here: #193

jamiesjc commented 6 years ago

Is it possible to also add test cases to verify yaml file generated from workflow gradle? Something similar to: hadoopdsl/linkedin-gradle-plugin-for-apache-hadoop/hadoop-plugin-test/src/main/gradle/positive/workflows.gradle

nntnag17 commented 6 years ago

Hi Charlie,

I and Pranay will have a look at the code once we have the design review done internally. I think we should also hold off any further changes till the design is reviewed and finalized.

reallocf commented 6 years ago

Hey Anant,

Thank you! I think the Flow 2.0 design is mostly complete. @jamiesjc has the latest updates. We'll get back to you with more info next week after thanksgiving. I think Jamie is off through late next week, so might not be until the week after.

Or do you think the hadoopDSL upgrade deserves its own design doc? We haven't thought it necessary up to this point because many of the details are described in Jamie's doc.

reallocf commented 6 years ago

@jamiesjc I'll be adding a number of end-to-end tests validating proper yaml conversion in the next PR. I'm doing it then because the PR will include the infrastructure necessary to manually specify which output (.flow/.project or .job/.property) the user will want to use.

This infrastructure must be in place in order to properly test the entire pipeline (the tests will specify that they want the yaml outputs)

jamiesjc commented 6 years ago

The changes look good to me. Left some minor comments.

reallocf commented 6 years ago

I've already prepared a second commit that I'll send in another PR for integrating this change with the hadoopDSL, adding the ability for users to specify if they want to generate YAML or not, and adding integration tests.

The only meaningful thing that is different in that PR is: In YamlCompiler, visitProject() is guaranteed to only be called once per execution of the hadoopDSL, instead of once for every visited ScopeContainer. The infrastructure to make that happen is introduced in the new PR, so I didn't include it here.

reallocf commented 6 years ago

@nntnag17 if you or somebody from your team could take a look at this over the next few days, I'd really appreciate it! :smiley:

nntnag17 commented 6 years ago

Hey @reallocf ,

I've started the review. Will finish it soon.

pranayhasan commented 6 years ago

I too have started it. Would complete my review over the weekend.

reallocf commented 6 years ago

Anything else you need from me on this PR? Thanks for everybody's help!

reallocf commented 6 years ago

@nntnag17 @pranayhasan any further suggestions?