[DCOS-59720] Introduce Spark Job for MWT

mesosphere / spark-build

Used to build the mesosphere/spark docker image and the DC/OS Spark package

What changes were proposed in this pull request?

Resolves DCOS-59720 [DS] [Spark Operator] Create a better Spark Job for MWT

This PR introduces two Spark applications:

1) DatasetGenerator - creates a dataset with a specified record count and record size and writes the result to an S3 bucket.
2) DatasetSort - reads the data from the S3 location and performs a sort operation on the resulting DataFrame.
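The shape of this generate-then-sort workload can be sketched without Spark. The snippet below is a minimal standard-library illustration, not the code from this PR: the function names `generate_dataset` and `sort_dataset`, the one-record-per-line text format, and the use of a local temporary file in place of the S3 location are all assumptions made for the example.

```python
import os
import random
import string
import tempfile


def generate_dataset(path, num_records, record_size, seed=0):
    """Write num_records lines, each record_size random characters long.

    Hypothetical stand-in for DatasetGenerator; the real job writes to S3.
    """
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase + string.digits
    with open(path, "w", encoding="ascii") as out:
        for _ in range(num_records):
            out.write("".join(rng.choices(alphabet, k=record_size)) + "\n")


def sort_dataset(path):
    """Read every record from path and return them in sorted order.

    Hypothetical stand-in for DatasetSort; the real job sorts a DataFrame.
    """
    with open(path, encoding="ascii") as src:
        records = [line.rstrip("\n") for line in src]
    return sorted(records)


# Usage: a local temporary file stands in for the S3 location.
with tempfile.TemporaryDirectory() as workdir:
    dataset_path = os.path.join(workdir, "dataset.txt")
    generate_dataset(dataset_path, num_records=1000, record_size=100)
    sorted_records = sort_dataset(dataset_path)
```

Record count and record size together fix the total data volume (here about 1000 x 100 bytes plus newlines), which is presumably why the generator exposes both as parameters.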

How were these changes tested?

- The Spark applications were tested locally.
- Integration tests were run in CI.

Release Notes

n/a

[DCOS-59720] Introduce Spark Job for MWT #556
