mean-street / DistributedSystemForDataManagement


Docker Spark #3

Closed Vayel closed 6 years ago

Vayel commented 6 years ago

https://hub.docker.com/r/epahomov/docker-spark/

Vayel commented 6 years ago
$ docker pull epahomov/docker-spark:lightweighted
$ git clone https://github.com/Mean-Street/DistributedSystemForDataManagement
$ cd DistributedSystemForDataManagement/spark
$ sbt package
$ docker run -it -p 4040:4040 \
    -v "$(pwd)/target/scala-2.11/sdtd_2.11-1.0.jar:/sdtd.jar" \
    -v "$(pwd)/data/lines.txt:/data.txt" \
    epahomov/docker-spark:lightweighted \
    /spark/bin/spark-submit --class "DataCleaning" /sdtd.jar
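
One thing that can go wrong with the commands above: if `sbt package` failed or the jar name differs, the path given to `-v` does not exist, and as far as I know Docker then creates an empty directory at that path instead of reporting an error, so `spark-submit` fails inside the container with a confusing message. A minimal sketch of a preflight check, assuming it is run from the `spark/` directory; the function name `check_mount_sources` is made up for illustration and is not part of the repository:

```shell
#!/bin/sh
# Hypothetical helper (not from the repository): return 0 only if every
# path passed as an argument is an existing regular file.
check_mount_sources() {
    status=0
    for path in "$@"; do
        if [ ! -f "$path" ]; then
            # Report each missing file on stderr and remember the failure.
            echo "missing file: $path" >&2
            status=1
        fi
    done
    return "$status"
}
```

It could be called right before `docker run`, for example `check_mount_sources target/scala-2.11/sdtd_2.11-1.0.jar data/lines.txt && docker run ...`, so the container only starts when both files to mount are actually there.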