USCDataScience / sparkler

Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
http://irds.usc.edu/sparkler/
Apache License 2.0
411 stars, 143 forks

Dockerize Sparkler #40

Closed: karanjeets closed this issue 7 years ago

karanjeets commented 7 years ago

Dockerize Sparkler for "Pick and Crawl" framework.

@thammegowda: As discussed, please work on this.

thammegowda commented 7 years ago

Update: Docker is a good option when we need only a single machine, so it suits local mode.

For cluster / distributed mode, Docker becomes overkill. A better option is Ubuntu Juju charms: https://jujucharms.com/

I am investigating Juju charms.

pengale commented 7 years ago

If you want to go the charm route, the rest of the Big Data team at Canonical and I are happy to help out. You can find me in #juju on freenode as petevg. Feel free to ask questions there, or call me out on any PRs in this repo.

thammegowda commented 7 years ago

We have a Docker setup for local testing (#52), so I am closing this issue.

However, we are also working on Juju charms for cloud deployment (#50).