quinngroup / dr1dl-pyspark

Dictionary Learning in PySpark
Apache License 2.0

Spark development environment setup and install #41

Closed magsol closed 8 years ago

magsol commented 8 years ago

To set up your environment for Spark, here's what I recommend:

Set up an Ubuntu virtual machine (~20GB of hard disk space, ~4GB of memory, at least 2 CPU cores) with these settings. You'll likely also need to install git and Sublime Text again on the virtual machine so you can do your development there. However, this is probably still the easiest way to work with Spark.
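As a quick sanity check once the virtual machine is running, something like the following sketch could confirm that the guest actually sees roughly the recommended resources. It is only an illustration: the function names, the constant names, and the 15% slack (added because the figures above are approximate and a guest usually reports slightly less than what was allocated) are assumptions, and the memory lookup relies on `os.sysconf` names that are available on Linux.

```python
import os
import shutil

# Thresholds taken from the recommendation above; names are illustrative.
MIN_CPU_CORES = 2
MIN_MEMORY_GB = 4
MIN_DISK_GB = 20
# Assumed slack: the figures are approximate ("~"), and the guest OS
# typically reports a little less than the amount allocated to the VM.
SLACK = 0.85


def check_vm_resources(cores, memory_gb, disk_gb):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if cores < MIN_CPU_CORES:
        problems.append(f"CPU cores: {cores} (want at least {MIN_CPU_CORES})")
    if memory_gb < MIN_MEMORY_GB * SLACK:
        problems.append(f"memory: {memory_gb:.1f} GB (want about {MIN_MEMORY_GB} GB)")
    if disk_gb < MIN_DISK_GB * SLACK:
        problems.append(f"disk: {disk_gb:.1f} GB (want about {MIN_DISK_GB} GB)")
    return problems


def detect_resources(path="/"):
    """Read core count, physical memory, and total disk size for `path`."""
    cores = os.cpu_count() or 1
    # These sysconf names exist on Linux (e.g. the Ubuntu VM); this is an
    # assumption on other platforms.
    memory_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    disk_bytes = shutil.disk_usage(path).total
    gib = 1024 ** 3
    return cores, memory_bytes / gib, disk_bytes / gib


if __name__ == "__main__":
    issues = check_vm_resources(*detect_resources())
    if issues:
        print("VM is below the recommended settings:")
        for issue in issues:
            print("  -", issue)
    else:
        print("VM meets the recommended settings.")
```

Run it inside the virtual machine with `python3`; if it reports a shortfall, adjust the VM's allocation before installing Spark.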

MOJTABAFA commented 8 years ago

@magsol I have already installed the items above, but my system is now slow. I have therefore ordered an 8 GB RAM module and an SSD to deal with the speed problem. Please let me know what the next step is. Thanks.

magsol commented 8 years ago

To everyone in @quinngroup/bigneuron: if you want a way to get thunder up and running quickly, check out my latest commit to the quinn-branch: https://github.com/quinngroup/pyspark-dictlearning/tree/quinn-branch/thunder-install

I added a Dockerfile which builds a minimalist Ubuntu 14.04 environment using Anaconda and the right version of Spark to run thunder. Feel free to use that directly, or as a starting point for configuring your own environment.