nlesc-sherlock / emma

Ansible playbook to create a cluster with GlusterFS, Docker, Spark and JupyterHub services
Apache License 2.0

Remote debugging #76

Closed romulogoncalves closed 6 years ago

romulogoncalves commented 6 years ago

This pull request documents how to configure Spark for remote debugging and how to collect information about the garbage collector, which is important when debugging Spark jobs.
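As a rough illustration of the kind of settings involved, the sketch below builds Spark properties that attach a JDWP debug agent to the driver and turn on GC logging for the driver and executors. It is not taken from the playbook: the function names, the port 5005 and the choice to debug only the driver are assumptions, and the GC flags assume a Java 8 JVM (newer JVMs use `-Xlog:gc*` instead).

```python
def debug_spark_conf(debug_port=5005, suspend=False):
    """Return Spark properties for remote debugging plus GC logging.

    Hypothetical helper: the port and the flag selection are assumptions.
    """
    # JDWP agent: the driver JVM listens for a remote debugger on debug_port.
    jdwp = (
        "-agentlib:jdwp=transport=dt_socket,server=y,"
        "suspend={},address={}".format("y" if suspend else "n", debug_port)
    )
    # Java 8 style GC logging flags.
    gc = "-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
    return {
        "spark.driver.extraJavaOptions": jdwp + " " + gc,
        # Executors only log GC here; several executors on one host
        # would otherwise compete for the same debug port.
        "spark.executor.extraJavaOptions": gc,
    }


def to_submit_args(conf):
    """Turn the properties into spark-submit '--conf key=value' arguments."""
    args = []
    for key in sorted(conf):
        args += ["--conf", "{}={}".format(key, conf[key])]
    return args
```

Passing the properties on the `spark-submit` command line (or in `spark-defaults.conf`) matters for the driver: in client mode the driver JVM has already started by the time application code could set `spark.driver.extraJavaOptions`.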

In addition, it explains how a user can install extra packages on the cluster using pip and apt-get. This is handy when working in Jupyter notebooks that have a dependency. It also covers the case of a Python library that cannot be installed using pip or apt-get but must be available on the cluster so that it can be imported into a notebook.
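For the notebook side, a minimal sketch of both cases could look like the following. The helper names and the per-user install are assumptions for illustration, not what the playbook documents, and this only affects the machine where the notebook kernel runs, not the Spark executors.

```python
import sys


def pip_install_command(package, user=True):
    """Build a pip command that targets the interpreter running the notebook."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if user:
        # Per-user install, so no root rights are needed (an assumption).
        cmd.append("--user")
    cmd.append(package)
    return cmd


def add_library_path(path):
    """Make a library that was copied to the cluster by hand importable."""
    if path not in sys.path:
        sys.path.insert(0, path)
    return path


# Usage inside a notebook cell (names and paths are placeholders):
#   import subprocess
#   subprocess.check_call(pip_install_command("some_package"))
#   add_library_path("/path/to/my_library")
#   import my_library
```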

All the rest are minor changes collected over the past 2 months. The several merges you see are there because I created the branch (by mistake) from the phenology branch instead of the master branch.

romulogoncalves commented 6 years ago

There is also an extra commit that controls the creation of swap space using a global variable. Since I had learned how to do it, I decided to add this extra commit.