Closed peterroelants closed 3 years ago
I can do this, but first I have to work out how to make storage work with Docker. Without external storage, the Docker container runs out of disk space and dies while PySpark is running. It needs an external storage volume, and I don't yet know how to configure that. I will look into it.
In the meantime I recommend EC2. Sorry.
I am thinking of doing this now that machines with 32 GB of RAM are more common. I still need to solve the external storage volume setup problem.
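One possible way to approach the external storage problem, sketched here as an assumption rather than a tested fix for this repo: bind-mount a directory from a larger host disk into the container and point Spark's scratch space at it. The helper below only builds the `docker run` command line. The image name and the host path in the usage note are hypothetical placeholders, and the `/data` mount point and `spark-tmp` subdirectory are illustrative choices. `-v HOST:CONTAINER` is Docker's bind-mount flag, and `SPARK_LOCAL_DIRS` is the environment variable Spark reads for its local scratch directories.

```python
import shlex
from pathlib import Path


def docker_run_command(host_dir, image, container_dir="/data"):
    """Build a `docker run` argv that bind-mounts host_dir at container_dir.

    host_dir and image are supplied by the caller; nothing here is
    specific to this repo's image, which is an assumption of the sketch.
    """
    host_path = Path(host_dir).expanduser().resolve()
    return [
        "docker", "run", "--rm", "-it",
        # Bind mount: files written under container_dir land on the host disk.
        "-v", f"{host_path}:{container_dir}",
        # Ask Spark to put shuffle/spill scratch files on the mounted volume.
        "-e", f"SPARK_LOCAL_DIRS={container_dir}/spark-tmp",
        image,
    ]


def as_shell_line(argv):
    """Render the argv as a copy-pasteable shell command."""
    return shlex.join(argv)
```

For example, `as_shell_line(docker_run_command("/mnt/external", "example/pyspark-image"))` gives a command you can paste into a terminal. Whether this is enough depends on where the space is actually being used: if the image layers themselves fill the disk, Docker's own data directory would need to move instead.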
Would it be possible to provide a working Dockerfile? There seems to be one, but it looks outdated compared to the bootstrap.sh install. A Dockerfile would make the environment reproducible on EC2 as well as locally (no need for a Vagrant VM).
I tried installing the environment on EC2 with the provided scripts, but some things didn't install, and no logs were created.
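Since the scripts leave no logs behind, one workaround is to capture the output yourself when running them. This is a minimal sketch, not part of the repo: `run_with_log` is a hypothetical helper that runs any command, echoes its combined stdout and stderr, and saves a copy to a file so failed install steps can be found afterwards.

```python
import subprocess
import sys


def run_with_log(cmd, log_path):
    """Run cmd, echo its combined stdout/stderr, and save a copy to log_path.

    Returns the command's exit code.
    """
    with open(log_path, "w") as log:
        proc = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr so errors are logged too
            text=True,
        )
        for line in proc.stdout:
            sys.stdout.write(line)  # still show progress on the terminal
            log.write(line)
        return proc.wait()


# Example (assumes bootstrap.sh is in the current directory):
# exit_code = run_with_log(["bash", "bootstrap.sh"], "bootstrap.log")
```

A non-zero return value, together with the tail of the log file, should narrow down which step of the install failed.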