amplab / training

Training materials for Strata, AMP Camp, etc

AMIs for AWS Europe #146

Closed michaelpisula closed 10 years ago

michaelpisula commented 10 years ago

Hi, sorry if this is not the correct place for this question, but I found no way to contact you guys from the ampcamp page. I attended the BDAS session at Strataconf, was very intrigued by Spark, and found the ampcamp exercises to be great. Next week my company will have a retreat with several coding sessions, and I would like to go through the Spark exercises with my colleagues. While trying to set up a cluster, I realized that the AMI is only available in the US regions of EC2. Would it be possible to copy the AMIs to the European region? Using the US region is a valid fallback, but I would prefer the closer data centers to keep latency down.

Also, I am mainly interested in the machine learning exercises. Does it make sense to disable loading of the Wikipedia data into the cluster? Can I use fewer instances if I concentrate only on the ML exercise?

Cheers, Michael

shivaram commented 10 years ago

You can clone the AMP Camp AMI to the Europe region using the EC2 API (see http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/CopyingAMIs.html ). It might also make sense to copy the S3 data to Europe.
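As a rough illustration of the copy step, here is a minimal Python sketch. It assumes the `boto3` library, configured AWS credentials, and an account that is actually permitted to copy the source AMI. The region names, the image name, and the helper names `copy_ami_to_region` and `copy_to_europe` are hypothetical choices for this example, not part of the AMP Camp scripts.

```python
def copy_ami_to_region(dest_client, source_image_id, source_region, name):
    """Request a copy of an AMI into the region that dest_client is bound to.

    dest_client is expected to be an EC2 client for the DESTINATION region;
    the copy request is issued there and pulls from source_region.
    """
    response = dest_client.copy_image(
        SourceImageId=source_image_id,
        SourceRegion=source_region,
        Name=name,
    )
    # The response carries the ID of the new AMI in the destination region.
    return response["ImageId"]


def copy_to_europe(source_image_id):
    """Hypothetical convenience wrapper: copy from us-east-1 to eu-west-1.

    Both region names are assumptions for illustration only.
    """
    import boto3  # third-party dependency, imported lazily

    ec2_eu = boto3.client("ec2", region_name="eu-west-1")
    return copy_ami_to_region(
        ec2_eu, source_image_id, "us-east-1", "ampcamp-eu-copy"
    )
```

Calling `copy_to_europe` with the source AMI ID would return the ID of the new image, which only becomes usable once the copy has finished. The call fails if the account is not allowed to copy that AMI.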

Which exercises do you want to use? Do you mean the machine learning exercises (http://ampcamp.berkeley.edu/4/exercises/movie-recommendation-with-mllib.html)?

michaelpisula commented 10 years ago

Yes, I wanted to use the movie recommendation with MLlib exercise. The introduction to Spark exercise would probably also make sense.

As for copying the AMI, the first sentence on the AWS doc page you sent me says that one can only copy AMIs that one owns. Right now I cannot find the AMI, but that is probably a problem with my permissions, as some other things do not work either.

shivaram commented 10 years ago

Ah, I see. Do you happen to know the AMI ID used for AMP Camp 4? If we know that, I can help make the copy to Europe.

cc @pwendell

michaelpisula commented 10 years ago

I looked into the script that starts the cluster and found the following link: http://s3.amazonaws.com/ampcamp-amis/latest-ampcamp3

This is the AMI I got from following the link: ami-19474270

I tried using ampcamp4 instead of ampcamp3, but that did not work, so I suppose the AMI did not change between the two ampcamps.
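The lookup described above can be sketched in Python using only the standard library. This assumes the link returns a plain-text body containing just the AMI ID, which is what following it appeared to give. The helper names and the ID pattern check are additions for illustration and are not taken from the launch script.

```python
import re
from urllib.request import urlopen

# EC2 image IDs are "ami-" followed by 8 or 17 lowercase hex digits.
AMI_ID_PATTERN = re.compile(r"^ami-[0-9a-f]{8}(?:[0-9a-f]{9})?$")


def parse_ami_id(body):
    """Strip whitespace from a response body and check it looks like an AMI ID."""
    candidate = body.strip()
    if not AMI_ID_PATTERN.match(candidate):
        raise ValueError("not an AMI ID: %r" % candidate)
    return candidate


def fetch_latest_ami_id(url, timeout=10):
    """Download the pointer file at url and return the AMI ID it contains."""
    with urlopen(url, timeout=timeout) as response:
        return parse_ami_id(response.read().decode("utf-8"))
```

For example, `fetch_latest_ami_id("http://s3.amazonaws.com/ampcamp-amis/latest-ampcamp3")` would return the ID string if the file is still served. A missing file, such as the `ampcamp4` variant that did not work, would raise an error from `urlopen` instead.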

Thanks a lot for your help :+1:

shivaram commented 10 years ago

Thanks for looking up the AMI ID. Unfortunately, I just realized that I don't have access to the account that owns the AMI. I think the AMI is owned by @pwendell, so you'll need to check with him about copying it.

michaelpisula commented 10 years ago

Thanks, Shivaram. I sent Patrick an email.

I will close this issue. Thanks a lot for your help.