rcyeh / cfem2013

Cornell Financial Engineering Manhattan 2013 Project

Register for Amazon Web Services accounts and test access to data #1

Closed rcyeh closed 11 years ago

rcyeh commented 11 years ago

Please elect one person from the team to do this step. This person shall be responsible for documenting all the steps we need, so that others may follow the protocol without frustration.

The ideal person should have a computer with internet access, and a credit card to complete step 1. I need to hear from every one of you as to whether that is a barrier. Also, ideally, this person should be the one in your group who is interested in, but least comfortable with, using computer services.

After completing this step, we will have a recipe for accessing the data, and this first person will have gained about three hours of experience taking initial steps toward using Amazon Web Services for cloud computing.

  1. Please go to < http://aws.amazon.com/ > and register for an account, if you have not done so already. You will want to be able to use the EC2 service in region "us-east-1".
  2. Launch a micro instance of a free-tier-eligible operating system. I recommend the Amazon Linux AMI or Ubuntu. During the process, create and download an SSH key.
  3. If you do not yet have an SSH client and are running Windows, get one. I use PuTTY.
  4. In the AWS management console, find the hostname and use SSH to connect to it. My command-line was:

ssh -i .ssh/rcy_amzn_ec2_cfem.pem ec2-user@ec2-184-72-135-160.compute-1.amazonaws.com

  5. In the Amazon cloud shell, obtain a copy of s3cmd from < http://s3tools.org/download >.

curl http://iweb.dl.sourceforge.net/project/s3tools/s3cmd/1.5.0-alpha1/s3cmd-1.5.0-alpha1.tar.gz > s3cmd-1.5.0-alpha1.tar.gz

  6. Unpack:

tar xvzf s3cmd-1.5.0-alpha1.tar.gz

  7. Configure:

s3cmd-1.5.0-alpha1/s3cmd --configure

  8. Try to list a file:

s3cmd-1.5.0-alpha1/s3cmd ls s3://cfem2013/ticks.20130423.1.csv.gz

Check if you see the following:

2013-05-08 21:20 3316624312 s3://cfem2013/ticks.20130423.1.csv.gz

If not, I will probably need to adjust permissions. Copy the messages you see into this issue, and we'll try to figure them out.
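For sanity-checking, the `s3cmd ls` line above can also be parsed programmatically. This is a hypothetical helper, not part of s3cmd; the column layout (date, time, size, URI) is taken from the expected output above:

```python
# Hypothetical helper: parse one line of `s3cmd ls` output into its fields.
# Column layout (date, time, size, URI) follows the expected output above.

def parse_s3cmd_ls_line(line):
    date, time, size, uri = line.split()
    return {"date": date, "time": time, "size": int(size), "uri": uri}

line = "2013-05-08 21:20 3316624312 s3://cfem2013/ticks.20130423.1.csv.gz"
info = parse_s3cmd_ls_line(line)
print(info["size"])  # size in bytes; about 3.3 GB compressed
```

Comparing the parsed size against the size of a completed download is a quick way to confirm a transfer wasn't truncated.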

zc238 commented 11 years ago

Hi Richard,

Perhaps you need to adjust permissions; I got the following message:

ubuntu@ip-10-245-102-82:~/s3cmd-1.5.0-alpha1$ s3cmd ls s3://cfem2013/ticks.20130423.1.csv.gz
ERROR: Access to bucket 'cfem2013' was denied

Thanks, Zhenyu

rcyeh commented 11 years ago

Thanks! It turns out my last message was wrong. Instead of the e-mail address associated with your AWS account, I need the AWS account number. When you first log in to AWS at < https://portal.aws.amazon.com/gp/aws/manageYourAccount >, it shows up on the right-hand side:

[screenshot: AWS account number shown on the account management page]

zc238 commented 11 years ago

Thanks. My account number is 8023-9057-5354.

rcyeh commented 11 years ago

I have now set the bucket policy to allow your account to get and list the contents of this bucket. Please try the s3cmd ls or s3cmd get command again.

zc238 commented 11 years ago

Hi Richard,

Somehow it is still not allowing access:

ubuntu@ip-10-202-21-163:~/s3cmd-1.5.0-alpha1$ s3cmd get s3://cfem2013/ticks.20130423.1.csv.gz
s3://cfem2013/ticks.20130423.1.csv.gz -> ./ticks.20130423.1.csv.gz [1 of 1]
ERROR: S3 error: 403 (Forbidden):

ubuntu@ip-10-202-21-163:~/s3cmd-1.5.0-alpha1$ s3cmd ls s3://cfem2013/ticks.20130423.1.csv.gz
ERROR: Access to bucket 'cfem2013' was denied

Is it possible for us to have a look at the bucket policy file? Perhaps we can set up a new instance and use your bucket policy to replicate the problem on our side. Alternatively, is it possible for you to create some accounts on your EC2 instance for us, so we can perhaps circumvent this step?

Thanks so much, Zhenyu

rcyeh commented 11 years ago

Here's the bucket policy:

{
  "Version": "2008-10-17",
  "Id": "Policy1369080342384",
  "Statement": [
    {
      "Sid": "Stmt1369080340704",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::802390575354:root" },
      "Action": "s3:Get*",
      "Resource": "arn:aws:s3:::cfem2013/*"
    },
    {
      "Sid": "Stmt1369080476187",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::802390575354:root" },
      "Action": "s3:List*",
      "Resource": "arn:aws:s3:::cfem2013"
    }
  ]
}
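A policy like this can be checked offline for well-formedness before attaching it. A minimal sketch (the JSON below mirrors the policy above, with the `Id`/`Sid` fields dropped for brevity); the one structural rule worth checking is that object-level actions such as `s3:Get*` target the `/*` resource form, while bucket-level actions such as `s3:List*` target the bucket ARN itself:

```python
import json

# The bucket policy from above, embedded for an offline well-formedness check.
policy = json.loads("""
{
  "Version": "2008-10-17",
  "Statement": [
    {"Effect": "Allow",
     "Principal": {"AWS": "arn:aws:iam::802390575354:root"},
     "Action": "s3:Get*",
     "Resource": "arn:aws:s3:::cfem2013/*"},
    {"Effect": "Allow",
     "Principal": {"AWS": "arn:aws:iam::802390575354:root"},
     "Action": "s3:List*",
     "Resource": "arn:aws:s3:::cfem2013"}
  ]
}
""")

# Object-level actions (s3:Get*) need the /* resource; bucket-level
# actions (s3:List*) apply to the bucket ARN itself.
for stmt in policy["Statement"]:
    if stmt["Action"].startswith("s3:Get"):
        assert stmt["Resource"].endswith("/*")
    print(stmt["Action"], "->", stmt["Resource"])
```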

zc238 commented 11 years ago

From the look of it, I'm guessing you are granting S3 access to the root account, but Ubuntu does not allow root logins by default for safety reasons (I have been using the account name ubuntu instead of root). May we quickly test one thing? Just change root to ubuntu, and it should work.

After that, if you can allow access for all user accounts under my Amazon account 802390575354, that may be even better, as I created different user accounts for the other team members so they don't need to log in using private keys.

Thanks so much for your help, Zhenyu

rcyeh commented 11 years ago

I think there is a different problem: I enabled a feature called "Requester Pays" on this bucket. (The "root" in the bucket policy applies no matter what machine image you use, or even if you aren't using EC2 at all.) Try the following command:

s3cmd --add-header="x-amz-request-payer:requester" ls s3://cfem2013/ticks.20130423.1.csv.gz

(The total number of s3 requests you're going to make is much lower than your free tier threshold.)
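For anyone scripting these downloads later, the same invocation can be assembled programmatically. This is purely illustrative (the helper function name is made up; the header and URI are from this thread):

```python
# Illustrative sketch: assemble the s3cmd argv that bills S3 request
# costs to the requester (the "Requester Pays" bucket feature).

def s3cmd_requester_pays(command, uri):
    return [
        "s3cmd",
        "--add-header=x-amz-request-payer:requester",
        command,
        uri,
    ]

argv = s3cmd_requester_pays("ls", "s3://cfem2013/ticks.20130423.1.csv.gz")
print(" ".join(argv))
```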

zc238 commented 11 years ago

Argh, I see.

I tried the command:

ubuntu@ip-10-202-21-163:~$ s3cmd --add-header="x-amz-request-payer:requester" ls s3://cfem2013/ticks.20130423.1.csv.gz
ERROR: Access to bucket 'cfem2013' was denied

Not sure why. If this does not work, we can replicate the problem on our side tomorrow and figure it out.

rcyeh commented 11 years ago

I turned off "Requester Pays". See if that helps?

zc238 commented 11 years ago

That worked, thanks.

rcyeh commented 11 years ago

If your personal computer is faster than the Amazon micro instance (and there's a good chance it is), feel free to download a tick file to your personal computer and analyze it there. The point of the initial testing-things-out stage is rapid iteration, which won't work if the micro instance is the bottleneck.
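Wherever you analyze it, the tick file never needs to be fully decompressed to disk; it can be streamed row by row straight from the gzip. A minimal sketch, using an in-memory sample (the column names and values here are made up for illustration; the real layout is whatever ticks.20130423.1.csv.gz contains):

```python
import csv
import gzip
import io

# Minimal sketch: stream a gzipped tick CSV row by row instead of
# decompressing the whole ~3.3 GB file first. The sample below stands
# in for the real file; its columns and values are made up.

sample = io.BytesIO()
with gzip.open(sample, "wt", newline="") as f:
    w = csv.writer(f)
    w.writerow(["symbol", "price", "size"])   # hypothetical header
    w.writerow(["AAPL", "405.46", "100"])
    w.writerow(["AAPL", "405.48", "200"])

sample.seek(0)
with gzip.open(sample, "rt", newline="") as f:
    rows = list(csv.reader(f))

print(len(rows) - 1)  # → 2 data rows after the header
```

For the real file, replace the in-memory sample with `gzip.open("ticks.20130423.1.csv.gz", "rt", newline="")` and iterate instead of calling `list`, so memory use stays flat regardless of file size.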