EPiCs-group / obelix

An automated workflow for generation & analysis of bidentate ligand containing complexes
GNU General Public License v3.0

AWS S3-dynamoDB testing from DCC account #26

Open · Selkubi opened this issue 4 months ago

Selkubi commented 4 months ago

Here is how to create the necessary connections between S3, DynamoDB, and the Lambda function.

Step 1: Create an S3 Bucket

  1. Go to the S3 console:

    • In the AWS Management Console, search for and select "S3".
  2. Create a bucket:

    • Click "Create bucket".
    • Provide a unique bucket name (e.g., your-bucket-name).
    • Select the desired region (e.g., eu-central-1).
    • Click "Create bucket".
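The console clicks above can also be scripted with boto3. A minimal sketch, using the example bucket name and region from the step (the helper builds the request so it can be inspected without AWS access):

```python
def bucket_config(name, region):
    # Outside us-east-1, create_bucket needs an explicit LocationConstraint.
    return {
        "Bucket": name,
        "CreateBucketConfiguration": {"LocationConstraint": region},
    }

def create_bucket(name, region="eu-central-1"):
    # boto3 is imported here so the snippet can be read and tested
    # without AWS credentials or the SDK installed.
    import boto3
    boto3.client("s3", region_name=region).create_bucket(**bucket_config(name, region))
```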

Step 2: Create an SNS Topic

This is the notification service needed to receive the experiment_id emails.

  1. Go to the SNS console:

    • In the AWS Management Console, search for and select "SNS".
  2. Create a topic:

    • Click "Create topic".
    • Choose "Standard" for the topic type.
    • Provide a name (e.g., obelix_test).
    • Click "Create topic".
  3. Create a subscription:

    • Click on the topic ARN to open the topic details.
    • Click "Create subscription".
    • Choose "Email" for the protocol.
    • Enter the email address you want notifications sent to.
    • Click "Create subscription".
    • Check your email and confirm the subscription.
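The same topic and email subscription can be created programmatically; a boto3 sketch, assuming the example topic name from the step (the email address is a placeholder):

```python
def subscription_params(topic_arn, email):
    # Email subscriptions stay "pending" until the recipient confirms.
    return {"TopicArn": topic_arn, "Protocol": "email", "Endpoint": email}

def create_topic_with_email(name, email):
    import boto3  # deferred so the snippet loads without the SDK installed
    sns = boto3.client("sns", region_name="eu-central-1")
    topic_arn = sns.create_topic(Name=name)["TopicArn"]  # idempotent per name
    sns.subscribe(**subscription_params(topic_arn, email))
    return topic_arn
```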

Step 3: Create a DynamoDB Table

  1. Go to the DynamoDB console:

    • In the AWS Management Console, search for and select "DynamoDB".
  2. Create a table:

    • Click "Create table".
    • Provide a table name (e.g., obelix_test-empty).
    • Set the primary key (Exp_ID) as a string or number depending on your use case.
    • Click "Create table".
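A boto3 sketch of the same table creation, with the Exp_ID key type parameterised as in the step (the on-demand billing mode is an assumption chosen to avoid capacity planning during testing):

```python
def table_schema(table_name, key_type="S"):
    # key_type "S" = string, "N" = number, matching the Exp_ID choice above.
    return {
        "TableName": table_name,
        "KeySchema": [{"AttributeName": "Exp_ID", "KeyType": "HASH"}],
        "AttributeDefinitions": [{"AttributeName": "Exp_ID", "AttributeType": key_type}],
        "BillingMode": "PAY_PER_REQUEST",  # no capacity planning needed for testing
    }

def create_table(table_name="obelix_test-empty"):
    import boto3  # deferred so the snippet loads without the SDK installed
    boto3.client("dynamodb", region_name="eu-central-1").create_table(**table_schema(table_name))
```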

Step 4: Create an IAM Role for Lambda

This was necessary to work within the DCC sandbox. I am not sure how this will work in the Obelix AWS account.

  1. Go to the IAM console:

    • In the AWS Management Console, search for and select "IAM".
  2. Create a role:

    • Click "Roles" in the sidebar, then "Create role".
    • Choose "Lambda" as the trusted entity type.
    • Click "Next: Permissions".
  3. Attach policies:

    • Attach the following policies:
      • AmazonS3ReadOnlyAccess
      • AmazonDynamoDBFullAccess
      • AmazonSNSFullAccess
    • Click "Next: Tags", then "Next: Review".
  4. Name the role:

    • Provide a role name (e.g., lambda-s3-dynamodb-sns-role).
    • Click "Create role".
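For reference, the role from this step can also be created with boto3. The trust policy below is the standard one that lets the Lambda service assume the role; the three managed-policy ARNs match the policies attached above:

```python
import json

# The three managed policies attached in the console step above.
MANAGED_POLICIES = [
    "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess",
    "arn:aws:iam::aws:policy/AmazonDynamoDBFullAccess",
    "arn:aws:iam::aws:policy/AmazonSNSFullAccess",
]

def lambda_trust_policy():
    # Trust policy allowing the Lambda service to assume this role.
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    })

def create_lambda_role(role_name="lambda-s3-dynamodb-sns-role"):
    import boto3  # deferred so the snippet loads without the SDK installed
    iam = boto3.client("iam")
    iam.create_role(RoleName=role_name, AssumeRolePolicyDocument=lambda_trust_policy())
    for arn in MANAGED_POLICIES:
        iam.attach_role_policy(RoleName=role_name, PolicyArn=arn)
```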

Step 5: Create the Lambda Function

  1. Go to the Lambda console:

    • In the AWS Management Console, search for and select "Lambda".
  2. Create a function:

    • Click "Create function".
    • Choose "Author from scratch".
    • Provide a name (e.g., s3-to-dynamodb-sns).
    • Select "Python 3.8" or later as the runtime.
    • Under "Permissions", choose "Use an existing role".
    • Select the role you created earlier (lambda-s3-dynamodb-sns-role).
    • Click "Create function".
  3. Configure the function:

    • Copy and paste your Lambda function code into the editor.
    • Click "Deploy".
    • Once a stable version of the Lambda function exists, click "Actions" → "Publish new version" so that the current version of the function is versioned.
  4. Add environment variables:

    • In the "Configuration" tab, select "Environment variables".
    • Add the following environment variables:
      • SNS_TOPIC_ARN with the value of your SNS topic ARN.
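The issue does not include the function body itself, so the following is only a hedged sketch of what such a handler could look like: it reads the uploaded object's key, writes a row to the table, and publishes to the topic named in the SNS_TOPIC_ARN environment variable. The table name and the item fields (keying the row on the file name) are assumptions for illustration, not the project's actual code:

```python
import os

def s3_object_from_event(event):
    # Pull the bucket name and object key out of the S3 put-event record.
    record = event["Records"][0]["s3"]
    return record["bucket"]["name"], record["object"]["key"]

def lambda_handler(event, context):
    import boto3  # preinstalled in the AWS Lambda Python runtime
    bucket, key = s3_object_from_event(event)
    # Hypothetical processing: key the DynamoDB row on the uploaded file name.
    table = boto3.resource("dynamodb").Table("obelix_test-empty")
    table.put_item(Item={"Exp_ID": key, "source_bucket": bucket})
    # Notify subscribers via the topic configured in the environment variable.
    boto3.client("sns").publish(
        TopicArn=os.environ["SNS_TOPIC_ARN"],
        Subject="New experiment uploaded",
        Message=f"Processed s3://{bucket}/{key}",
    )
    return {"statusCode": 200}
```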

Step 6: Set up the S3 Trigger

  1. Add a trigger:
    • In the Lambda function configuration, click "Add trigger".
    • Choose "S3" as the trigger.
    • Select your S3 bucket.
    • For "Event type", choose "All object create events".
    • Click "Add".
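The equivalent trigger configuration in boto3 (the Lambda ARN below is a placeholder):

```python
def notification_config(lambda_arn):
    # Fire the function on every object-created event, as in the console step.
    return {
        "LambdaFunctionConfigurations": [{
            "LambdaFunctionArn": lambda_arn,
            "Events": ["s3:ObjectCreated:*"],
        }],
    }

def add_s3_trigger(bucket, lambda_arn):
    import boto3  # deferred so the snippet loads without the SDK installed
    boto3.client("s3").put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=notification_config(lambda_arn),
    )
```

Note that the console's "Add trigger" button also grants S3 permission to invoke the function; when configuring the notification through the API instead, that resource-based permission has to be added separately with `lambda add_permission`.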

Step 7: Configure the bucket policy

To give the Lambda function access to the S3 bucket you're using, make sure that the bucket has the right policy. In the policy below, the Principal/AWS field comes from the IAM role that you created; its ARN is shown on the role's page in the IAM console.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::058264498638:role/lambda-s3-dynamodb-sns-role-obelix"
      },
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::s3-test-bucket-obelix",
        "arn:aws:s3:::s3-test-bucket-obelix/*"
      ]
    }
  ]
}
```

Make sure the principal and the bucket names are correct for your setup.
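To avoid editing the ARNs by hand, the same policy can be built and applied with boto3 (a sketch, parameterised on the role ARN and bucket name):

```python
import json

def bucket_policy(role_arn, bucket):
    # Same shape as the policy in the step above, with the role ARN and
    # bucket name substituted in.
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": role_arn},
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
        }],
    }

def apply_bucket_policy(role_arn, bucket):
    import boto3  # deferred so the snippet loads without the SDK installed
    boto3.client("s3").put_bucket_policy(
        Bucket=bucket, Policy=json.dumps(bucket_policy(role_arn, bucket))
    )
```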

Step 8: Test the Function

  1. Upload a test file:

    • Upload a CSV file to your S3 bucket.
  2. Check the results:

    • Verify that the file is processed, the new column is added, and the data is written to DynamoDB.
    • Check your email for the SNS notification.
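The test upload can be scripted as well; this assumes nothing about the file beyond it being a small CSV (file name and contents are placeholders):

```python
def sample_csv():
    # A minimal CSV body to exercise the pipeline end to end.
    return b"Exp_ID,value\nexp-001,42\n"

def upload_test_file(bucket, key="test_experiment.csv"):
    import boto3  # deferred so the snippet loads without the SDK installed
    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=sample_csv())
```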
Selkubi commented 2 months ago

We have decided to go for a more comprehensive cloud architecture with two Lambda functions that put data into the DynamoDB table. The architecture looks like the image below, from Magno.

Image

Selkubi commented 2 months ago

There is a problem with the access permissions given to the services in the pipeline written above. I am trying to find out what the problem is and will update the workflow accordingly.