linz / geospatial-data-lake-team

For Geospatial Data Lake team management

Implement copy data/metadata functionality #69

Closed JoCaudwell closed 3 years ago

JoCaudwell commented 3 years ago

User Story

So that the data is stored in the Data Lake, as a Data Maintainer, I want the Data Lake to copy data and metadata from its current S3 source to the Data Lake at the end of the import process.

Acceptance Criteria

Additional context

Potential tasks from planning discussion

Definition of Ready

Definition of Done

Subtasks

imincik commented 3 years ago

@MitchellPaff, see the S3 batch copy IAM role policy template and IAM trust policy example:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:PutObject",
                "s3:PutObjectAcl",
                "s3:PutObjectTagging",
                "s3:PutObjectLegalHold",
                "s3:PutObjectRetention",
                "s3:GetBucketObjectLockConfiguration"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:s3:::imincik-test2/*"
        },
        {
            "Action": [
                "s3:GetObject",
                "s3:GetObjectAcl",
                "s3:GetObjectTagging"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:s3:::{{SourceBucket}}/*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:GetObjectVersion",
                "s3:GetBucketLocation"
            ],
            "Resource": [
                "arn:aws:s3:::imincik-test2/manifest.csv"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetBucketLocation"
            ],
            "Resource": [
                "arn:aws:s3:::imincik-test2/aaa/*"
            ]
        }
    ]
}

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "batchoperations.s3.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
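
The role policy above still contains a `{{SourceBucket}}` placeholder, so it has to be filled in before it can be attached to a role. A minimal Python sketch of that step follows. It assumes the placeholder is substituted as plain text. The names `render_policy`, `READ_STATEMENT_TEMPLATE` and `example-source-bucket` are hypothetical and not part of the project. Only the read statement is reproduced, to keep the example short.

```python
import json

# Trust policy from the comment above: lets S3 Batch Operations assume the role.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "batchoperations.s3.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# Hypothetical excerpt of the role policy template (read statement only).
READ_STATEMENT_TEMPLATE = """
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": ["s3:GetObject", "s3:GetObjectAcl", "s3:GetObjectTagging"],
            "Effect": "Allow",
            "Resource": "arn:aws:s3:::{{SourceBucket}}/*"
        }
    ]
}
"""


def render_policy(template: str, source_bucket: str) -> dict:
    """Fill the {{SourceBucket}} placeholder and parse the result as JSON."""
    rendered = template.replace("{{SourceBucket}}", source_bucket)
    if "{{" in rendered:
        # Assumption: any placeholder left over means the template is incomplete.
        raise ValueError("template still contains unresolved placeholders")
    return json.loads(rendered)


if __name__ == "__main__":
    policy = render_policy(READ_STATEMENT_TEMPLATE, "example-source-bucket")
    print(json.dumps(policy, indent=4))
    print(json.dumps(TRUST_POLICY, indent=4))
```

If boto3 is used, the serialised documents could be passed to the IAM client's `create_role` (as `AssumeRolePolicyDocument`) and `put_role_policy` (as `PolicyDocument`). Whether the project creates the role this way is not stated in this thread.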

billgeo commented 3 years ago

Tested by importing multiple files; it works as expected. It would be good to make sure there is at least one asset listed, and I believe there is already a ticket open for that.