laurilehmijoki / s3_website

Manage an S3 website: sync, deliver via CloudFront, benefit from advanced S3 website features.

When attempting to push just changed files, get a failure with no detail #215

Closed petemounce closed 8 years ago

petemounce commented 8 years ago

I've started to use s3_website - thanks for writing it!

I have a problem, though. The initial push went fine. However, now that I've made an edit to a file I've already pushed and try to push again, I get a failure. This happens both with and without the --dry-run option.

Could the log output please be made more verbose, so that one can see the underlying API calls being made in situations like these?

I think my only workaround now is to push with --force every time; that makes my workflow much longer (and causes AWS to charge me more for the traffic).

I've attached the versions of the software that I'm using, the config file minus secrets, the log output, and the IAM permission set applied to the user.

software

ruby: 2.3.0
s3_website: 2.12.3

config file

.aws/s3_website.yml (with identifying bits and secrets changed):

s3_id: <my access key>
s3_secret: <my secret>
s3_bucket: domain.com 

# Below are examples of all the available configurations.
# See README for more detailed info on each of them.

# site: path-to-your-website
site: /home/foo/webfiles

index_document: index.html
error_document: error.html

max_age:
  "~foo/css/*": 604800
  "~foo/diary/*": 300
  "*": 900

gzip:
  - .html
  - .css
  - .js
  - .md

# gzip_zopfli: true

# See http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region for valid endpoints
# s3_endpoint: ap-northeast-1
s3_endpoint: eu-west-1

# exclude_from_upload:
#   - those_folders_of_stuff
#   - i_wouldnt_want_to_upload

# s3_reduced_redundancy: true

# cloudfront_distribution_id: your-dist-id

# cloudfront_distribution_config:
#   default_cache_behavior:
#     min_TTL: <%= 60 * 60 * 24 %>
#   aliases:
#     quantity: 1
#     items:
#       CNAME: your.website.com

# cloudfront_invalidate_root: true

concurrency_level: 3

# redirects:
#   index.php: /
#   about.php: about.html
#   music-files/promo.mp4: http://www.youtube.com/watch?v=dQw4w9WgXcQ

# routing_rules:
#   - condition:
#       key_prefix_equals: blog/some_path
#     redirect:
#       host_name: blog.example.com
#       replace_key_prefix_with: some_new_path/
#       http_redirect_code: 301

log output:

foo@bar ~ $ s3_website push --config-dir ~/.aws --verbose --site ~/webfiles/
[debg] Using /home/foo/.rbenv/versions/2.3.0/lib/ruby/gems/2.3.0/gems/s3_website-2.12.3/s3_website-2.12.3.jar
[info] Deploying /home/foo/webfiles/* to domain.com
[debg] Querying S3 files
[debg] Querying more S3 files (starting from ~foo/diary/cpix/2009q1/day838a_thumb.jpg)
[debg] Querying more S3 files (starting from ~foo/diary/images/2008q1/day476e.jpg)
[debg] Querying more S3 files (starting from ~foo/diary/images/2009q4/day1076c.jpg)
[debg] Querying more S3 files (starting from ~foo/diary/images/2011q2/day1660a_large.jpg)
[debg] Querying more S3 files (starting from ~foo/diary/images/2013q2/day2346b.gif)
[debg] Querying more S3 files (starting from ~foo/diary/images/2014q4/day2974bt2.jpg)
[debg] Querying more S3 files (starting from ~foo/diary/published/200612/R_day46.html)
[debg] Querying more S3 files (starting from ~foo/diary/published/200904/R_day897.html)
[debg] Querying more S3 files (starting from ~foo/diary/published/201108/R_day1747.html)
[debg] Querying more S3 files (starting from ~foo/diary/published/201401/R_day2637.html)
[debg] Querying more S3 files (starting from ~foo/help/index.html)
[info] Summary: 1 operation failed.
[fail] Failed to push the website to http://domain.com.s3-website-eu-west-1.amazonaws.com

IAM permission set that the user has:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "acm:DescribeCertificate",
        "acm:GetCertificate", 
        "acm:ListCertificates",
        "appstream:Get*",
        "autoscaling:Describe*",
        "cloudformation:DescribeStackEvents",
        "cloudformation:DescribeStackResource",
        "cloudformation:DescribeStackResources",
        "cloudformation:DescribeStacks",
        "cloudformation:GetTemplate",
        "cloudformation:List*",
        "cloudfront:Get*",
        "cloudfront:List*",
        "cloudsearch:Describe*",
        "cloudsearch:List*",
        "cloudtrail:DescribeTrails",
        "cloudtrail:GetTrailStatus",
        "cloudwatch:Describe*",
        "cloudwatch:Get*",
        "cloudwatch:List*",
        "codecommit:BatchGetRepositories",
        "codecommit:Get*",
        "codecommit:GitPull",
        "codecommit:List*", 
        "codedeploy:Batch*",
        "codedeploy:Get*",
        "codedeploy:List*",
        "config:Deliver*",
        "config:Describe*",
        "config:Get*",
        "datapipeline:DescribeObjects",
        "datapipeline:DescribePipelines",
        "datapipeline:EvaluateExpression",
        "datapipeline:GetPipelineDefinition",
        "datapipeline:ListPipelines",
        "datapipeline:QueryObjects",
        "datapipeline:ValidatePipelineDefinition",
        "directconnect:Describe*",
        "ds:Check*",
        "ds:Describe*",
        "ds:Get*",
        "ds:List*",
        "ds:Verify*",
        "dynamodb:BatchGetItem",
        "dynamodb:DescribeTable",
        "dynamodb:GetItem",
        "dynamodb:ListTables",
        "dynamodb:Query",
        "dynamodb:Scan",
        "ec2:Describe*",
        "ec2:GetConsoleOutput",
        "ecr:GetAuthorizationToken",
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:GetManifest",
        "ecr:DescribeRepositories",
        "ecr:ListImages",
        "ecr:BatchGetImage",
        "ecs:Describe*",
        "ecs:List*",
        "elasticache:Describe*",
        "elasticache:List*",
        "elasticbeanstalk:Check*",
        "elasticbeanstalk:Describe*",
        "elasticbeanstalk:List*",
        "elasticbeanstalk:RequestEnvironmentInfo",
        "elasticbeanstalk:RetrieveEnvironmentInfo",
        "elasticloadbalancing:Describe*",
        "elasticmapreduce:Describe*",
        "elasticmapreduce:List*",
        "elastictranscoder:List*",
        "elastictranscoder:Read*",
        "es:DescribeElasticsearchDomain",
        "es:DescribeElasticsearchDomains",
        "es:DescribeElasticsearchDomainConfig",
        "es:ListDomainNames",
        "es:ListTags",
        "es:ESHttpGet",
        "es:ESHttpHead",
        "events:DescribeRule",
        "events:ListRuleNamesByTarget",
        "events:ListRules",
        "events:ListTargetsByRule",
        "events:TestEventPattern",
        "firehose:Describe*",
        "firehose:List*",
        "glacier:ListVaults", 
        "glacier:DescribeVault", 
        "glacier:GetDataRetrievalPolicy",
        "glacier:GetVaultAccessPolicy",
        "glacier:GetVaultLock",
        "glacier:GetVaultNotifications",
        "glacier:ListJobs", 
        "glacier:ListMultipartUploads",
        "glacier:ListParts",
        "glacier:ListTagsForVault", 
        "glacier:DescribeJob", 
        "glacier:GetJobOutput", 
        "iam:GenerateCredentialReport",
        "iam:Get*",
        "iam:List*",
        "inspector:Describe*",
        "inspector:Get*",
        "inspector:List*",
        "inspector:LocalizeText",
        "inspector:PreviewAgentsForResourceGroup",
        "iot:Describe*",
        "iot:Get*",
        "iot:List*", 
        "kinesis:Describe*",
        "kinesis:Get*",
        "kinesis:List*",
        "kms:Describe*",
        "kms:Get*",
        "kms:List*",
        "lambda:List*",
        "lambda:Get*", 
        "logs:Describe*",
        "logs:Get*",
        "logs:TestMetricFilter",
        "machinelearning:Describe*",
        "machinelearning:Get*",
        "mobilehub:GetProject",
        "mobilehub:ListAvailableFeatures",
        "mobilehub:ListAvailableRegions",
        "mobilehub:ListProjects",
        "mobilehub:ValidateProject",
        "mobilehub:VerifyServiceRole",
        "opsworks:Describe*",
        "opsworks:Get*",
        "rds:Describe*",
        "rds:ListTagsForResource",
        "redshift:Describe*",
        "redshift:ViewQueriesInConsole",
        "route53:Get*",
        "route53:List*",
        "route53domains:CheckDomainAvailability",
        "route53domains:GetDomainDetail",
        "route53domains:GetOperationDetail",
        "route53domains:ListDomains",
        "route53domains:ListOperations",
        "route53domains:ListTagsForDomain",
        "s3:Get*",
        "s3:List*",
        "sdb:GetAttributes",
        "sdb:List*",
        "sdb:Select*",
        "ses:Get*",
        "ses:List*",
        "sns:Get*",
        "sns:List*",
        "sqs:GetQueueAttributes",
        "sqs:ListQueues",
        "sqs:ReceiveMessage",
        "storagegateway:Describe*",
        "storagegateway:List*",
        "swf:Count*",
        "swf:Describe*",
        "swf:Get*",
        "swf:List*",
        "tag:Get*",
        "trustedadvisor:Describe*",
        "waf:Get*",
        "waf:List*",
        "workspaces:Describe*"
      ],
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}

{
  "Statement":[{
    "Resource":["arn:aws:s3:::*"],
    "Action":["s3:GetBucketLocation","s3:ListAllMyBuckets"],
    "Effect":"Allow"
  },{
    "Resource":["arn:aws:s3:::domain.com"],
      "Action":["s3:ListBucket"],
      "Effect":"Allow"
  },{
    "Resource":["arn:aws:s3:::domain.com/*"],
    "Action": [
      "s3:DeleteObject",
      "s3:GetObject",
      "s3:PutObject"
    ],
    "Sid":"AllowS3WebsiteThings",
    "Effect":"Allow"
  }],
  "Version":"2012-10-17"
}
petemounce commented 8 years ago

I've just adjusted the IAM permissions to allow s3:* on the bucket (as below), but I get the same log output.

{"Statement":[{"Resource":["arn:aws:s3:::domain.com","arn:aws:s3:::domain.com/","arn:aws:s3:::domain.com/*"],"Action":["s3:*"],"Effect":"Allow"},{"Resource":["arn:aws:s3:::*"],"Action":["s3:GetBucketLocation","s3:ListAllMyBuckets"],"Effect":"Allow"},{"Resource":["arn:aws:s3:::domain.com"],"Action":["s3:ListBucket"],"Effect":"Allow"},{"Resource":["arn:aws:s3:::domain.com/*"],"Action":["s3:DeleteObject","s3:GetObject","s3:PutObject"],"Sid":"AllowS3WebsiteThings","Effect":"Allow"}],"Version":"2012-10-17"}
petemounce commented 8 years ago

Even though I'm not using CloudFront, should I grant it permissions?

laurilehmijoki commented 8 years ago

If you have not defined a CloudFront distribution id in your config file, you don't need to declare any CF permissions.

laurilehmijoki commented 8 years ago

The output log implies that you have thousands of files in your S3 bucket. I wonder if that has something to do with the error. s3_website should support an unlimited number of files. It should also echo AWS permission errors if it encounters any. I'm out of good guesses here.
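
For context, the repeated "Querying more S3 files (starting from ...)" lines come from S3's paginated object listing: each response carries at most 1000 keys, and the next page is fetched starting from the marker where the previous one ended, so a bucket with thousands of objects needs many round trips. A minimal illustrative sketch of that pattern using the AWS SDK for Java v1 (not s3_website's actual code; the names here are illustrative):

import com.amazonaws.services.s3.AmazonS3Client
import com.amazonaws.services.s3.model.{ListObjectsRequest, ObjectListing}
import scala.collection.JavaConverters._

object ListAllKeys {
  // Walk the bucket listing page by page until S3 reports the listing
  // is no longer truncated.
  def listAllKeys(s3: AmazonS3Client, bucket: String): Seq[String] = {
    def loop(listing: ObjectListing, acc: Vector[String]): Vector[String] = {
      val keys = acc ++ listing.getObjectSummaries.asScala.map(_.getKey)
      if (listing.isTruncated) {
        println(s"[debg] Querying more S3 files (starting from ${listing.getNextMarker})")
        loop(s3.listNextBatchOfObjects(listing), keys)
      } else keys
    }
    loop(s3.listObjects(new ListObjectsRequest().withBucketName(bucket)), Vector.empty)
  }
}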

laurilehmijoki commented 8 years ago

@petemounce I just added a debugging guide. It contains instructions on how to log the AWS operations.

petemounce commented 8 years ago

Thanks @laurilehmijoki - I'll try that out and get back to you. There are indeed thousands of files in the bucket: small static HTML files and images accumulated over around 9 years of daily blogging.

petemounce commented 8 years ago

I've tried that, and now I get output like

> bundle exec ruby ./bin/s3_website cfg apply --no-autocreate-cloudfront-dist --config-dir <correct config dir path>
Applying the configurations in s3_website.yml on the AWS services ...
Bucket <correct bucket> now functions as a website
AWS API call failed:
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>[some request id]</RequestId><HostId>[some host id]</HostId></Error> (GenericError)

I've tried a src/main/resources/log4j.properties with contents:

log4j.rootLogger=DEBUG, A1
log4j.appender.A1=org.apache.log4j.ConsoleAppender
log4j.appender.A1.layout=org.apache.log4j.PatternLayout
log4j.appender.A1.layout.ConversionPattern=%d [%t] %-5p %c -  %m%n
# Or you can explicitly enable WARN and ERROR messages for the AWS Java clients
log4j.logger.com.amazonaws=DEBUG
log4j.logger.com.amazonaws.request=DEBUG
log4j.logger.org.apache.http.wire=DEBUG

and nothing makes it output more.

I've done this both in a clone of this repo and within <ruby install's gems directory>/src/main/resources/log4j.properties; neither edit makes a difference.

I haven't tried adding a print in the Scala code - I wouldn't really know where to start. I had a look to see if I could find where the AWS clients are created, but so far I haven't succeeded.

petemounce commented 8 years ago

The IAM policy in effect (in addition to the out-of-the-box read-only policy):


    "AuthorPolicy": {
      "Type": "AWS::IAM::ManagedPolicy",
      "Properties": {
        "Description": "Website Authors",
        "Roles": [
          {
            "Ref": "AuthorsRole"
          }
        ],
        "Groups": [
          {"Ref":"AuthorsGroup"}
        ],
        "PolicyDocument": {
          "Version": "2012-10-17",
          "Statement": [
            {
              "Effect": "Allow",
              "Action": [
                "s3:GetBucketLocation",
                "s3:ListAllMyBuckets"
              ],
              "Resource": [
                {
                  "Fn::Join": [
                    "",
                    [
                      "arn:aws:s3:::*"
                    ]
                  ]
                }
              ]
            },
            {
              "Sid": "FindBuckets",
              "Effect": "Allow",
              "Action": [
                "s3:ListBucket"
              ],
              "Resource": [
                {
                  "Fn::Join": [
                    "",
                    ["arn:aws:s3:::","www.",{"Ref": "ApexDomain"}]
                  ]
                },
                {
                  "Fn::Join": [
                    "",
                    ["arn:aws:s3:::",{"Ref": "LogsBucketName"}]
                  ]
                }
              ]
            },
            {
              "Sid": "AllowS3WebsiteAdmin",
              "Effect": "Allow",
              "Action": [
                "s3:DeleteObject",
                "s3:GetObject",
                "s3:PutBucketWebsite",
                "s3:PutObject",
                "s3:PutObjectAcl"
              ],
              "Resource": [
                {
                  "Fn::Join": [
                    "",
                    ["arn:aws:s3:::","www.",{"Ref": "ApexDomain"}]
                  ]
                },
                {
                  "Fn::Join": [
                    "",
                    ["arn:aws:s3:::","www.",{"Ref": "ApexDomain"},"/*"]
                  ]
                }
              ]
            },
            {
              "Sid": "AllowS3WebsiteLogsRetrieval",
              "Effect": "Allow",
              "Action": [
                "s3:GetObject"
              ],
              "Resource": [
                {
                  "Fn::Join": [
                    "",
                    [
                      "arn:aws:s3:::",
                      {
                        "Ref": "LogsBucketName"
                      },
                      "/logs/static/",
                      "www.",
                      {
                        "Ref": "ApexDomain"
                      },
                      "/s3-access/*"
                    ]
                  ]
                }
              ]
            }
          ]
        }
      }
    }
laurilehmijoki commented 8 years ago

Here s3_website resolves the objects in your S3 bucket, and here is the line that prints [debg] Querying more S3 files to the console.

Try adding print(reports) to https://github.com/laurilehmijoki/s3_website/blob/100e2195f9deefbaf4a8382f63818bfc0f6dcfea/src/main/scala/s3/website/Push.scala#L105 – see anything suspicious?
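
As a minimal, self-contained sketch of the kind of debug print suggested above (the names and types here are hypothetical, not the actual code in Push.scala), something like this would make each per-file result visible instead of only the "1 operation failed" summary:

// Illustrative only: `PushReport`, `Succeeded` and `Failed` are made-up names.
object DebugReports {
  sealed trait PushReport
  case class Succeeded(s3Key: String) extends PushReport
  case class Failed(s3Key: String, reason: String) extends PushReport

  // Print every report so a single failing operation shows its key and reason.
  def dumpReports(reports: Seq[PushReport]): Unit =
    reports.foreach {
      case Succeeded(key)      => println(s"[debg] pushed $key")
      case Failed(key, reason) => println(s"[fail] $key failed: $reason")
    }
}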

Any chance you could share the website data with me? I could try to debug it locally on my machine.

laurilehmijoki commented 8 years ago

Closing as inactive. Please reopen if needed.