rubyonjets / jets

Ruby on Jets
http://rubyonjets.com
MIT License
2.6k stars 181 forks source link

Custom domain unexpectedly gets destroyed when deploying #551

Closed HannesBenson closed 3 years ago

HannesBenson commented 3 years ago

Checklist

My Environment

Software Version
Operating System Ubuntu Linux
Jets 3.0.2/3.0.5
Ruby 2.7.0

Expected Behaviour

When I have the following set in my stage config:

  config.domain.hosted_zone_name = 'my.cool-services.bar'
  config.domain.name = 'staging.my.cool-services.bar'
  config.domain.cert_arn = "arn:aws:acm:us-east-1:xxxxxxxxxxxxx:certificate/yyyyyyyyy-zzzz-aaaa-bbbb-dddddddddd"
  config.domain.route53 = true

and I deploy my jets app to lambda using jets deploy staging, I expect the DNS record to always be present and pointing at the latest stack.

Current Behavior

When I use the described configuration and I deploy my jets app to lambda using jets deploy staging, the DNS record is removed if it exists and it is created when it was missing. This requires every deploy to be done twice to ensure the service is working as expected.

Step-by-step reproduction instructions

  1. Create a new jets app
  2. Set the appropriate config for the staging/production environment
  3. Deploy it, if there is no custom dns in AWS, the deploy will go out and work as expected. If the DNS record existed, this will be removed.

Code Sample

Looking at the code from 2.3.x here we can see that it added the custom domain if it was configured for jets. However, in 3.0.x here we see that it checks if the domain already exists. This then leads to CloudFormation templates differing between the 2 deploys. When the domain does exist:

---
Resources:
  RestApi:
    Type: AWS::ApiGateway::RestApi
    Properties:
      Name: jets-stag
      EndpointConfiguration:
        Types:
          - EDGE
      BinaryMediaTypes:
        - multipart/form-data
Outputs:
  RestApi:
    Value: !Ref RestApi
  Region:
    Value: !Ref AWS::Region
  RootResourceId:
    Value: !GetAtt RestApi.RootResourceId
  RestApiUrl:
    Value: !Sub "https://${RestApi}.execute-api.${AWS::Region}.amazonaws.com/stag/"
  DomainName:
    Value: staging.my.cool-service.bar

When the domain doesn't exist:

---
Resources:
  RestApi:
    Type: AWS::ApiGateway::RestApi
    Properties:
      Name: jets-stag
      EndpointConfiguration:
        Types:
          - EDGE
      BinaryMediaTypes:
        - multipart/form-data
  DomainName:
    Type: AWS::ApiGateway::DomainName
    Properties:
      DomainName: staging.my.cool-service.bar
      EndpointConfiguration:
        Types:
          - REGIONAL
      RegionalCertificateArn: certificate-arn-here
  DnsRecord:
    Type: AWS::Route53::RecordSet
    Properties:
      Comment: DNS record managed by Jets
      Name: staging.my.cool-service.bar
      HostedZoneName: my.cool-service.bar.
      Type: CNAME
      TTL: "60"
      ResourceRecords:
        - !GetAtt DomainName.RegionalDomainName
Outputs:
  RestApi:
    Value: !Ref RestApi
  Region:
    Value: !Ref AWS::Region
  RootResourceId:
    Value: !GetAtt RestApi.RootResourceId
  RestApiUrl:
    Value: !Sub "https://${RestApi}.execute-api.${AWS::Region}.amazonaws.com/stag/"
  DomainName:
    Value: !Ref DomainName
  DnsRecord:
    Value: !Ref DnsRecord

Solution Suggestion

Stop checking if the record exists in AWS already as it breaks the documented feature.

tsribeiro commented 3 years ago

Hi, how are you?

I also use this same setting, inside the architecture composed by N micro services and didn’t occur the same problem. My DNS register is created at the same hour that the Custom Domain is created. Once they have been already created, the DNS register is not added to the Cloud Front Templates, the one that originated the DNS Register.

Your DBS Register was created outside the Stack of the Cloud Front?

Regards

cattruscott commented 3 years ago

Just to clarify, what do you mean by "My DNS register"? The Hosted Zone or the Record within the Hosted Zone?

Both the Record in the Hosted Zone and the Custom Domain Name in the Api Gateway were created via cloudformation templates produced by Jets deploys.

What we are seeing post 3.x is that if they exist, the next deploy produces CloudFormation Templates do not include DomainName and DnsRecord. And if those resources don't exist then the deploy creates the Cloudformation Templates including the resources. If we deploy twice in a row we end up with a correctly built stack (first time removes the resources, second time adds them back) which is not ideal.

HannesBenson commented 3 years ago

I've now gone ahead and reverted the jets version to 2.3 to compare the templates that used to be generated vs what we're seeing now. In 2.3 we always get exactly what we get when the domain name doesn't exist (in v3).

When we use 3.x though the difference I see comes in with:

   DomainName:
-    Value: !Ref DomainName
-  DnsRecord:
-    Value: !Ref DnsRecord
+    Value: staging.my.cool-service.bar

and the complete removal of the DomainName & DnsRecord resources. My CFN knowledge is definitelly not up to scratch, but what I think this is supposed to mean is that the DomainName with the value of staging.my.cool-service.bar should be used for the api-gateway. If my assumption is correct, why is our DnsRecord being deleted? Should there be an entry for that similar to how the DomainName now has a value?

HannesBenson commented 3 years ago

So after hour and more hours of digging through everything we found the issue. We're running a very tight IAM policy for our services and in this case we were missing the cloudformation:DescribeStackResource & cloudformation:DescribeStackResources permissions. The sad part about this is that no errors were raised, it just magically deleted things that we needed.

I'm closing this issue and hoping no-one ever has to read it....

JackKelly-Bellroy commented 3 years ago

Remark: cloudformation:DescribeStackResource should be added to the IAM policy example at https://rubyonjets.com/docs/extras/minimal-deploy-iam/cli/ , and the rescue from every exception in places like the below function may be too broad:

https://github.com/boltops-tools/jets/blob/c0ac4723b04c54ffd14fab5757ed640f0295ae15/lib/jets/cfn/builders/api_gateway_builder.rb#L79-L87

Probably better to loudly fail instead of silently doing unexpected things.