Closed Offlein closed 1 year ago
Hi @Offlein, just to confirm, you had success using the key security-group? That's surprising to me; I worry that yaml is swallowing your SG config instead of parsing it.
Can you help me understand by showing me the manifest snippet which resulted in the error? That would help a lot with reproduction and troubleshooting.
Oh, I think I see what's happening here. It looks like you're trying to pass a security group ID to network.vpc.security_group in the environment manifest.
The environment manifest's security_group field is only specifiable as a map. It lets you customize the security group rules for the copilot-managed security group that we create with every environment.
type: Environment
name: prod
network:
  vpc:
    security_group:
      ingress:
        - ip_protocol: tcp
          ports: 80
          cidr: 10.0.1.0/24
      egress:
        - ip_protocol: tcp
          ports: 80
          cidr: 0.0.0.0/0
If you want to attach additional security groups to your services or jobs, you can specify those as SG IDs in the network.vpc.security_groups field, like so:
type: Backend Service
name: be
network:
  vpc:
    security_groups: [sg-000001]
This is actually pretty confusing, and I apologize. We should probably do a better job of explaining this in our environment docs.
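To make the difference in shape concrete, here is a minimal Python sketch of a check on the environment manifest's security_group value. It is not Copilot's own validation: the function name check_env_security_group and the constant RULE_KEYS are hypothetical, and the assumption that every rule carries ip_protocol, ports, and cidr is taken only from the example above.

```python
# Keys seen in the example rules in this thread; treating all three as
# required is an assumption, not something Copilot's docs are quoted on here.
RULE_KEYS = {"ip_protocol", "ports", "cidr"}


def check_env_security_group(value):
    """Return a list of problems with an environment manifest's security_group value."""
    if not isinstance(value, dict):
        return [f"security_group must be a map, got {value!r}"]
    problems = []
    for direction in ("ingress", "egress"):
        for index, rule in enumerate(value.get(direction, [])):
            if not isinstance(rule, dict):
                # This is the case from the report: a bare SG ID where a rule map belongs.
                problems.append(f"{direction}[{index}] must be a rule map, got {rule!r}")
                continue
            missing = sorted(RULE_KEYS - set(rule))
            if missing:
                problems.append(f"{direction}[{index}] is missing {', '.join(missing)}")
    return problems


# A list of SG IDs under egress is flagged; a rule map passes.
print(check_env_security_group({"egress": ["sg-abcdabcdabcd"]}))
print(check_env_security_group(
    {"ingress": [{"ip_protocol": "tcp", "ports": 80, "cidr": "10.0.1.0/24"}]}
))
```

Bare security group IDs belong in the service manifest's network.vpc.security_groups list instead, which this sketch does not cover.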
@bvtujo Thanks for the thoughtful reply. I believe your understanding is correct. But I'm not clear why it ever worked in that case. (Except that maybe I went in via the Web UI and somehow overrode this?)
To be clear, there was a typo that it looks like you've fixed (#5275) about security-group instead of security_group.
And I believe, looking back, I can see how my initial report could be super confusing as well.
...In my environment manifest.yml, I was setting the network.vpc.security_group.egress value to an array of security group IDs:
network:
  vpc:
    id: vpc-xxxxx
    subnets:
      public:
        - id: subnet-aaaa
        - id: subnet-bbbb
      private:
        - id: subnet-cccc
        - id: subnet-dddd
    security-group:
      egress: [ sg-abcdabcdabcd ]
It's because the documentation claims it is an Array of Security Group Rules and it was not clear to me what a "Security Group Rule" was.
(I can infer it is something in the structure:
  - ip_protocol: tcp
    ports: 80
    cidr: 0.0.0.0/0
although I somehow didn't infer this at the time.)
I will set the VPC security group IDs per your recommendation and see how that goes.
As an aside, I feel like I am experiencing lots of issues during this process, and it's exacerbated by (1) the fact that each deploy takes ~4-5 minutes if I'm very lucky, but usually closer to 8-12 minutes, and (2) the fact that once it's up I have no means of really inspecting what's running. I almost feel like maybe it'd be worthwhile to build an SSH server into my container and see if that helps, at least while testing?
I see, that makes sense. Currently Copilot doesn't let you specify security group IDs in the vpc security group egress rules, but you could do so by using copilot env override and yamlpatch.
For example, with the following manifest:
type: Environment
name: prod
You could run copilot env override, go through the prompts, then use the following patch to set things up with the following ingress/egress rules:
- op: add
  path: /Resources/EnvironmentSecurityGroup/Properties/SecurityGroupIngress
  value:
    - SourceSecurityGroupId: sg-12345
      IpProtocol: tcp
      FromPort: 80
      ToPort: 80
      CidrIp: 10.0.1.0/24
This would result in the right security group ID being applied to your rules, but has the downside of requiring you to write raw CFN.
You could instead do
type: Environment
name: prod
network:
  vpc:
    security_group:
      ingress:
        - ip_protocol: tcp
          ports: 80
          cidr: 10.0.1.0/24
and the following, more specific patch:
- op: add
  path: /Resources/EnvironmentSecurityGroup/Properties/SecurityGroupIngress/0/SourceSecurityGroupId
  value: sg-12345
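To show what the more specific patch does to the generated template, here is a minimal Python sketch of a JSON-Patch-style add operation applied to a nested dictionary. It is not Copilot's yamlpatch implementation: apply_add is a hypothetical helper, the "-" append convention is borrowed from RFC 6902, and the template dictionary is an assumed shape for the rendered ingress rule, not real output from copilot env package.

```python
import copy


def apply_add(doc, path, value):
    """Apply an 'add' at a slash-separated path and return a patched copy of doc."""
    result = copy.deepcopy(doc)
    *parents, last = [part for part in path.split("/") if part]
    node = result
    for key in parents:
        # Numeric path segments index into lists; everything else is a map key.
        node = node[int(key)] if isinstance(node, list) else node[key]
    if isinstance(node, list):
        if last == "-":
            node.append(value)  # "-" appends to the end of the list (RFC 6902 convention)
        else:
            node.insert(int(last), value)
    else:
        node[last] = value
    return result


# Assumed shape of the rule rendered from the manifest's ingress entry.
template = {
    "Resources": {
        "EnvironmentSecurityGroup": {
            "Properties": {
                "SecurityGroupIngress": [
                    {"IpProtocol": "tcp", "FromPort": 80, "ToPort": 80, "CidrIp": "10.0.1.0/24"}
                ]
            }
        }
    }
}

patched = apply_add(
    template,
    "/Resources/EnvironmentSecurityGroup/Properties/SecurityGroupIngress/0/SourceSecurityGroupId",
    "sg-12345",
)
rule = patched["Resources"]["EnvironmentSecurityGroup"]["Properties"]["SecurityGroupIngress"][0]
print(rule)
```

The path segment 0 selects the first ingress rule, so the patch only adds one property to a rule that the manifest already generated, rather than restating the whole rule in raw CFN.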
To help ease your deployment woes, I have a couple of answers for you.
For inspecting what's running, there is copilot svc exec, which uses the SSM agent to set up a secure shell session inside your running containers. You can also check out the raw CFN which will be deployed by running copilot env package --output-dir env-cfn, where you can see all these properties and resource names.
Thank you so much Austin. I am very impressed by your responsiveness and receptivity to feedback, and it really means a lot when, eh, one might not have high expectations from a large organization like AWS.
So I was very excited about your copilot svc exec command. Is this new? In the past I had looked for this and found nothing except an AWS re:Post question stating there was no way to access a running App Runner instance. (And some Stack Overflow answers saying the same).
Anyway, I tried running it and got a notice saying it would install the Session Manager plugin if I wanted, and I said yes, but it failed because I don't use "yum". (I'm on EndeavourOS / Arch.) I grabbed it from the AUR though and that's all well and good. After running it, it asks which environment I want, then says:
✘ executing a command in a running container part of a service is not supported for services with type: 'Request-Driven Web Service'
So maybe the App Runner stuff above is not inaccurate? :)
Otherwise, there are some things based on what you said that are puzzling to me, but I want to be very cognizant in case this goes beyond the scope of this conversation, in which case please feel free to dismiss my questions. I do feel like they are likely very common points of confusion for other users, however, and as such it may be helpful to you if I voice them, so I will. :)
[1] We're running a [Laravel, PHP-based] backend application with this. It talks to an RDS database and [I'm currently adding] an Elasticache Redis instance. I initially was faffing around with the Security Groups so that I could ensure App Runner could communicate (outbound) to the RDS Database that is already opened to some EC2 instances.
-- I'll pause to say that I know my understanding of Security Groups is imperfect, but I also feel like I have at least a "working" mental metaphor for them. --
2 EC2 instances share the same Security Group -- let's call it "StageSG" -- and our Stage RDS instance has a Security group that allows access from StageSG on the DB port. It seems to work.
We have a different Production RDS instance and a different Production EC2 instance that has a different pair of connected security groups (say, "ProdSG").
I was trying to get our Stage AppRunner/copilot environment to run in the same EC2 VPC with that same "StageSG" Security Group so it automatically works without affecting the RDS Security Group. (And, of course, have the Prod AppRunner/copilot environment use the existing Production VPC and existing ProdSG Security Group.)
This feels like it would be a massively common use case, I assume? But maybe I'm wrong. This is why I erroneously believed I could specify the egress security groups per-environment manifest.yml.
I guess my confusion is how the expected use case could be that the VPC/Security Group is specified for the entire service. I would think that VPCs/Security Groups almost always are different per environment?
[2] I read through the "Backend Service" manifest docs per your earlier comment about the network.vpc.security_groups, and with attention to the environments key, it seemed like I COULD override that network.vpc.security_groups key per environment. I just tried this, and it did not seem to have any effect unfortunately. (I'm determining it didn't have an effect by viewing the "Networking" section of the App Runner UI. It has some security groups listed, but they aren't the ones I put into the Service's manifest.yml.)
This feels likely because I'm configured as a "Request-Driven Web Service"? The manifest documentation for that does not include the network.vpc.security_groups key at all. So of course it was even less likely to work.
[3] It's not entirely clear to me why we override things in the service manifest.yml's environments map versus, say, sticking them into environment files. (I'd previously only been using that map for variable overrides per environment.)
[4] I read through the differences between the different types of application service manifests when I first set this up a few months ago. For some reason I could not determine that one was obviously more correct for us than another. Our EC2 instances are on a private network, accessible only to Internet traffic through a Load Balancer or to developers through a bastion EC2 instance. So I might've thought I wanted a "Backend Service". But I definitely do want Internet users somehow getting to it, so I thought maybe "Request-Driven Web Service" or "Load Balanced Web Service". Our app will primarily experience traffic during business hours in US timezones, so "Request-Driven" seemed more appropriate. But I'm not sure if that was a big mistake and if I can even change it at this point.
Thanks for all you do.
@Offlein thanks so much for the kind words. Your use case makes a lot of sense and is quite common, and we definitely want to make sure it's easy to connect your services to existing security groups. I agree that you might actually want a LBWS (Load Balanced Web Service), but there are workarounds to make this easier on you so you don't have to migrate.
We actually support connecting App Runner to VPC resources. This feature shipped after App Runner launched and involves a resource called an AWS::AppRunner::VpcConnector. This resource allows App Runner to talk to services in a VPC, with or without specific security groups.
When you specify private placement for a RDWS:
network:
  vpc:
    placement: private
Copilot will create a VPC Connector for you, and allow it to talk to the EnvironmentSecurityGroup that Copilot creates and which all services in an env use to communicate.
To connect App Runner to another security group, after setting placement to private and deploying the service, you'll probably have to add some custom ingress and egress rules. You can model these in CFN with the AWS::EC2::SecurityGroupIngress and AWS::EC2::SecurityGroupEgress constructs. Copilot lets you deploy additional CFN resources via the addons functionality.
For example, to configure addons for your app runner service, you'd create the following files:
./copilot/yourservice/
└── addons/
    ├── template.yml
    └── addons.parameters.yml
# template.yml
Parameters:
  App:
    Type: String
  Env:
    Type: String
  Name:
    Type: String
  ServiceSecurityGroup:
    Type: String
  RDSSecurityGroup:
    Type: String
Resources:
  ServiceSecurityGroupEgressToRDSSecurityGroup:
    Type: AWS::EC2::SecurityGroupEgress
    Properties:
      GroupId: !Ref ServiceSecurityGroup
      IpProtocol: -1
      DestinationSecurityGroupId: !Ref RDSSecurityGroup
  RDSSecurityGroupIngressFromServiceSecurityGroup:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: !Ref RDSSecurityGroup
      SourceSecurityGroupId: !Ref ServiceSecurityGroup
      IpProtocol: -1
# addons.parameters.yml
Parameters:
  RDSSecurityGroup: ${REPLACE_ME_SG_ID}
  ServiceSecurityGroup: !Ref ServiceSecurityGroup # This references the security group from the parent workload template.
I hope this helps for your use case.
For the different values per environment problem, you can use Mappings in your addons template, like so:
# template.yml
Transform: 'AWS::LanguageExtensions'
Parameters:
  App:
    Type: String
  Env:
    Type: String
  Name:
    Type: String
  ServiceSecurityGroup:
    Type: String
Mappings:
  RDSSecurityGroupIdMap:
    test:
      "Id": sg-1234
    prod:
      "Id": sg-5678
Conditions:
  # DefaultValue is passed as the last argument to FindInMap (needs the LanguageExtensions transform).
  RecognizedEnvironment: !Not [ !Equals [ noEnvironment, !FindInMap [ RDSSecurityGroupIdMap, !Ref Env, Id, DefaultValue: noEnvironment ] ] ]
  # A resource's Condition attribute takes a condition name, so the negation gets its own name.
  UnrecognizedEnvironment: !Not [ !Condition RecognizedEnvironment ]
Resources:
  NewSecurityGroup:
    Condition: UnrecognizedEnvironment
    Type: AWS::EC2::SecurityGroup
    #...
  NewSGIngressFromServiceSG:
    Condition: UnrecognizedEnvironment
    Type: AWS::EC2::SecurityGroupIngress
  RDSSecurityGroupIngressFromServiceSG:
    Condition: RecognizedEnvironment
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: !FindInMap
        - RDSSecurityGroupIdMap
        - !Ref Env
        - Id
      SourceSecurityGroupId: !Ref ServiceSecurityGroup
      IpProtocol: -1
# addons.parameters.yml
Parameters:
  ServiceSecurityGroup: !Ref ServiceSecurityGroup
edited to include the new default value feature for FindInMap and conditional logic to create a new SG and ingress if the env isn't recognized
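For readers less familiar with CloudFormation mappings, the lookup-with-default logic can be mimicked in a few lines of Python. This is only an illustration of the intended behaviour, not how CloudFormation evaluates templates: find_in_map and recognized_environment are hypothetical names, and the SG IDs are the placeholder values from the template above.

```python
# Mirrors the Mappings section of the template: environment name -> RDS security group ID.
RDS_SECURITY_GROUP_ID_MAP = {
    "test": {"Id": "sg-1234"},
    "prod": {"Id": "sg-5678"},
}

NO_ENVIRONMENT = "noEnvironment"


def find_in_map(mapping, top_key, second_key, default):
    """Look up mapping[top_key][second_key], falling back to default if either key is absent."""
    return mapping.get(top_key, {}).get(second_key, default)


def recognized_environment(env):
    """True when the environment has an entry in the map, like the RecognizedEnvironment condition."""
    return find_in_map(RDS_SECURITY_GROUP_ID_MAP, env, "Id", NO_ENVIRONMENT) != NO_ENVIRONMENT


for env in ("test", "prod", "staging"):
    if recognized_environment(env):
        sg_id = find_in_map(RDS_SECURITY_GROUP_ID_MAP, env, "Id", NO_ENVIRONMENT)
        print(f"{env}: add ingress to existing SG {sg_id}")
    else:
        print(f"{env}: create a new SG and ingress rule")
```

An environment that is missing from the map falls through to the default, which is what routes it to the branch that creates a new security group.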
@bvtujo Just wanted to say thanks for all your help here. I had a bit of trouble understanding the whole interplay of Copilot, App Runner [sometimes] and ECS [sometimes], and CloudFormation, but after doing a lot of reading and fiddling, I think I'm in a better spot, thanks largely to your support.
This issue can be closed!
Since I upgraded to 1.30.0 today, my deploys started giving me this totally-easy-to-comprehend error:
I was able to figure out I should check my "prod" environment YAML file, which hasn't changed. The lines in question seem perfectly reasonable per the documentation at first glance, until you realize the docs flip back and forth between the expected parameters of:
network.vpc.security-group.xyz and network.vpc.security_group.xyz
I was using security_group with an underscore, but the correct answer was security-group with a dash.