Closed philtay closed 1 year ago
Same questions as OP. It would be good to understand better the pros and cons of commiting cdk.context.json
. For example on our environment we have 2 organisation accounts staging
and production
. On our CI we use different IAM credentials to deploy to each account, in this case is there a benefit of commiting it?
As far as I've understand you'd only commit the file if you want to be able to run synth without having to login to AWS and possible speed things up a bit since is saves some lookups.
Another thing I'd like to understand is the difference between cdk.json
and cdk.context.json
. I noticed if I put things like "@aws-cdk/core:enableStackNameDuplicates": "true",
in cdk.json
once I run diff they are moved to cdk.context.json
, not sure why.
As far as I've understand you'd only commit the file if you want to be able to run synth without having to login to AWS and possible speed things up a bit since is saves some lookups.
Not really, depending on what's in there. If there are 3 AZs in your region and you deploy a CDK app, a VPC will be created with a specific IP layout.
If you don't commit context.json
and on the next deploy AWS happens to have added an AZ to your region, your VPC will now try to span 4 AZs, have a different IP space layout and all your subnets will have to be destroyed and recreated (this will probably fail so you won't actually lose any data, but your deployment will be stuck with no way to move forward).
Similarly, if you've started an instance off of an Amazon Linux AMI ID, and there now happens to be a new version of that AMI available... do you want your instance to be automatically replaced with a new version of the OS? What if you had state on that machine?
cdk.context.json
is not just an API call cache, it's a store of nondeterministic decisions taken in the past, which you must commit in order to ensure consistency in the future. It also gives you the opportunity to update the values piecemeal. ("YES I will take a new AMI ID now, NO I will not be spreading to new AZs").
As for account IDs, we never considered them very sensitive, similar to how usernames aren't usually considered sensitive. AWS itself publishes many of its own account IDs for users to put in their IAM policies, for example. The keys to the castle are in the access credentials to the account, not the account ID itself.
In fact, we generally expect people to put account IDs in their source as well (or build their application to read them from a configuration file), especially in the upcoming CI/CD implementation where we're going to force it.
It may be possible that our opinions differ from other departments whose specialization is security. The biggest risk I personally see is bucket sniping.
I see a couple of solutions to this, but they all come with their own downsides:
Account IDs in context file:
cdk.context.json
, so nothing to commit (yay). Would not get rid of account ids in the source files though.Account IDs in source:
$prod1
, $prod2
, $prod3
etc. The next complication then becomes: where do we store the mapping from { "$prod1": "123456789012" }
? It will then be the user's responsibility to make sure the mapping is transported and kept in sync between all developer's machines. Also, these values could not be stored in the context.json file directly, because there's no guarantee that $prod1
on one machine is the same account as $prod1
on another machine, so we'd need one of the other solutions in addition to this.I think the end result of this will be that we may implement one of the solutions for the context, and it will be users responsibility to hide account IDs from their sources if desired, to be implemented in a way that suits them.
thanks for taking the time @rix0rrr. That provided a lot of clarity on things I was wondering about.
I'm still not completely clear though, I understand the use case I just think I'm missing the the development workflow.
Since we run cdk diff
+ cdk deploy
as part of our CI/CD pipeline. When am I supposed to update/create cdk.context.json
, should I just run cdk synthesise
while I'm developing to make sure cdk.context.json
has the right data in and then commit?
When am I supposed to update/create
cdk.context.json
, should I just runcdk synthesise
while I'm developing to make surecdk.context.json
has the right data in and then commit?
Yes.
Honestly it's probably also a good idea to add a flag for the CLI to fail unexpected context lookups (--no-lookups
), which you can then use in CI/CD to make sure the output is deterministic (@shivlaks I just came up with another feature request :) )
@rix0rrr @shivlaks should we create a new issue for the CLI feature request and close this one out?
This one has a lot of good info in it. Definitely interested in this one.
@shivlaks ping
@eladb sorry, this one skipped my notifications inbox. That makes sense to me - saw that you already re-titled it. I dropped the guidance label as well. Let's use this issue to track the feature if there are no objections.
@eladb Unless I have misread the thread above entirely, I believe this issue can now be closed as fixed:
$ cdk --help
…
--lookups Perform context lookups (synthesis fails if this is
disabled and context lookups need to be performed)
[boolean] [default: true]
…
Closing this issue based off the previous comment.
Comments on closed issues are hard for our team to see. If you need more assistance, please either tag a team member or open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.
The docs state that
cdk.context.json
should be checked-in to ensure reproducibility. However, that file may contain information that's usually considered sensitive like AWS account numbers (e.g. aws-samples). Such data shouldn't be commited and utilities like git-secrets actively try to prevent that. Could you please share what's the best practice in these cases?