arkime / aws-aio

Apache License 2.0

Enable Cross-Account Capture #109

Closed chelma closed 1 year ago

chelma commented 1 year ago

Description

This task is to update the AWS-AIO repo so that it's possible to perform cross-account capture (the GWLBE is in a different account than the GWLB).

See here for an explanation of how this will work: https://github.com/arkime/aws-aio/issues/109#issuecomment-1682526890

Acceptance Criteria

chelma commented 1 year ago

After looking at the code/docs, here's a tentative plan-of-action for this:

chelma commented 1 year ago

Additional consideration - we have some commands (like clusters-list) which should continue to operate normally when called on the capture account. This means we'll need to update our data store to somehow link the source/capture accounts in the capture account data store. Practically, this probably means that when we perform the registration dance we'll register a specific source VPC/account with the capture account rather than a capture account as a whole. This has the side benefit of also scoping down the whitelists we're applying to our GWLB.

chelma commented 1 year ago

Thinking about this more, we have a couple different approaches we can take with the data store.

One is to allow source accounts to interact with a central data store in the capture account. This will (likely) require the fewest code changes, simplifies some architecture decisions, and is a well-supported AWS paradigm. The flip side is that we'll need to be able to reasonably scope down the cross-account permissions. Ideally, we'd scope the permissions to an explicit list of actors in the source account, but I think that will be quite burdensome in practice. As a result, we'll probably grant the root principal of the source account permissions to perform very limited actions on very limited resources.
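To make the "root principal, limited actions, limited resources" idea concrete, here is a rough sketch of what the two policy documents could look like. It assumes the data store is Parameter Store and that access is granted through a role in the capture account that the source account assumes; the function name, the choice of `ssm:GetParameter`, and the resource path are illustrative assumptions, not what the repo actually does.

```python
import json


def build_cross_account_policies(source_account_id, capture_account_id, region, cluster_name):
    """Build a (trust policy, permissions policy) pair as plain dicts.

    Hypothetical helper: the action list and parameter path below are
    assumptions chosen for illustration only.
    """
    # Trust policy: lets the source account's root principal assume the role.
    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{source_account_id}:root"},
                "Action": "sts:AssumeRole",
            }
        ],
    }

    # Permissions policy: read-only access to one cluster's parameters.
    permissions_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["ssm:GetParameter"],
                "Resource": (
                    f"arn:aws:ssm:{region}:{capture_account_id}:"
                    f"parameter/arkime/clusters/{cluster_name}*"
                ),
            }
        ],
    }
    return trust_policy, permissions_policy


if __name__ == "__main__":
    trust, perms = build_cross_account_policies(
        "111111111111", "222222222222", "us-east-1", "MyCluster"
    )
    print(json.dumps(trust, indent=2))
    print(json.dumps(perms, indent=2))
```

Granting the account root as the principal delegates the decision of who may assume the role to the source account's own IAM policies, which is the trade-off described above.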

The other approach is to bulkhead the two accounts. In the short term, this would require a lot more thought about precisely how to accomplish it effectively, but it would limit (or even eliminate) the need for cross-account data store calls. In the longer term, I believe this will have an ongoing impact on design avenues, which may be a good thing, and may not. It will likely create substantially more friction for users and be more complex to operate, but maybe we really do want to treat these accounts as separate entities.

Another consideration (somewhat) tied to this decision is how to handle multiple actors acting on the data store at the same time. We currently assume there's only one actor performing cluster-level operations at a given time, but that may not always be the case. The problem presents itself for both conjoined and bulkheaded accounts, but seems worth bringing up.

Overall, I think the right decision (for now) is to allow cross-account data store operations, as it feels like we can scope the permissions down pretty tightly in terms of operations/resources and seems to provide a better user experience. It's possible that some users need the bulkhead approach, but we don't have positive data points in that direction yet and can pivot later if needed.

chelma commented 1 year ago

It appears going the cross-account access route makes this fairly straightforward. Here's an updated plan of action, in the order a user would invoke them:

Overall, the cross-account user workflow would be:

  1. cluster-create using capture account credentials
  2. cluster-register-vpc using capture account credentials
  3. vpc-register-cluster using source account credentials
  4. vpc-add using source account credentials
  5. vpc-remove using source account credentials
  6. vpc-deregister-cluster using source account credentials
  7. cluster-deregister-vpc using capture account credentials
  8. cluster-destroy using capture account credentials
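As a rough illustration of the cross-account ordering above, the steps can be written down as data, with each command paired with the account whose credentials it expects. The command names come from the list; the helper names (`credentials_for`, `is_valid_order`) are hypothetical and only meant to show how a wrapper script might sanity-check a sequence of invocations.

```python
from typing import List, Tuple

# (command, account whose credentials are used), in invocation order.
CROSS_ACCOUNT_WORKFLOW: List[Tuple[str, str]] = [
    ("cluster-create", "capture"),
    ("cluster-register-vpc", "capture"),
    ("vpc-register-cluster", "source"),
    ("vpc-add", "source"),
    ("vpc-remove", "source"),
    ("vpc-deregister-cluster", "source"),
    ("cluster-deregister-vpc", "capture"),
    ("cluster-destroy", "capture"),
]

_ORDER = {name: index for index, (name, _) in enumerate(CROSS_ACCOUNT_WORKFLOW)}


def credentials_for(command: str) -> str:
    """Return 'capture' or 'source' for a known workflow command."""
    if command not in _ORDER:
        raise ValueError(f"unknown command: {command}")
    return CROSS_ACCOUNT_WORKFLOW[_ORDER[command]][1]


def is_valid_order(commands: List[str]) -> bool:
    """True if the commands appear in strictly increasing workflow order."""
    positions = []
    for command in commands:
        if command not in _ORDER:
            raise ValueError(f"unknown command: {command}")
        positions.append(_ORDER[command])
    return all(a < b for a, b in zip(positions, positions[1:]))
```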

The same-account workflow would remain unchanged:

  1. cluster-create using capture account credentials
  2. vpc-add using capture account credentials
  3. vpc-remove using capture account credentials
  4. cluster-destroy using capture account credentials

chelma commented 1 year ago

PR posted for the vast majority of the work. Remaining effort:

chelma commented 1 year ago

OK, so - clusters-list currently finds which VPCs are being monitored by looking for Parameter Store entries of the form /arkime/clusters/<cluster name>/vpcs/<vpc id>. That entry is created by the CloudFormation stack that spins up the VPC-specific resources, which means that in the cross-account scenario it isn't present in the Cluster account, because that stack is created in the VPC account. This means we can either create a duplicate entry in the Cluster account or do a cross-account call from the Cluster account. The latter option seems like the better approach to avoid additional bookkeeping, but we'll need to make a new cross-account role the Cluster account can use to get access to the VPC account (easy).
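For reference, a minimal sketch of what the cross-account lookup could look like with boto3, given the Parameter Store entries described above. The function names, the role ARN argument, and the session name are hypothetical; only the parameter path format is taken from the description, and the real implementation may differ.

```python
from typing import Iterable, List


def vpc_param_prefix(cluster_name: str) -> str:
    """Path under which the per-VPC entries for a cluster live."""
    return f"/arkime/clusters/{cluster_name}/vpcs/"


def vpc_ids_from_parameter_names(names: Iterable[str], cluster_name: str) -> List[str]:
    """Extract VPC IDs from parameter names of the expected form."""
    prefix = vpc_param_prefix(cluster_name)
    vpc_ids = []
    for name in names:
        if name.startswith(prefix):
            remainder = name[len(prefix):]
            # Keep only direct children, e.g. ".../vpcs/<vpc id>".
            if remainder and "/" not in remainder:
                vpc_ids.append(remainder)
    return sorted(vpc_ids)


def list_vpcs_cross_account(role_arn: str, cluster_name: str) -> List[str]:
    """Assume a role in the VPC account and list the cluster's monitored VPCs.

    Assumption: role_arn names a role in the VPC account that the Cluster
    account is allowed to assume and that can read these parameters.
    """
    import boto3  # imported lazily so the parsing helper works without boto3

    creds = boto3.client("sts").assume_role(
        RoleArn=role_arn, RoleSessionName="arkime-clusters-list"
    )["Credentials"]
    ssm = boto3.client(
        "ssm",
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )

    names = []
    paginator = ssm.get_paginator("get_parameters_by_path")
    path = vpc_param_prefix(cluster_name).rstrip("/")
    for page in paginator.paginate(Path=path, Recursive=False):
        names.extend(param["Name"] for param in page["Parameters"])
    return vpc_ids_from_parameter_names(names, cluster_name)
```

Keeping the name parsing separate from the AWS calls means the path handling can be unit-tested without credentials.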

Plan of attack:

chelma commented 1 year ago

PR posted to cover the remaining work; task should be complete after it is merged.

chelma commented 1 year ago

PRs merged; acceptance criteria met. Resolving.