lyft / cartography

Cartography is a Python tool that consolidates infrastructure assets and the relationships between them in an intuitive graph view powered by a Neo4j database.
https://lyft.github.io/cartography/
Apache License 2.0
2.96k stars 331 forks source link

Ec2RouteTable support #410

Open krisek opened 4 years ago

krisek commented 4 years ago

Description: At the moment Cartography doesn't give detailed information about traditionally routed environments (ie. environments not using Transit Gateway) in AWS. The aim is to develop the required code to import this information as well.

I identified the following relations so far

(Ec2RouteTable) <- [ASSOCIATED] - (EC2Subnet)

(Ec2RouteTable) - [ROUTES] -> (EC2Route)

(AWSVpc) - [RESOURCE] -> (Ec2RouteTable)

(AWSAccount) - [RESOURCE] -> (Ec2RouteTable)

(EC2Route) - [CIDR] -> (AWSCidrBlock)

(EC2Route) - [PREFIXLIST] -> (AWSPrefixList)

(AWSPrefixList) - [CIDR] -> (AWSCidrBlock)

(EC2Route) - [GATEWAY] -> (InternetGateway|VPCPeering (!)|NATGateway) (Relation doesn't exist for local prefixes.)

There is already one issue that needs to be clarified: VPC_PEERING is already existing as relation between AWSCidrBlocks, this has to be refactored.

Relevant Links: https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_DescribeRouteTables.html

achantavy commented 4 years ago

Appreciate you taking the time to put this together. I think follow-up PRs should include just one new relationship each to make them easier to review.

There is already one issue that needs to be clarified: VPC_PEERING is already existing as relation between AWSCidrBlocks, this has to be refactored.

The reasoning for this schema choice is that VPCs are peered through their CIDR blocks. The path looks something like

(i) VpcA-[:BLOCK_ASSOCIATION]-CidrBlockA-[:VPC_PEERING]-CidrBlockB-[:BLOCK_ASSOCIATION]-VpcB

In this case, an instance connected to BlockA can directly talk to BlockB, and we wanted to make that as clear as possible.

What is your proposed refactoring? I think you're proposing to add a new "shortcut" relationship equivalent to (i):

VpcA-[:NAME_TO_BE_DECIDED]-VpcB
krisek commented 4 years ago

Appreciate you taking the time to put this together. I think follow-up PRs should include just one new relationship each to make them easier to review.

No problem. Separating PRs is agreed.

There is already one issue that needs to be clarified: VPC_PEERING is already existing as relation between AWSCidrBlocks, this has to be refactored.

The reasoning for this schema choice is that VPCs are peered through their CIDR blocks. The path looks something like

(i) VpcA-[:BLOCK_ASSOCIATION]-CidrBlockA-[:VPC_PEERING]-CidrBlockB-[:BLOCK_ASSOCIATION]-VpcB

In this case, an instance connected to BlockA can directly talk to BlockB, and we wanted to make that as clear as possible.

What is your proposed refactoring? I think you're proposing to add a new "shortcut" relationship equivalent to (i):

My idea is to turn the relation/edge into a node, so that we can connect a VPC_PEERING with EC2_ROUTEs.

So the current scheme could turn into

VpcA-[:BLOCK_ASSOCIATION]-CidrBlockA-[:PEERING]-VPC_PEERING-[:PEERING]-CidrBlockB-[:BLOCK_ASSOCIATION]-VpcB

But the VPC_PEERING would appear in actual routes as well:

(EC2Subnet) - [ASSOCIATED] - (Ec2RouteTable) - [ROUTES] - (EC2Route) - [GATEWAY] - (VPC_PEERING)- [GATEWAY] - (EC2Route) - [ROUTES] - (Ec2RouteTable) - ASSOCIATED] - (EC2Subnet)

This way we could use the database to check routing between IP endpoints. (VPC peerings are tricky, in order to have connectivity, the peering needs to be there, routing needs to be there and the security groups as well.) On complex networks with multiple dozens of VPC peerings across multiple AWS accounts, it can be a tedious work to troubleshoot connectivity issues.

On the other side Neo4j is flexible enough to keep both ways of maintaining the relation between the CidrBlocks through a VPC peering. This can be a temporary as well, to get users used to the new path without breaking actual reports. (Adding a 'deprecated' property to the VPC_PEERING relation/edge, can be the preparation for the transition. -- but maybe who cares if both paths remain.

achantavy commented 4 years ago

My idea is to turn the relation/edge into a node, so that we can connect a VPC_PEERING with EC2_ROUTEs.

Ah, this makes sense.

This way we could use the database to check routing between IP endpoints.

I like this a lot. I'm supportive of the change.

@marco-lancini @SecPrez you have any reports that would break from this?

In general our schema change process has been informal and lightweight, and I think it's fine that way so far. I think it's sufficient to socialize the major schema-breaking stuff on Slack and if no one complains, ship it.

marco-lancini commented 4 years ago

I might have some queries (https://github.com/marco-lancini/cartography-queries/blob/master/queries/queries.json) impacted by this, but it won't be difficult to update them

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

krisek commented 3 years ago

FYI started to work on it in https://github.com/krisek/cartography/tree/%23410-Ec2RouteTable-support As we discussed, the push will happen in stages, first the re-implementation of peering connections, and then the routing tables and then eventually other things like nat gateway, etc.

krisek commented 3 years ago

@achantavy @marco-lancini can we agree that we keep the cleanup jobs for the 'old style' VPC_PEERING relations? This way these would be deleted in favour for the new node based representation when first running the new sync logic. Alternative: keep the old VPC_PEERING relations and let the users delete them manually (Once they realize that these relations doesn't get updated anymore.)

I'm in favour for the first option.

krisek commented 3 years ago

BTW sneak preview... this data model really follows now AWS

image

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

krisek commented 3 years ago

The respective pull request is pending for quite a while... can somebody please review?

achantavy commented 3 years ago

Reviewing now.

@sachafaust: you might find this interesting since you built the original VPC peering module. I'll cc you in the PR as well.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

aashishgaba-cn commented 3 years ago

Hi @krisek, any updates regarding the support for Route Table? Thanks

krisek commented 3 years ago

Hi @aashishgaba-cn, I had some other priorities in the last few weeks, so no much progress. Hopefully I can pick it up soon... first step is to get PeeringConnections rework merged. #487

aashishgaba-cn commented 3 years ago

Hi @krisek. sounds good. In case you haven't started working on the Route Table and if you don't mind, I can pick it up, and the support for the same. Later on, when you're done with PeeringConnections work, you may add it :)

A route can have these parameters

How about I add the relationships for the ones that the cartography currently supports(marked with [Y]) and later we can keep on adding for the missing ones?

What do you guys(you and @achantavy) think about this? Pardon me if I marked [Y] to anything that can't be added as of now, we can definitely discuss and select them.

krisek commented 3 years ago

Sure, please go ahead, much appreciated... I have two open pull requests, I didn't want to increase the entropy.

I assume the resulting code will be the same anyway nevetheless who writes it :)

krisek commented 3 years ago

Started working on this.