AWSVPC enhancements - Githubissues

bboreham commented 8 years ago

Following on from #2091, some ideas for enhancement:

try to donate bigger ranges on claim, up to half the space available (to reduce tiny-range fragmentation)
the "don't donate more than half" calculation seems to be slightly off - if you have allocated 1 address then you should still give away the bigger half, not drop down to a quarter.
multi-transport AWSVPC/fastdp/sleeve
copy the MTU from the AWS network interface on the host (or default it to 9001 which seems pretty standard for AWS)
support isolation between groups of containers (perhaps via subnets)
allow users to have more than 50 hosts (limit from AWS route table), maybe with some hybrid of overlay and VPC(s)
make weave launch async (currently it is blocking on expose)
Make update ops of the route table asynchronous (i.e., move to a separate goroutine).
Improve semantics for choosing a peer for donations.
Add tests for plugin
What happens if $RCIDR is not found
Enable --awsvpc in other tests
Test re-attach of containers

Smoke-test enhancements:

do not delete keys
create a separate VPC for each run
AWS=1 might be set for GCE machines, add the reset cmd

In progress:

docs

Done:

Test with >2 hosts
Get rid of ethtool tx off for AWSVPC
Rename useAWSVPC -> isAWSVPC

bboreham commented 8 years ago

~~Figure out some way to have the last peer remove the last route table entry on weave reset~~

(fixed by #2356)

brb commented 8 years ago

Just to add more context to the "do not delete keys" item: each time we run tests on AWS, we replace the global weavenet_ci key with the newly generated one (due to missing GC functionality in CircleCI). Thus, concurrent builds might fail in obscure ways.

brb commented 8 years ago

CNI / Plugin: Add the default subnet check and keep the TX offloading on.
When starting the AWSVPC tracker check that src/dst check is disabled on host VM.

bboreham commented 8 years ago

I had an issue where the route table had a left-over entry, and containers wouldn't communicate until I deleted it manually:

Destination	Target	Status	Propagated
172.31.0.0/16	local	Active	No
0.0.0.0/0	igw-0563c660	Active	No
10.32.0.0/13	eni-139cc059 / i-d9055d53	Active	No
10.40.0.0/13	eni-cc603c86 / i-26055dac	Active	No
10.40.0.0/14	eni-139cc059 / i-d9055d53	Active	No

$ weave report
...
        "Entries": [
            {
                "Token": "10.32.0.0",
                "Size": 524288,
                "Peer": "26:a1:90:ca:f0:13",
                "Nickname": "ip-172-31-23-45",
                "IsKnownPeer": true,
                "Version": 4
            },
            {
                "Token": "10.40.0.0",
                "Size": 524288,
                "Peer": "76:7a:68:8b:8d:b6",
                "Nickname": "ip-172-31-19-176",
                "IsKnownPeer": true,
                "Version": 3
            }
        ],

Maybe when a router takes over a range it should remove any route entries which cover a part of that range.

bboreham commented 8 years ago

@errordeveloper suggested creating an extra ENI may have advantages, e.g. we could disable the source/dest check just for that interface.

bboreham commented 8 years ago

Find some way to make clear to users when they have hit the 50-entry limit in the route table.

Currently it seems that errors will show up in the log file and some hosts will experience lack of inward routing.

brb commented 8 years ago

UPDATED: Disable the "src/dst" check on a VM. Default AWS policy is to deny packets which IP does not originate from a subnet (to prevent L3 spoofing), so in order to make the "awsvpc" work, on each VM we should disable the check manually. However, we could disable the check from the AWSVPC tracker in Weave which could result in less confused users.

rade commented 8 years ago

Disable the "src/dst" check on a VM.

What does that do?

bboreham commented 7 years ago

We can allow customers to run containers in multiple AZs by having one subnet for each AZ and attaching the same routing table to each subnet. Then have Weave manipulate that routing table.

panga commented 7 years ago

@bboreham I've the same issue of bad route tables when I add/remove new nodes with Auto Scaling Groups.

My auto-scaling groups has minimum 1 and maximum 3 instances. After some auto scaling I have 2 instances running, but only 1 valid route in the table and a lot of Black Holes.

Destination	Target	Status
10.32.0.0/13	eni-57f0ae51	Black Hole
10.40.0.0/13	eni-91db8297 / i-c4c017d2	Active
10.40.0.0/14	eni-54f0ae52	Black Hole
10.44.0.0/15	eni-54f0ae52	Black Hole
10.46.0.0/15	eni-6d673e6b	Black Hole

bboreham commented 7 years ago

Thanks for the comment @panga; this should probably go in its own issue - it looks like a bug rather than an enhancement.

panga commented 7 years ago

@bboreham Created a new issue :) https://github.com/weaveworks/weave/issues/2539

weaveworks / weave

AWSVPC enhancements #2322