Closed wido closed 5 years ago
Sounds like a config issue. @wido, can you ensure that BGP is not automatically enforced for cloud0? I think the KVM agent is not aware of what kind of network topology/model exists; it simply works on the idea of a network device.
@rhtyd It isn't. The 169.254.0.1 address is reserved for BGP unnumbered configurations.
As CloudStack uses this subnet as a hardcoded configuration it conflicts with it.
Aah okay it makes sense now @wido - what workaround do you propose? For example, one good solution could be to get rid of use of link-local based addresses/nics. VR/systemvm programming can be done via an IP on the mgmt/private network cidr.
Our VMware implementation does not use link-local either; all communication goes directly to the private IP in the private address range (typically RFC1918). This change is doable but may cause issues for users who don't have enough free IP addresses in the pod range of the private/mgmt network. What do you think?
cc @PaulAngus @andrijapanic @borisstoyanov @anuragaw @shwstppr @rafaelweingartner @DaanHoogland @fmaximus @ustcweizhou @GabrielBrascher @nvazquez @svenvogel @NuxRo and others - what do you think of getting rid of link-local IP based nics/programming/communication for VRs?
@rhtyd it seems to be a big change. Can we look for a workaround for @wido first?
Personally I don't care for the link local, however I imagine there are folks who do and have tooling relying on it etc. Like Wei, I'd say this is a big change so let's look at a workaround for Wido, a configurable variable in the properties could work.
@rhtyd I am +1 on implementing a workaround. I will be happy to discuss and help design other approaches, but I think that we should first invest some effort in the workaround, at least for now.
We have already extended the CloudStack KVM agent with a hotfix for a 4.12.0.0 environment, implemented by @wido. It looks good on the KVM agent side. A few parameters were added to agent.properties. If they are not configured, ACS keeps using the current default values.
network.linklocal.cidr=169.254.0.0/16
network.linklocal.address=169.254.0.1/16
network.linklocal.gateway=169.254.0.1
network.linklocal.netmask=255.255.0.0
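To illustrate the fallback behaviour described above, here is a minimal sketch of reading these four keys with defaults. The class `LinkLocalSettings` and its method names are hypothetical and are not taken from the CloudStack code base; the default values are assumed to be the ones quoted above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

/** Hypothetical helper for illustration only; not part of CloudStack. */
final class LinkLocalSettings {
    // Assumed defaults: the values quoted in the comment above.
    static final String DEFAULT_CIDR = "169.254.0.0/16";
    static final String DEFAULT_ADDRESS = "169.254.0.1/16";
    static final String DEFAULT_GATEWAY = "169.254.0.1";
    static final String DEFAULT_NETMASK = "255.255.0.0";

    final String cidr;
    final String address;
    final String gateway;
    final String netmask;

    private LinkLocalSettings(String cidr, String address, String gateway, String netmask) {
        this.cidr = cidr;
        this.address = address;
        this.gateway = gateway;
        this.netmask = netmask;
    }

    /** Reads each key and falls back to its default when the key is absent. */
    static LinkLocalSettings from(Properties props) {
        return new LinkLocalSettings(
                props.getProperty("network.linklocal.cidr", DEFAULT_CIDR),
                props.getProperty("network.linklocal.address", DEFAULT_ADDRESS),
                props.getProperty("network.linklocal.gateway", DEFAULT_GATEWAY),
                props.getProperty("network.linklocal.netmask", DEFAULT_NETMASK));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for agent.properties; only one key is overridden here.
        // The override value is an arbitrary example, not a recommendation.
        Properties props = new Properties();
        props.load(new StringReader("network.linklocal.gateway=169.254.240.1\n"));

        LinkLocalSettings settings = LinkLocalSettings.from(props);
        System.out.println("cidr=" + settings.cidr);
        System.out.println("gateway=" + settings.gateway);
    }
}
```

With this shape, an agent whose agent.properties does not mention the new keys behaves exactly as before.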
Unfortunately, we still need some work on the CloudStack management side; otherwise, it keeps managing system VMs with the hardcoded 169.254.0.0/16 network.
It is worth mentioning that we configured the global settings control.gateway and control.cidr before configuring the new zone; however, the Control network kept the "hardcoded" setup.
@wido @GabrielBrascher are you sending a PR soon? The 4.13 freeze is coming in a week's time. Given that you may have explored or applied a workaround in your env, and that this specific use case may not affect most users, can we revisit this in either 4.13.1.0 or 4.14? If there is no hurry, perhaps a longer-term proper fix that replaces the link-local nic with a private/mgmt nic may be explored. I'll remove the 4.13.0.0 milestone, but feel free to add it back if a PR can be sent.
We are working on a PR and hope to submit it today
@wido @GabrielBrascher If all KVM agents use the same setting, I would suggest using the same approach as described in 714221234d41920ccb131367cca000cd4da7b261, so that when we change the global setting the new value is propagated to all KVM agents when they connect. Just a suggestion.
network.linklocal.cidr=169.254.0.0/16
network.linklocal.address=169.254.0.1/16
network.linklocal.gateway=169.254.0.1
network.linklocal.netmask=255.255.0.0
You only need one of address and gateway, and one of cidr and netmask.
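A minimal sketch of why two of the four keys are redundant: the netmask follows from the CIDR prefix length, and, in the values quoted above, the gateway is the address without its prefix. The class and method names are hypothetical, and treating the gateway as identical to the bridge address is an assumption based only on those quoted values.

```java
/** Hypothetical helper for illustration only; not part of CloudStack. */
final class LinkLocalDerive {

    /** Converts the prefix length of an IPv4 CIDR into a dotted netmask. */
    static String netmaskFromCidr(String cidr) {
        int prefix = Integer.parseInt(cidr.substring(cidr.indexOf('/') + 1));
        if (prefix < 0 || prefix > 32) {
            throw new IllegalArgumentException("Invalid prefix length: " + prefix);
        }
        // A shift by 32 is a no-op in Java, so /0 is handled explicitly.
        int mask = prefix == 0 ? 0 : -1 << (32 - prefix);
        return String.format("%d.%d.%d.%d",
                (mask >>> 24) & 0xFF, (mask >>> 16) & 0xFF, (mask >>> 8) & 0xFF, mask & 0xFF);
    }

    /** Strips the "/prefix" suffix, if any, from an address such as 169.254.0.1/16. */
    static String gatewayFromAddress(String addressWithPrefix) {
        int slash = addressWithPrefix.indexOf('/');
        return slash < 0 ? addressWithPrefix : addressWithPrefix.substring(0, slash);
    }

    public static void main(String[] args) {
        System.out.println(netmaskFromCidr("169.254.0.0/16"));
        System.out.println(gatewayFromAddress("169.254.0.1/16"));
    }
}
```

Under that assumption, network.linklocal.cidr and network.linklocal.address alone would be enough to reconstruct the other two values.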
@ustcweizhou Thanks for the suggestion! I was looking for a way to send the global setting control.cidr to the KVM Agent.
I'll look into this.
@wido I have created a pull request for another change of ours: #3491
We are setting up a BGP+EVPN+VXLAN setup using Frr and BGP Unnumbered and this is causing some problems with the cloud0 bridge created by the cloudstack agent.
Although the global setting control.cidr can be modified, the KVM Agent will still create this bridge with a hardcoded subnet:
com/cloud/hypervisor/kvm/resource/BridgeVifDriver.java
When using BGP Unnumbered it will try to create a route pointing to 169.254.0.1
This works until the CloudStack Agent is started:
After the CloudStack Agent is started, 169.254.0.1/16 is added to cloud0, which prevents Frr from creating these routes:
The solution would be that through agent.properties this CIDR can be controlled and isn't hardcoded.
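The conflict can be expressed as a simple containment check: the problem arises when the subnet assigned to cloud0 contains the 169.254.0.1 address that BGP Unnumbered routes point to. The sketch below is illustrative only; the class name is hypothetical, and 169.254.240.0/20 is an arbitrary example of a non-overlapping subnet, not a value proposed in this issue.

```java
/** Hypothetical helper for illustration only; not part of CloudStack. */
final class CidrOverlapCheck {

    /** Packs a dotted IPv4 address into a 32-bit integer. */
    static int toInt(String ipv4) {
        String[] parts = ipv4.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + ipv4);
        }
        int value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("Invalid octet in: " + ipv4);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    /** Returns true when the IPv4 address lies inside the given CIDR. */
    static boolean contains(String cidr, String ipv4) {
        String[] split = cidr.split("/");
        int prefix = Integer.parseInt(split[1]);
        // A shift by 32 is a no-op in Java, so /0 is handled explicitly.
        int mask = prefix == 0 ? 0 : -1 << (32 - prefix);
        return (toInt(split[0]) & mask) == (toInt(ipv4) & mask);
    }

    public static void main(String[] args) {
        String bgpUnnumberedNextHop = "169.254.0.1";
        // Hardcoded cloud0 subnet from the issue.
        System.out.println(contains("169.254.0.0/16", bgpUnnumberedNextHop));
        // Arbitrary example of a subnet that avoids the address.
        System.out.println(contains("169.254.240.0/20", bgpUnnumberedNextHop));
    }
}
```

A configurable CIDR would let an operator pick a cloud0 subnet for which this check returns false.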