Add support for LAG in `vsphere_virtual_distributed_switch`

afallucc commented 3 years ago

Description

When configuring a DVS, we need to be able to include link aggregation group (LAG) settings.

Potential Terraform Configuration

References

https://docs.vmware.com/en/VMware-vSphere/6.0/com.vmware.vsphere.networking.doc/GUID-34A96848-5930-4417-9BEB-CEF487C6F8B6.html

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
Please do not leave "+1" or other comments that do not add relevant new information or questions; they generate extra noise for issue followers and do not help prioritize the request.
If you are interested in working on this issue or have submitted a pull request, please leave a comment.

benohara commented 2 years ago

Also raised with HashiCorp as 1201697339237161, including the ability to add hosts to a DVS and migrate their vmk0 to the LAG.

spacegospod commented 1 month ago

I did some research and it looks like we have the necessary APIs to add LAG support. There is a method called UpdateDVSLacpGroupConfig_Task in govmomi which can be used to reconfigure a distributed switch and add aggregation groups.

This is, unfortunately, where the good news ends. There is no way to implement LAGs in a way that makes practical sense without introducing breaking changes.

TL;DR: It is impossible to target individual uplink/LAG ports for physical adapter mapping, and this is a top priority when working with aggregation groups.

The long story

Let's start with the following assumption: an admin would use either LAGs or uplinks on their switch, and it makes little sense to mix both. This is where the problem begins. r/vsphere_distributed_virtual_switch is implemented in such a way that it does not allow mapping a physical adapter to a specific uplink port. It only supports automatic assignment, which means that each new physical adapter (per host) added to the switch is assigned to the next available uplink (the one with the fewest assignees). This rule applies to LAGs as well, and since a switch must have at least one uplink, an admin would have to assign at least one physical adapter to that uplink before reaching the aggregation groups.

Here is an example of how the resource would work if we were to avoid breaking the current schema. Let's imagine a host with 3 physical adapters (vmnic1, vmnic2 and vmnic3) and a DVS with 1 uplink, uplink1, and 1 LAG, lag1, with 2 ports. If we were to assign all 3 adapters to that DVS, the mapping would be as follows:

vmnic1 -> uplink1
vmnic2 -> lag1-0
vmnic3 -> lag1-1

Now it gets worse. If we were to remove vmnic1 from the list of adapters, the entire mapping would shift:

vmnic2 -> uplink1
vmnic3 -> lag1-0

This is unacceptable, since it would defeat the purpose of grouping multiple adapters in a LAG in the first place. Not to mention that it would potentially impact the bandwidth available to the LAG.
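The shift described above can be reproduced with a small model. This is only an illustrative sketch in Python, not provider code: the `auto_assign` function is hypothetical, and it assumes that ports are filled in the order uplink1, lag1-0, lag1-1 and that ties between equally loaded ports are broken by that order.

```python
def auto_assign(adapters, ports):
    """Model of automatic assignment: each adapter goes to the port with the
    fewest assignees; ties are broken by port order (an assumption)."""
    load = {port: 0 for port in ports}
    mapping = {}
    for nic in adapters:
        # min() returns the first port among those with the lowest load
        target = min(ports, key=lambda p: load[p])
        load[target] += 1
        mapping[nic] = target
    return mapping


# Port order is assumed: the mandatory uplink first, then the LAG ports.
PORTS = ["uplink1", "lag1-0", "lag1-1"]

before = auto_assign(["vmnic1", "vmnic2", "vmnic3"], PORTS)
after = auto_assign(["vmnic2", "vmnic3"], PORTS)  # vmnic1 removed

# Adapters whose port changed only because another adapter was removed.
moved = sorted(nic for nic in after if after[nic] != before[nic])

print(before)
print(after)
print(moved)
```

In this model, removing vmnic1 moves both remaining adapters: vmnic2 leaves the LAG entirely and vmnic3 changes LAG port.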

Ideas for a solution

One way to attack this problem would be to add optional support for uplink port mapping so that physical adapters can be assigned to individual uplink/aggregation ports.
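To illustrate why explicit mapping would avoid the problem, here is a minimal sketch in Python. It is not a proposal for the actual schema: the `explicit_assign` function and the shape of the mapping are hypothetical, chosen only to show that a pinned adapter keeps its port when another adapter is removed.

```python
def explicit_assign(requested, ports):
    """Hypothetical explicit mapping: the admin names the port for each
    adapter, so the result never depends on which other adapters exist."""
    unknown = sorted(set(requested.values()) - set(ports))
    if unknown:
        raise ValueError(f"unknown ports: {unknown}")
    return dict(requested)


PORTS = ["uplink1", "lag1-0", "lag1-1"]

pinned = explicit_assign(
    {"vmnic1": "uplink1", "vmnic2": "lag1-0", "vmnic3": "lag1-1"}, PORTS
)

# Remove vmnic1 from the configuration; the other entries are untouched.
remaining = explicit_assign(
    {nic: port for nic, port in pinned.items() if nic != "vmnic1"}, PORTS
)

print(remaining)
```

With this approach an admin could also leave the mandatory uplink without adapters and place every adapter directly on a LAG port, which automatic assignment does not allow.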

hashicorp / terraform-provider-vsphere