futurewei-cloud / alcor-control-agent

Cloud native SDN platform - network control agent
MIT License

[Feature] Zeta integration implementation #149

Closed er1cthe0ne closed 3 years ago

er1cthe0ne commented 4 years ago

Context

This issue tracks the implementation work in ACA to support Zeta integration. It follows the design document #147.

Proposed Code Changes

(to be updated based on design document #147)

  a. done - Goal state message change from Alcor server to ACA
  b. done - Decide the new function names and create the dummy functions in ACA
  c. Send traffic to different group buckets
  d. OAM packet handling: decide which new openflow rule needs to be programmed
  e. Test code updates

Remaining Items

  1. Can we still use remote-IP-specific outports (the current implementation) at the source and destination nodes, or do we have to switch to a generic vxlan outport for both the current non-Zeta ports and the Zeta-enabled ports? We need to try it with manual rules to confirm.
  2. Think about the operations needed to add/delete the group rule, direct path rules, and outports, and about how to manage their lifetime.
  3. Come up with an efficient data structure to store the needed info. We don't want two data structures storing similar info, except for the cache of valid OAM UDP ports used for a quick check in ACA_OVS_Control::parse_packet. I have proposed changing the _vpcs_table hashtable to use tunnel ID as the key. Please take a look to see if that helps: https://github.com/futurewei-cloud/alcor-control-agent/pull/171
  4. Review the new testing instructions I put in: https://github.com/futurewei-cloud/alcor-control-agent/wiki/How-to-run-the-full-suite-of-aca_tests
  5. Confirm that all the existing tests still pass with any new changes, and share the test results.

Other Items

  1. Address any remaining feedback I left in previous PRs: #158 #172 - note that they have already been merged to the official master.
  2. Review the warning mentioned in #172 and see if it requires follow-up.

@zhangml

zhangml commented 4 years ago

Hello, I will follow up on this issue.

er1cthe0ne commented 4 years ago

@zhangml - cool, I have assigned this issue to you. Do let me know if you have any questions on this.

zhangml commented 4 years ago

Hi, Eric, I want to know more about the meaning of "Goal state message change from Alcor server to ACA".

er1cthe0ne commented 4 years ago

Hi @zhangml - the Goal state message is used for communication between Alcor Server and ACA. With Zeta support, we will likely need to change the message format (aka contract). This change is needed for Zeta, and I am thinking the Alcor team (possibly me) will take care of it.

zhangml commented 4 years ago

In #148 (Manual construction and testing of openflow rule for Zeta integration), installing the direct path requires the following flow rules:

root@computer7:/# ovs-ofctl add-flow br-tun table=0,in_port="patch-int",dl_dst=96:fc:fd:b2:cc:e9,priority=2,actions="resubmit(,2)"
root@computer7:/# ovs-ofctl add-flow br-tun table=24,priority=1,dl_vlan=100,actions="strip_vlan,load:0x1->NXM_NX_TUN_ID[],output:"vxlan238""

Deleting the direct path also requires deleting these flow entries. However, the OAM Flow Deletion packet does not contain the MAC address of the destination instance.

er1cthe0ne commented 4 years ago

@zhangml - a few comments:

  1. It looks like we need to adjust the manual construction of rules in #148 to fit the Zeta OAM design. @Zqy11
  2. Please refer to https://github.com/futurewei-cloud/alcor-control-agent/wiki/Openflow-Tables-Explain to see how ACA uses the tables.
  3. For the first rule above, I think we can just send everything to table 2: "priority=1,in_port="patch-int" actions=resubmit(,2)"
  4. For the second rule above, we need to decide whether it belongs in table 20 for unicast (to be added in #162) or in the existing table 22 for multicast.
  5. We should use the "Matcher" fields (Matcher_SIP, Matcher_DIP, Matcher_SPORT, Matcher_DPORT, Matcher_Protocol, Matcher_VNI) to add the unicast rule to table 20 and the multicast rule to table 22, and use the same set of "Matcher" fields to delete the rules. https://github.com/futurewei-cloud/zeta/blob/main/docs/design/zeta_system_design.md#735-in-band-operation

To sum up, we will need to adjust the manual openflow rules. Please work with @Zqy11 on this. Thanks.
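As a rough illustration of item 5, the sketch below builds one match string from the "Matcher" fields so that the same string can be reused for both adding and deleting a rule. The struct, the function name, and the idea of passing in the internal VLAN ID that corresponds to Matcher_VNI are assumptions made for this example, not existing ACA code.

```cpp
#include <cstdint>
#include <string>

// Hypothetical holder for the OAM "Matcher" fields (names are illustrative).
struct FlowMatcher {
  std::string sip;   // Matcher_SIP, dotted decimal
  std::string dip;   // Matcher_DIP, dotted decimal
  uint16_t sport;    // Matcher_SPORT
  uint16_t dport;    // Matcher_DPORT
  uint8_t protocol;  // Matcher_Protocol (IP protocol number)
  uint32_t vni;      // Matcher_VNI
};

// Build the match part of an ovs-ofctl flow. vlan_id is assumed to be the
// internal VLAN that this host maps to m.vni; table is 20 (unicast) or 22
// (multicast) as discussed above.
std::string build_match(const FlowMatcher &m, uint32_t table, uint32_t vlan_id) {
  std::string proto;
  bool has_ports = false;
  if (m.protocol == 6) {
    proto = "tcp";
    has_ports = true;
  } else if (m.protocol == 17) {
    proto = "udp";
    has_ports = true;
  } else {
    // Other protocols carry no L4 ports to match on.
    proto = "ip,nw_proto=" + std::to_string(m.protocol);
  }

  std::string match = "table=" + std::to_string(table) +
                      ",dl_vlan=" + std::to_string(vlan_id) + "," + proto +
                      ",nw_src=" + m.sip + ",nw_dst=" + m.dip;
  if (has_ports) {
    match += ",tp_src=" + std::to_string(m.sport) +
             ",tp_dst=" + std::to_string(m.dport);
  }
  return match;
}
```

The add path would append a priority and actions to this string, while the delete path could pass the same string to a del-flows style command, so the two stay in sync by construction.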

zhangml commented 4 years ago

In the OAM packet, what is the difference between Inner Packet DIP and Destination Inst IP?

Field | Offset (bytes) | Description
-- | -- | --
Matcher_SIP | 0 | Inner Packet SIP
Matcher_DIP | 4 | Inner Packet DIP
Matcher_SPORT | 8 | Inner Packet SPort
Matcher_DPORT | 10 | Inner Packet DPort
Matcher_Protocol | 12 | Inner Packet Protocol
Matcher_VNI | 13 | VxLAN/Geneve vni
DestInst_IP | 16 | Destination Inst IP
DestNode_IP | 20 | Destination Node IP
DestInst_MAC | 24 | Destination Inst MAC
DestNode_MAC | 30 | Destination Node MAC
Idle_Timeout | 36 | 0 - 65536s

er1cthe0ne commented 4 years ago

I think they are the same. @liangbin-pub can you confirm?

liangbin-pub commented 4 years ago

When the destination IP is a service virtual IP (vIP), the OAM packet returns the real IP (the IP of an instance behind the service vIP) to use for the final encapsulation.
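To make the field layout discussed here concrete, the following is a minimal parsing sketch based on the offsets in the table above. It assumes the fields are in network byte order, that Idle_Timeout is 2 bytes wide (giving 38 bytes in total), and that the buffer starts at Matcher_SIP; none of these are confirmed in this thread, and the type and function names are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical in-memory form of the OAM flow fields listed in the table.
struct OamFlowInfo {
  uint32_t matcher_sip;
  uint32_t matcher_dip;
  uint16_t matcher_sport;
  uint16_t matcher_dport;
  uint8_t matcher_protocol;
  uint32_t matcher_vni;  // 24-bit value
  uint32_t dest_inst_ip; // real instance IP, may differ from matcher_dip for a vIP
  uint32_t dest_node_ip;
  std::array<uint8_t, 6> dest_inst_mac;
  std::array<uint8_t, 6> dest_node_mac;
  uint16_t idle_timeout;
};

// Assumption: Idle_Timeout at offset 36 is 2 bytes wide.
constexpr std::size_t kOamFlowInfoSize = 38;

// Read an n-byte big-endian unsigned integer (n <= 4).
inline uint32_t read_be(const uint8_t *p, std::size_t n) {
  uint32_t v = 0;
  for (std::size_t i = 0; i < n; ++i) {
    v = (v << 8) | p[i];
  }
  return v;
}

// Returns false if the buffer is missing or too short; otherwise fills out.
bool parse_oam_flow_info(const uint8_t *buf, std::size_t len, OamFlowInfo &out) {
  if (buf == nullptr || len < kOamFlowInfoSize) {
    return false;
  }
  out.matcher_sip = read_be(buf + 0, 4);
  out.matcher_dip = read_be(buf + 4, 4);
  out.matcher_sport = static_cast<uint16_t>(read_be(buf + 8, 2));
  out.matcher_dport = static_cast<uint16_t>(read_be(buf + 10, 2));
  out.matcher_protocol = buf[12];
  out.matcher_vni = read_be(buf + 13, 3);
  out.dest_inst_ip = read_be(buf + 16, 4);
  out.dest_node_ip = read_be(buf + 20, 4);
  std::copy(buf + 24, buf + 30, out.dest_inst_mac.begin());
  std::copy(buf + 30, buf + 36, out.dest_node_mac.begin());
  out.idle_timeout = static_cast<uint16_t>(read_be(buf + 36, 2));
  return true;
}
```

Keeping matcher_dip and dest_inst_ip as separate fields reflects the point above: the rule matches on the inner packet's destination, while the rewrite uses the real instance IP.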

zhangml commented 4 years ago

Hello, the first packet uploaded to zeta also passes through the vxlan tunnel, which means that a vxlan tunnel port needs to be added to the Zeta gateway and computing node. Who should perform this operation? Do I need to consider the creation of a vxlan tunnel port when processing AuxGateway messages?

liangbin-pub commented 4 years ago

Not sure what you mean by port here, one host will talk to thousands of hosts, including gateway, all with vxlan encap. If we have to create thousands of tunnel ports in order to do vxlan encap, that would be a nightmare. At gateway, we certainly won't rely on having a "tunnel port" to do encap/decap. It's just one field in our flow tables.

Remember, the VXLAN ID identifies a "network". It is based on which "network" the sender belongs to and has nothing to do with the receiver. Hope this helps.

Thanks, Bin

zhangml commented 4 years ago

Okay, got it. In other words, the sender does not care whether the destination vtep interface has been created, nor how the peer end decapsulates. But I have a new problem: the original data packet sent to the zeta gateway needs the gateway's IP and MAC in the outer header, and this can be done through a vtep. If there is no vtep corresponding to the gateway IP on the computing node after the zeta state message is delivered, do we need to create a vtep interface on the computing node? If so, should it be done as part of the “zeta goal state message processing” step? I am currently completing the “aca_zeta_state_handler.cpp” function and encountered this problem.

zhangml commented 4 years ago

Hi, Eric. I have three programming questions.

  1. Should an operation type be added to auxiliarygateway.proto to indicate whether the group table will be updated or deleted?
  2. The format of a group table entry is slightly different from a normal flow table entry; for example, the openflow protocol version needs to be specified. Does ACA_OVS_Control::add_flow(const char *bridge, const char *opt) also support adding group tables?
  3. AuxGateway is included in VpcState. Should the gateway state processing be implemented in ACA_Dataplane_OVS::update_vpc_state_workitem(), or in a new file "aca_zeta_state_handler.cpp"?

er1cthe0ne commented 4 years ago

Okay, got it. In other words, the sending source does not care whether the destination vtep interface has been created, nor does it matter how the peer end decapsulates. But I have a new problem: the original data packet sent to the zeta gateway needs to encapsulate the gateway's ip and mac information in the outer layer, and this can be done through a vtep. But if there is no vtep corresponding to the gateway ip on the computing node after the zeta state message is delivered, do we need to create a vtep interface on the computing node?

I discussed this with Bin this morning. Today, ACA creates the vtep outport based on the remote IP, which is similar to openstack and other samples I found. The Zeta implementation doesn't need to work the same way. Please spend a day or two to see if there is a better way based on Bin's high-level idea. Work with @Zqy11 on this. If the two of you confirm there is no other way, then you can follow the current ACA approach.

If necessary, is it done by the way in this step of “zeta goal state message processing”?

Yes.

  1. Should an operation type be added to auxiliarygateway.proto to indicate whether the group table will be updated or deleted?

No, the AuxGateway message is part of VpcState message which already has operation_type.

  2. The format of a group table entry is slightly different from a normal flow table entry; for example, the openflow protocol version needs to be specified. Does ACA_OVS_Control::add_flow(const char *bridge, const char *opt) also support adding group tables?

@cj-chung - what do you think? Can @zhangml easily add that support? The group rule looks like: ovs-ofctl -O OpenFlow13 add-group br-tun group_id=100,type=select,bucket=output:vxlan231,bucket=output:vxlan232
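For reference, the group rule above could be assembled from a list of outports along these lines. This is only a sketch: build_group_spec is a hypothetical helper, and how the resulting string reaches OVS (a new ACA_OVS_Control method or a shell-out) is left open.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Build the argument that follows "add-group <bridge>" for a select group
// with one output bucket per outport.
std::string build_group_spec(uint32_t group_id,
                             const std::vector<std::string> &outports) {
  std::string spec = "group_id=" + std::to_string(group_id) + ",type=select";
  for (const auto &port : outports) {
    spec += ",bucket=output:" + port;
  }
  return spec;
}
```

Prefixing the result with "ovs-ofctl -O OpenFlow13 add-group br-tun " reproduces the command shown above for group 100 with outports vxlan231 and vxlan232.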

  3. AuxGateway is included in VpcState. Should the gateway state processing be implemented in ACA_Dataplane_OVS::update_vpc_state_workitem(), or in a new file "aca_zeta_state_handler.cpp"?

Let's do this for good code separation: ACA_Dataplane_OVS::update_vpc_state_workitem determines whether the VPC has a Zeta AuxGateway, then passes the AuxGateway message to aca_zeta_state_handler.cpp for processing.

Note that the contract change has been merged into Alcor master. You can simply update your git clone's Alcor submodule to see the latest contract with preliminary Zeta support. https://github.com/futurewei-cloud/alcor/pull/467

er1cthe0ne commented 4 years ago

For the test environment setup and test code changes, can we have @Zqy11 focus on helping with that? It is a critical item as we move into the next phase of the project.

er1cthe0ne commented 4 years ago

@zhangml - FYI, this is how I consume the new contract in my branch: https://github.com/er1cthe0ne/alcor-control-agent/tree/port_delete

The high-level steps are:

cd alcor
git checkout master
cd ..
cmake . (need to regenerate the make files because of the new auxiliarygateway.proto)
make

And you can pull my change into your branch if you like :)

zhangml commented 4 years ago

Hello,Eric. I have updated the PR for tomorrow's discussion, but there are still some parts of the code that are not completed.

zhangml commented 3 years ago

In #158, I added an aca_oam_port_manager file to track and manage the life cycle of the OAM port and to install the OAM punt rules.

zhangml commented 3 years ago

Hi Eric, @er1cthe0ne. Regarding GoalState &parsed_struct: does it hold information about all aspects of the network (such as vpcstate, portstate, subnetstate)? If so, is this information complete?

zhangml commented 3 years ago

I noticed that neighbor_id was added to the parameters of create_or_update_neighbor_port(). When installing the direct route, if a neighbor port needs to be created, how do I get the corresponding neighbor_id?

int ACA_OVS_L2_Programmer::create_or_update_neighbor_port(
        const string neighbor_id, const string vpc_id, alcor::schema::NetworkType network_type,
        const string remote_host_ip, uint tunnel_id, ulong &culminative_time)

er1cthe0ne commented 3 years ago

Hi @zhangml - the current ACA implementation of neighbor communication is restrictive in the sense that both the source and destination sides need to create a specific "outport" using the remote host IP in order to establish communication (for both unicast and multicast).

@liangbin-pub and I discussed this and we believe the security benefit of this approach is minimal: all it does is filter out bad traffic from a compromised host.

For Zeta, following the same approach would mean that ZGC has to send two OAM packets for programming, one to the source host and one to the destination host. Since the benefit is minimal, as mentioned above, we will remove this "outport per remote host IP" requirement and use a generic VXLAN outport on both the source and destination sides. This could be a single outport for all vxlan traffic or, better, one outport per VPC, which shares the same VNI and internal VLAN ID. Please work with @Zqy11 to manually try out how the rules would look with this approach.

Going back to your question on whether you need neighbor_id to track Zeta information: it really depends on what we need to track when a Zeta-enabled port is created/updated/deleted, and on what we need to track to handle OAM flow inject/delete packets.

Once we have tested the manual rules, including unicast/multicast and this generic vxlan outport, and your PR is fully merged with my port delete change, you can take a stab at the data structure needed. Maybe we can consider the below:

struct vpc_table_entry {
  uint vlan_id;

  // list of ovs_ports names on this host in the same VPC to share the same internal vlan_id
  list<string> ovs_ports;
  // Eric: We may need to extend this to mark if the aux_gateway_type for the port is none(default) or Zeta

  // hashtable of output (e.g. vxlan) tunnel ports to the neighbor host communication
  // to neighbor port ID mapping in this VPC
  // unordered_map <outports, list of neighbor port IDs>
  unordered_map<string, list<string> > outports_neighbors_table;
  // Eric: I don't think you need to worry about this

  // Eric: add flow rule tracking here, try to match on the Matcher_SIP+Matcher_DIP+Matcher_SPORT+Matcher_DPORT+Matcher_Protocol+Matcher_VNI(?)
  // or maybe we can track the flow rules in the port level to delete all flow rules when the port is deleted, but you will need to figure out which ovs_port is mapping to a flow rule, that's doable with change in this structure
};

er1cthe0ne commented 3 years ago

@zhangml - FYI, we just made another contract change addressing a list of planned items. Please review the corresponding ACA change in my branch, which I plan to merge into master soon: https://github.com/er1cthe0ne/alcor-control-agent/tree/schema/update

zhangml commented 3 years ago

Hi Eric, @er1cthe0ne. Is there a way to find vlan_id or vpc_id through tunnel_id? When installing the direct path, vlan_id needs to be converted to tunnel_id through flow table rules:

ovs-ofctl add-flow br-tun table=20,priority=50,dl_vlan=100,ip,nw_src=192.168.1.71,nw_dst=192.168.1.81,actions="strip_vlan,load:0x1->NXM_NX_TUN_ID[],output:"vxlan238""

er1cthe0ne commented 3 years ago

@zhangml - we can add a field to the structure below:

struct vpc_table_entry {
  uint vlan_id;
  uint tunnel_id; // NEW

  // list of ovs_ports names on this host in the same VPC to share the same internal vlan_id
  list<string> ovs_ports;

  // hashtable of output (e.g. vxlan) tunnel ports to the neighbor host communication
  // to neighbor port ID mapping in this VPC
  // unordered_map <outports, list of neighbor port IDs>
  unordered_map<string, list<string> > outports_neighbors_table;
};

and populate tunnel_id whenever a new vpc_table_entry is created.

After that, we can simply iterate through all the vpc_table_entry items, find the one that has the matching tunnel_id, and read the corresponding vlan_id from it.
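The lookup described above could look roughly like the sketch below. The entry is trimmed to the fields needed here, fixed-width integer types stand in for uint, and find_vlan_id_by_tunnel_id is a hypothetical helper name rather than an existing ACA function.

```cpp
#include <cstdint>
#include <list>
#include <string>
#include <unordered_map>

// Trimmed-down version of the entry above, for illustration only.
struct vpc_table_entry {
  uint32_t vlan_id;
  uint32_t tunnel_id;
  std::list<std::string> ovs_ports;
};

// Linear scan over a table keyed by vpc_id. Returns true and sets vlan_id
// when an entry with the given tunnel_id exists. Cost is O(number of VPCs).
bool find_vlan_id_by_tunnel_id(
    const std::unordered_map<std::string, vpc_table_entry> &vpcs_table,
    uint32_t tunnel_id, uint32_t &vlan_id) {
  for (const auto &kv : vpcs_table) {
    if (kv.second.tunnel_id == tunnel_id) {
      vlan_id = kv.second.vlan_id;
      return true;
    }
  }
  return false;
}
```

This is simple but walks every VPC on each call, which matters if it sits on the per-packet OAM path.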

zhangml commented 3 years ago

The OAM port can be tracked in a similar way. However, because the key of the vpc_table is vpc_id, a search may need to traverse every vpc_table_entry, so the result cannot be obtained in O(1) time through the hash.

er1cthe0ne commented 3 years ago

I agree. Since VNI is unique for the region, do you think we can simply use VNI as the key to unordered_map<string, vpc_table_entry> _vpcs_table ? I think that's possible.

zhangml commented 3 years ago

I personally think the vpc_table may need to be traversed twice while processing a single OAM packet. The first traversal is in aca_ovs_control::parse_packet(), where we need to check whether the UDP destination port is a valid OAM server port. Without a hash table keyed by OAM port, this requires traversing the whole vpc_table. The second traversal converts the tunnel ID to vlan_id, which again requires traversing the whole vpc_table. Two traversals for every OAM packet may be time-consuming, so I think it is better to use VNI as the key, which eliminates the second traversal.

er1cthe0ne commented 3 years ago

For aca_ovs_control::parse_packet(), we need to be careful not to add extra latency, because quite a lot of packet_in messages could be sent to aca_ovs_control::parse_packet(), including OAM, DHCP, ARP, and other L3 packets for on-demand routing. Because of that, we will need to introduce a cache of known OAM ports so that the valid OAM server port check is quick. My thinking is that once the port is confirmed to be a valid OAM port, the packet is sent to our OAM packet processing code, which re-checks that the VNI is valid and known before adding/deleting the direct path rule.

For the second pass, let's change the data structure to use VNI as the key to eliminate the traversal.
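A minimal sketch of this two-stage check follows, assuming a small mutex-protected set for the known OAM ports and a VPC table keyed by VNI. OamPortCache, vpc_entry, and resolve_oam_vlan are hypothetical names, and a real cache would also have to handle several VPCs sharing one OAM port (for example with a reference count).

```cpp
#include <cstdint>
#include <mutex>
#include <unordered_map>
#include <unordered_set>

// Stage 1: cheap membership test, meant for the packet_in hot path.
class OamPortCache {
 public:
  void add(uint16_t port) {
    std::lock_guard<std::mutex> lock(mutex_);
    ports_.insert(port);
  }
  void remove(uint16_t port) {
    std::lock_guard<std::mutex> lock(mutex_);
    ports_.erase(port);
  }
  bool contains(uint16_t port) const {
    std::lock_guard<std::mutex> lock(mutex_);
    return ports_.count(port) != 0;
  }

 private:
  mutable std::mutex mutex_;
  std::unordered_set<uint16_t> ports_;
};

// Trimmed-down VPC entry, keyed by VNI in the table below.
struct vpc_entry {
  uint32_t vlan_id;
  uint16_t oam_port;
};

// Stage 2: after the port check passes, confirm the VNI is known and that
// the packet arrived on that VPC's OAM port, then return its internal VLAN.
bool resolve_oam_vlan(const OamPortCache &cache,
                      const std::unordered_map<uint32_t, vpc_entry> &vpcs_by_vni,
                      uint16_t udp_dst_port, uint32_t vni, uint32_t &vlan_id) {
  if (!cache.contains(udp_dst_port)) {
    return false;  // not an OAM packet; leave it to DHCP/ARP/L3 handling
  }
  auto it = vpcs_by_vni.find(vni);
  if (it == vpcs_by_vni.end() || it->second.oam_port != udp_dst_port) {
    return false;  // unknown VNI or mismatched OAM port
  }
  vlan_id = it->second.vlan_id;
  return true;
}
```

With the table keyed by VNI, the second stage is a single hash lookup instead of a traversal, which is the point of the proposed change.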

zhangml commented 3 years ago

An outport created by Alcor has its own neighbor ID, while an outport created locally by ACA (for example, in response to an OAM packet) does not. So can we abandon the neighbor ID as well? That way, all neighbor ports can be stored in one table. The other option is to keep the current neighbor table structure but assign a meaningless neighbor ID to all locally created neighbor ports.

struct vpc_table_entry {
  uint vlan_id;
  // list of ovs_ports names on this host in the same VPC to share the same internal vlan_id
  list<string> ovs_ports;
  // hashtable of output (e.g. vxlan) tunnel ports to the neighbor host communication
  // to neighbor port ID mapping in this VPC

  // unordered_set <outports>
  // Store all neighbor ports
  unordered_set<string> outports_neighbors_table;

  uint32_t oam_port;
};

// unordered_map<tunnel id, vpc table entry>
// use tunnel id as key
unordered_map<uint32_t, vpc_table_entry> _vpcs_table;

er1cthe0ne commented 3 years ago

For the second pass, let's change the data structure to use VNI as the key to eliminate the traversal.

My suggestion above will require changes to existing code. But I think it is still an approach that we can take.

The outport created by alcor has its own neighbor id, while the outport created locally by aca such as oam packet does not have the neighbor id. So can we also abandon neighbor id? In this way, all neighbor ports can be stored in a table.

The neighbor ID is used to manage the lifetime of the outport: multiple neighbors can share one outport, and the outport is only deleted when all the associated neighbors are deleted. We can't simply abandon it without finding another way to track the lifetime.

The other way is to use the current neighbor table results, but assign a meaningless neighbor ID to all locally created neighbor ports.

Not sure if we can use a meaningless neighbor ID to track the life time of the outport.
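To illustrate why the neighbor ID matters for lifetime tracking, here is a small sketch of the bookkeeping described above: each outport keeps the set of neighbor IDs that use it and becomes eligible for deletion only when that set is empty. OutportTracker is a hypothetical class for this example, not the existing ACA structure.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>

class OutportTracker {
 public:
  // Returns true when this is the first neighbor on the outport, i.e. the
  // caller should create the outport in OVS.
  bool add_neighbor(const std::string &outport, const std::string &neighbor_id) {
    auto &neighbors = table_[outport];
    const bool was_empty = neighbors.empty();
    neighbors.insert(neighbor_id);
    return was_empty;
  }

  // Returns true when the last neighbor was removed, i.e. the caller should
  // delete the outport from OVS.
  bool remove_neighbor(const std::string &outport, const std::string &neighbor_id) {
    auto it = table_.find(outport);
    if (it == table_.end()) {
      return false;
    }
    it->second.erase(neighbor_id);
    if (it->second.empty()) {
      table_.erase(it);
      return true;
    }
    return false;
  }

 private:
  // outport name -> IDs of the neighbors currently sharing it
  std::unordered_map<std::string, std::unordered_set<std::string>> table_;
};
```

If every locally created port shared one placeholder neighbor ID, the set for an outport would collapse to a single element and the first removal would delete an outport that other flows may still need, which is the concern raised above.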

zhangml commented 3 years ago

Hi,Eric. I am performing functional testing of my code and encountered some problems. I installed unicast rules through the following code.

 string outport_name =
          aca_get_outport_name(alcor::schema::NetworkType::VXLAN, action.node_nw_dst);
  string cmd_action = "actions=strip_vlan,load:" + match.vni +
                      "->NXM_NX_TUN_ID[],mod_dl_dst=" + action.inst_dl_dst +
                      ",mod_nw_dst=" + action.inst_nw_dst + ",output:" + outport_name;

  // Adding unicast rules in table20
  string opt = "idle_timeout=" + action.idle_timeout + ",table=20,priority=50," +
               cmd_match + "," + cmd_action;
  overall_rc = ACA_OVS_Control::get_instance().add_flow("br-tun", opt.c_str());

ACA_OVS_Control::add_flow() reports an error for the output port: "output to unknown port". However, "ovs-vsctl show" shows that the port has been created, and the same flow rule installs successfully when run directly on the console. My outport_name is obtained through the "aca_get_outport_name()" method. Does ACA_OVS_Control::add_flow() expect some other ID representation for outport_name?

er1cthe0ne commented 3 years ago

The problem you describe is tracked as item 4 in #120. We haven't had a chance to investigate it yet. For the time being, please use the old way to add flows: execute_openflow_command

Please go ahead to push what you have into your branch. I may go in and fix some existing test failures this weekend.

zhangml commented 3 years ago

Hi.Eric. Is the id in the "message auxgateway" also a UUID, or is it a string corresponding to a specific number, such as "111"? Because each group entry in ovs needs to specify a group id, which is a number.

-O OpenFlow13 add-groups br-tun group_id=111

er1cthe0ne commented 3 years ago

The "id" field is the ZGC ID, which is a UUID made up of 32 hex digits:

"id": "f81d4fae-7dec-11d0-a765-00a0c91e6bf6",

It would be better to generate a unique number to use for the group entry and track it inside ACA. See the line below as an example, and be sure to think about reclaiming the number when the resource is deleted:

static atomic_uint current_available_vlan_id(1);

Another solution is to hash the ZGC ID to an int, but there is always a chance of a hash collision, even if the probability is small.
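One possible shape for the "generate, track, and reclaim" approach is sketched below. GroupIdAllocator is a hypothetical class for this example; it hands out small integers for ZGC IDs, returns the same number for a repeated ID, and reuses released numbers before growing the counter.

```cpp
#include <cstdint>
#include <mutex>
#include <string>
#include <unordered_map>
#include <vector>

class GroupIdAllocator {
 public:
  // Returns the group id already assigned to zgc_id, or assigns a new one.
  uint32_t acquire(const std::string &zgc_id) {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = ids_.find(zgc_id);
    if (it != ids_.end()) {
      return it->second;
    }
    uint32_t id;
    if (!free_ids_.empty()) {
      id = free_ids_.back();  // reclaim a previously released id
      free_ids_.pop_back();
    } else {
      id = next_id_++;
    }
    ids_.emplace(zgc_id, id);
    return id;
  }

  // Releases the id held by zgc_id; returns false if none was assigned.
  bool release(const std::string &zgc_id) {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = ids_.find(zgc_id);
    if (it == ids_.end()) {
      return false;
    }
    free_ids_.push_back(it->second);
    ids_.erase(it);
    return true;
  }

 private:
  std::mutex mutex_;
  uint32_t next_id_ = 1;
  std::vector<uint32_t> free_ids_;
  std::unordered_map<std::string, uint32_t> ids_;
};
```

Unlike hashing the UUID, this cannot collide, at the cost of keeping the UUID-to-number map inside ACA for as long as the group exists.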

er1cthe0ne commented 3 years ago

@zhangml Below is the output to confirm options:remote_ip=0.0.0.0 is accepted in OVS:

ovs-vsctl --may-exist add-port br-tun vxlan-generic2 -- set interface vxlan-generic2 type=vxlan options:df_default=true options:egress_pkt_mark=0 options:in_key=flow options:out_key=flow options:remote_ip=0.0.0.0

    Bridge br-tun
        Port "vxlan-generic2"
            Interface "vxlan-generic2"
                type: vxlan
                options: {df_default="true", egress_pkt_mark="0", in_key=flow, out_key=flow, remote_ip="0.0.0.0"}

options:remote_ip=flow works also:

ovs-vsctl --may-exist add-port br-tun vxlan-generic3 -- set interface vxlan-generic3 type=vxlan options:df_default=true options:egress_pkt_mark=0 options:in_key=flow options:out_key=flow options:remote_ip=flow

    Bridge br-tun
        Port "vxlan-generic3"
            Interface "vxlan-generic3"
                type: vxlan
                options: {df_default="true", egress_pkt_mark="0", in_key=flow, out_key=flow, remote_ip=flow}

As @liangbin-pub mentioned, let's go with the options:remote_ip=flow approach. The next step is to try out the openflow rules manually to confirm that we can use it to send packets out from the source node and receive them on the destination node.

For details, please see https://github.com/openvswitch/ovs/blob/17f22fe46142ef0402bff0e3eb9a4768d93b8008/vswitchd/vswitch.xml lines 2701-2774

zhangml commented 3 years ago

After a simple test, "remote_ip=flow" looks able to do what we need. I added the following flow rules on the two nodes:

computer1 172.16.62.237
ovs-ofctl add-flow br-tun table=22,priority=50,dl_vlan=1,actions="set_field:172.16.62.239->tun_dst,strip_vlan,load:0x1->NXM_NX_TUN_ID[],output:vxlan-generic"

or ovs-ofctl add-flow br-tun table=22,priority=50,dl_vlan=1,actions="strip_vlan,load:0x1->NXM_NX_TUN_ID[],output:vxlan-239"

computer2 172.16.62.239

ovs-ofctl add-flow br-tun table=0,priority=25,in_port="vxlan-generic",actions="resubmit(,4)"
ovs-ofctl add-flow br-tun table=4,priority=1,tun_id=0x1,action=mod_vlan_vid:1,output:"patch-int"

I created docker71 (192.168.1.71) on computer1 and docker91 (192.168.1.91) on computer2, both with vlan_id set to 1. docker91 can receive the traffic sent by docker71.

The main difference between "remote_ip=flow" and "remote_ip=neighbor_ip" is that the former requires the tun_dst field to be set by an OpenFlow action in the flow table, for example set_field:ip->tun_dst. This allows us to add different actions for different destinations while using a single OVS/OF port.
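As a small illustration of the remote_ip=flow style, the action string from the computer1 rule above can be generated per destination like this. build_tunnel_actions is a hypothetical helper; the action syntax itself is copied from the manually tested rule.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Build the actions for sending traffic through a generic vxlan outport:
// the tunnel destination is set per flow instead of per port.
std::string build_tunnel_actions(const std::string &remote_host_ip,
                                 uint32_t tunnel_id,
                                 const std::string &generic_outport) {
  std::ostringstream actions;
  actions << "set_field:" << remote_host_ip << "->tun_dst,strip_vlan,load:0x"
          << std::hex << tunnel_id << "->NXM_NX_TUN_ID[],output:"
          << generic_outport;
  return actions.str();
}
```

Only the remote host IP changes between destinations, so one port such as vxlan-generic can serve every neighbor in the VPC.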

er1cthe0ne commented 3 years ago

@zhangml - to support the scale requirement, ACA (non-Zeta ports) will likely switch to using a generic vxlan outport. Here is my manual experiment confirming that it should work: https://github.com/futurewei-cloud/alcor-control-agent/wiki/Testing-environment-for-using-generic-vxlan-outport

I haven't started that change yet. Here is the change I have been working on to remove the top-level vpc_table lock, along with code changes for performance analysis: https://github.com/er1cthe0ne/alcor-control-agent/tree/perf/hashtable

zhangml commented 3 years ago

@er1cthe0ne Can a vpc have several auxGateway at the same time? Can a vpc have multiple different types of auxGateway at the same time?

er1cthe0ne commented 3 years ago

@zhangml - a vpc can have:

  1. one or zero Zeta auxGateway
  2. one or zero internet auxGateway (likely one)
  3. one or zero NAT auxGateway (I think)

zhangml commented 3 years ago

Can these three types of auxgateway be owned at the same time? For example, a vpc has both a zeta and a NAT?

er1cthe0ne commented 3 years ago

yes, a VPC can have Zeta and NAT auxGateway at the same time.

er1cthe0ne commented 3 years ago

implementation merged as #180