CentaurusInfra / mizar

Mizar – Experimental, High Scale and High Performance Cloud Network https://mizar.readthedocs.io
https://mizar.readthedocs.io
GNU General Public License v2.0
111 stars 50 forks source link

Key to hash table for agent metadata flips between pointing to null and valid data. #29

Open phudtran opened 4 years ago

phudtran commented 4 years ago

What happened When making consecutive update_agent_metadata calls, every other call will fail in the following scenario:

  1. A droplet D1 is hosting endpoint ep1 from net-1 from VPC-1, and endpoint ep-2 from from net-2 from VPC-2.
  2. net-1, net-2, (and consequently ep-1 and ep-2) are both deleted.
  3. net-1, net-2, are recreated.
  4. ep-1 and ep-2 are both recreated on the same droplet D1.
  5. Every other call to update_agent_metadata for ep-1 and ep-2 will fail.

Setting ep->data to null in the macro INTF_DELETE() seems to be the issue. trn_transitd.h:81 However removing that line will reintroduce an old bug.

What you expected to happen update_agent_metadata should not fail How to reproduce it

    self.droplets = {
        "d1": droplet("d1"),
        "d2": droplet("d2"),
        "switch-1": droplet("switch-1"),
        "router-1": droplet("router-1")
    }

    c = controller(self.droplets)

    # Create two co-hosted VPCs each with one endpoint
    c.create_vpc(3, cidr("16", "10.0.0.0"), ["router-1"])
    c.create_network(3, 10, cidr("24", "10.0.0.0"), ["switch-1"])
    self.ep1 = c.create_simple_endpoint(3, 10, "10.0.0.2", "d1")

    c.create_vpc(4, cidr("16", "20.0.0.0"), ["router-1"])
    c.create_network(4, 20, cidr("24", "20.0.0.0"), ["switch-1"])
    self.ep5 = c.create_simple_endpoint(4, 20, "20.0.0.2", "d1")

    # Delete their networks
    c.delete_network(3, 10, cidr("24", "10.0.0.0"))
    c.delete_network(4, 20, cidr("24", "20.0.0.0"))

    # Recreate the networks and endpoints
    c.create_network(3, 10, cidr("24", "10.0.0.0"), ["switch-1"])
    c.create_network(4, 20, cidr("24", "20.0.0.0"), ["switch-1"])

    self.ep1 = c.create_simple_endpoint(3, 10, "10.0.0.2", "d1")
    self.ep5 = c.create_simple_endpoint(4, 20, "20.0.0.2", "d1")

Now try to update agent metadata for ep1. It will fail every other time.

Environment Ubuntu Server 18.04 LTS (HVM) on ec2 t2.2xlarge More Information Deleting just the endpoint will not reproduce the bug issue. We have to either delete the network, or delete the vpc.

zasherif commented 4 years ago

It might be the right time to consider using another hashtable..

phudtran commented 4 years ago

I'm currently looking into uthash.

chenpiaoping commented 4 years ago

After analysis, I think the hash table does not support delete operation, the following are my verification steps: 1.Add some log information to trn_transitd.h file

define INTF_INSERT() \

do {                                                                                        \
    ENTRY e, *ep;                                                                       \
    e.key = itf;                                                                        \
    e.data = (void *)md;                                                                \
    ep = hsearch(e, FIND);                                                              \
    if (ep) {                                                                           \
        if (ep->data) {                                                             \
            TRN_LOG_DEBUG("overwriting metadata at [%s]",                       \
                      itf);                                                 \
            TRN_FREE(ep->data);                                                 \
        }                                                                           \
        TRN_LOG_ERROR("overwriting metadata, ep->data is null");               \
        ep->data = (void *)md;                                                      \
    } else {                                                                            \
        TRN_LOG_DEBUG(                                                              \
            "inserting [%s] for first time. allocate memory for interface key," \
            " since RPC XDR will eventually free its value.",                   \
            itf);                                                               \
        e.key = malloc(sizeof(char) * (strlen(itf)+ 1));                                 \
        strcpy(e.key, itf);                                                         \
        e.key[strlen(itf)] = '\0';          \
        ep = hsearch(e, ENTER);                                                     \
    }                                                                                   \
    if (!ep)                                                                            \
        return -1;                                                                  \
    return 0;                                                                           \
} while (0)

Add the statement TRN_LOG_ERROR ("overwriting metadata, ep-> data is null") in the macro INTF_FIND;

define INTF_FIND() \

do {                                                                   \
    ENTRY e, *ep;                                                  \
    e.key = itf;                                                   \
    ep = hsearch(e, FIND);                                         \
    if (!ep) {                                                     \
        return NULL;                                           \
    }                                                              \
    TRN_LOG_ERROR("Find ep, return ep->data:%p", ep->data);        \
    return ep->data;                                               \
} while (0)

Add the statement TRN_LOG_ERROR ("Find ep, return ep-> data:% p", ep-> data) in the macro INTF_FIND;

  1. Follow the steps above to create the endpoints, We look at the core steps: 1) Create endpoint10.0.0.2 and endpoint20.0.0.2: load_transit_agent_xdp_1_svc (gnv_peer1) ---> load_transit_agent_xdp_1_svc (gnv_peer3) 2) Remove network10 and network20 3) Create endpoint10.0.0.2 and endpoint20.0.0.2: load_transit_agent_xdp_1_svc (gnv_peer0) ---> load_transit_agent_xdp_1_svc (gnv_peer1)

View syslog: root@1c81bb2d3435:/# cat /var/log/syslog | grep "inserting" Feb 17 07:13:18 1c81bb2d3435 transit[149]: [trn_itf_table_insert:72] inserting [eth0] for first time. allocate memory for interface key, since RPC XDR will eventually free its value. Feb 17 07:13:29 1c81bb2d3435 transit[149]: [trn_vif_table_insert:87] inserting [gnv_peer1] for first time. allocate memory for interface key, since RPC XDR will eventually free its value. Feb 17 07:13:31 1c81bb2d3435 transit[149]: [trn_vif_table_insert:87] inserting [gnv_peer3] for first time. allocate memory for interface key, since RPC XDR will eventually free its value. Feb 17 07:13:34 1c81bb2d3435 transit[149]: [trn_vif_table_insert:87] inserting [gnv_peer0] for first time. allocate memory for interface key, since RPC XDR will eventually free its value. root@1c81bb2d3435:/# cat /var/log/syslog | grep "overwriting" Feb 17 07:14:05 1c81bb2d3435 transit[149]: overwriting metadata, ep->data is null

Therefore, calling the macro INTF_DELETE cannot delete gnv_peer1 exactly. View hsearch help, which has the following description:

Bugs SVr4 and POSIX.1-2001 specify that action is significant only for unsuccessful searches, so that an ENTER should not do anything for a successful search. In libc and glibc (before version 2.3), the implementation violates the specification, updating the data for the given key in this case.

Individual hash table entries can be added, but not deleted.

3.Cause the ENTRY of hash table cannot be deleted, If set the key to NULL when deleting, I think there will be uncertain behavior(need to read the source code of hsearch()).we may only need to set the value to NULL when deleting.The bug is not recurring after deleting "free(ep->key);" from INTF_DELETE.

define INTF_DELETE () \

do {\ TRN_LOG_DEBUG ("Delete: [% s]", itf); \ ENTRY e, * ep; \ e.key = itf; \ ep = hsearch (e, FIND); \ if (ep) {\ TRN_FREE (ep-> data); \ TRN_LOG_DEBUG ("Deleted: [% s]", itf); \ } \ } while (0)      Finally, Cause the ENTRY of hash table cannot be deleted, there will be a memory leak. May we can consider using another hashtable.

zasherif commented 4 years ago

Thanks, @chenpiaoping. Yes, we shall consider a new hash table. Can you augment your analysis and see if the metadata, can be simply stored in an ebpf map (hash)? We don't need that table values solely in user-space. We currently move data selectively from it to ebpf maps. It might be better to directly have these interface metadata stored with the XDP program in a map and of course still accessible to userspace.