net: add refcounts for buffs in virtrx to account for broadcast packets

Kswin01 commented 4 months ago

This change adds a reference count for each buffer in the virt rx component. This allows us to properly handle broadcast packets that need to be sent to multiple clients, and ensuring that we are still adhering to the consumer/producer pattern that is expected. On broadcasts, we set the refcount to the number of clients in the system, and for unicast just 1. We decrement the refcount when we receive a buffer back from the client, and enqueue back into the driver queues when the refcount reaches zero.

Courtney3141 commented 4 months ago

Sorry to cause more unexpected work, but I felt it was necessary to remove the arp component for this pull request to be merged, since with these changes the system would now be handling broadcast traffic by passing the packet to each client (including the arp), resulting in an extra superfluous reply from the arp in addition to one or more clients.

I made the least number of changes possible to remove the arp. I did not remove any arp code, I just removed any arp dependencies from its neighbours.

I also made a few changes to how the rx_virt implements broadcasting. Firstly, I moved some of the behaviour behind an #if BROADCAST_IS_ARP. By default I set this to 0, which enables the multi-forwarding of packets to each client, but if it is disabled, the rx_virt returns to forwarding broadcast packets to a single broadcast (arp) component. In both cases, the buffer reference counts are still incremented and decremented. This could be useful if we were to support multi-casting in the future.

Secondly, I restructured rx_return in the virtualiser so that the header of incoming packets is only checked once (I combined get_client and is_broadcast into a single function get_mac_addr_match). I believe that this flow of control is neater - the virtualiser determines in one function call whether the packet is broadcast, unicast, or not for our clients, then processes it accordingly. If BROADCAST_IS_ARP is enabled, the broadcast handling branch is #ifdefd out, and the broadcast address will instead match with the arp client, so the packet will be passed to the arp in the unicast handling branch. If `BROADCAST_IS_ARP is disabled, no client will match with the broadcast MAC address, so instead it will be passed to all clients in the broadcast handling branch.

I ran ipbench on the system both with BROADCAST_IS_ARP enabled and disabled, and both completed successfully. However, I suspect that DHCP and ipbench would run successfully whether broadcast traffic was being correctly handled or not since broadcast handling isn't particularly important (at least when DHCP messages are addressed to client MAC addresses). So I would recommend you re-running your tests where DHCP is addressed to a broadcast address.

One thing that was notable and I suspect expected was that performance was slightly worse when broadcast traffic was forwarded to both clients. I put this down to two things, one being that both clients are now required to reply to a single message, as well as now the bench-marking target (client 0) both has to reply to the benchmark traffic, as well as any broadcast traffic that comes in during testing, which previously was allocated to the arp. I suspect that LWIP also takes longer than the arp to handle broadcast traffic anyway, as well as the fact that the arp does not use a copier. In saying that, it seems that CPU utilisation only goes up by around 1-1.5% points.

Courtney3141 commented 4 months ago

Forgot to update the system file for the ordroidc4 which is causing the builds to fail, I'll do that now

Kswin01 commented 4 months ago

Thanks for adding that! Another method I was considering for supporting both ARP and general broadcasts was as follows:

/* Return the client ID if the Mac address is a match to a client, return the broadcast ID if MAC address
   is a broadcast address. */
int get_mac_addr_match(struct ethernet_header *buffer)
{
    bool broadcast_packet = false;

    for (int client = 0; client < NUM_CLIENTS; client++) {
        bool match = true;
        bool is_broadcast = true;
        for (int i = 0; (i < ETH_HWADDR_LEN) && (match || is_broadcast); i++) {
            if (buffer->dest.addr[i] != state.mac_addrs[client][i]) {
                match = false;
            }
            if (buffer->dest.addr[i] != 0xFF) {
                is_broadcast = false;
            }
        }

        if (match) {
            return client;
        } else if (is_broadcast) {
            // Mark the existance of a broadcast packet
            broadcast_packet = true;
        }
    }

    // We will only get here if this is a broadcast packet
    // and no client match has been made.
    if (broadcast_packet) {
        return BROADCAST_ID;
    }

    return -1;
}

Making it so that if a client has 'reserved' the broadcast MAC address then we forward the packets to them, otherwise we would broadcast to all clients. This will remove the need for the ifdefs, but not sure how much of an improvement this would actually make.

Courtney3141 commented 4 months ago

Thanks for adding that! Another method I was considering for supporting both ARP and general broadcasts was as follows:
/* Return the client ID if the Mac address is a match to a client, return the broadcast ID if MAC address
  is a broadcast address. */
int get_mac_addr_match(struct ethernet_header *buffer)
{
   for (int client = 0; client < NUM_CLIENTS; client++) {
       bool match = true;
       bool is_broadcast = true;
       for (int i = 0; (i < ETH_HWADDR_LEN) && match; i++) {
           if (buffer->dest.addr[i] != state.mac_addrs[client][i]) {
               match = false;
           }
           if (buffer->dest.addr[i] != 0xFF) {
               is_broadcast = false;
           }
       }

       if (match) {
           return client;
       } else if (is_broadcast) {
           return BROADCAST_ID;
       }
   }
   return -1;
}
Making it so that if a client has 'reserved' the broadcast MAC address then we forward the packets to them, otherwise we would broadcast to all clients. This will remove the need for the ifdefs, but not sure how much of an improvement this would actually make.

No sure how to reply on your comment! I think that method is more slick than the ifdef, but I think it could potentially fail and return BROADCAST_ID if the arp or general broadcast component was not the first client on the list? It was before, but I suppose it technically doesn't have to be.

Kswin01 commented 4 months ago

No sure how to reply on your comment! I think that method is more slick than the ifdef, but I think it could potentially fail and return BROADCAST_ID if the arp or general broadcast component was not the first client on the list? It was before, but I suppose it technically doesn't have to be.

The ARP should be able to be at any position, the above will only return broadcast if the mac matches the broadcast MAC address and there isn't a client that's reserved that MAC.

Courtney3141 commented 4 months ago

Actually just realised there is a bigger issue with the code. Let's say the incoming address was: FF0000000000 The first client's address was: AAAAAAAAAAAA And of course the broadcast address is: FFFFFFFFFFFF

Then in the first loop of the client for loop, the check of the first uint8 of the MAC address would set match=false for the client, but keep is_broadcast=true for the broadcast address. Then the MAC address checking for loop would terminate since match==false now breaking the loop guard.

Then in the if else statement that it run before starting to check the next client, the initial if would not be run since match==false, but the else if with is_broadcast would run, since the prior for loop did not complete the full mac address check, so the function would return BROADCAST_ID even though it was only a match on the first uint8

Kswin01 commented 4 months ago

Actually just realised there is a bigger issue with the code. Let's say the incoming address was: FF0000000000 The first client's address was: AAAAAAAAAAAA And of course the broadcast address is: FFFFFFFFFFFF

Then in the first loop of the client for loop, the check of the first uint8 of the MAC address would set match=false for the client, but keep is_broadcast=true for the broadcast address. Then the MAC address checking for loop would terminate since match==false now breaking the loop guard.

Then in the if else statement that it run before starting to check the next client, the initial if would not be run since match==false, but the else if with is_broadcast would run, since the prior for loop did not complete the full mac address check, so the function would return BROADCAST_ID even though it was only a match on the first uint8

Apologies, you are 100% right. That was a bad example of what I was trying to get across. I fixed up the issues properly and edited that comment.

Ivan-Velickovic commented 3 months ago

Needs a rebase before we can merge.

wom-bat commented 3 months ago

With this change in place, the odroidc4 version no longer works.

BENCH|LOG: Faulting PD virt_rx (2)
Registers: 
pc : 200728
spsr : 80000040
x0 : 2c00c20
x1 : c3
x2 : 200
x3 : 2c00000
x4 : c3
x5 : 8001
x6 : 100000
x7 : fffffffffffffffd
VMFault: ip=200728  fault_addr=2c00002  fsr=9341004e (data fault)
<<seL4(CPU 0) [receiveIPC/142 T0x8030002c00 "child of: 'rootserver'" @202144]: Reply object already has unexecuted reply!>>

Ivan-Velickovic commented 3 months ago

ethernet_config.h should have been updated as well which I missed in the review.

au-ts / sddf

net: add refcounts for buffs in virtrx to account for broadcast packets #92