Open msg-programs opened 2 years ago
Thanks for the reported issue. Please check out the bugfix branch. The broadcasting issue has already been fixed there, and after a code review it will be merged with the master.
But you still haven't merged it after half a year, good job.
I tried to get a pure v1model version of the layer 2 switch from the examples (l2switch.p4) to work for the last couple of days. The program kept crashing with the following error:
Other programs worked fine and when I inserted the table entries via the P4RT-Shell myself, the program behaved as expected. The crashes only happened when the switch was trying to broadcast, yet there were no packets emitted by it.
Long story short, I suspect a bug in `broadcast_packet()`. From my understanding, broadcasting is implemented by checking if the egress port is 100 and then calling that method. It then enumerates the ports, creates copies if needed and then uses `send_single_packet()` (same file, L. 52) to send the packets out.

When that happens, it sends them out of whatever the `egress_port` is. At that point, that is still port 100, i.e. the symbolic broadcast port. In my case, this causes issues in `dpdk_send_packet()`. The packets queue up until `MAX_PKT_BURST` is reached and are then forcefully sent in a burst, which results in a segfault.

In my case, changing the second argument to the call to `send_single_packet()` from `egress_port` to `portidx` results in correct behavior. I suspect this to be a bug, but as my understanding of the source is very limited, I could be wrong and this fix could have unintended side effects. Here is the modified `broadcast_packet()` function that works for me:

I did some very limited testing to see if setting the port to any invalid value causes the same issue. Sending packets to port 11 or 31 results in the error `lcore 2 called tx_pkt_burst for not ready port [...]` and a stack trace, while sending some to port 33, 99 or 101 causes a segfault. Sending packets to port 32 has a weird side effect, where printing the `port` in the function `dpdk_send_packet()` prints 56786. This also results in a segfault. It could be that this is another special port that I'm not aware of.

Setup
Running in an Ubuntu 21.04 VM, 4 cores, 8 GB RAM
examples.cfg:
with pieal and piports defined as
`veth0-s` and `veth1-s` are veth devices that are set up as follows:

- `veth0-h` in network namespace `h0`, ip is `192.168.100.100`
- `veth1-h` in network namespace `h1`, ip is `192.168.100.101`
- `veth0-s` and `veth1-s` in the default namespace

The P4 program is linked below. It's basically the l2switch.p4 from the example directory, but without the preprocessor macros that make it work with both available architectures. I had to rename the file, as GitHub doesn't support uploading .p4 files for some reason.
l2switch_v1model.p4.txt
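
For readers who want to see the suspected port mix-up in isolation, here is a minimal, self-contained sketch. It is not the project's code: the function bodies, the `tx_count` and `invalid_tx` counters, `NUM_PORTS`, and the rule that the ingress port is skipped are all assumptions made for illustration. Only the names `send_single_packet()`, `egress_port`, `portidx`, and the symbolic broadcast port 100 come from the report above.

```c
#include <assert.h>

#define BROADCAST_PORT 100 /* symbolic broadcast port from the report */
#define NUM_PORTS 4        /* assumption: four real ports, numbered 0..3 */

/* Hypothetical stand-in for the real transmit path: it only counts
 * packets per port and counts attempts to use a nonexistent port. */
static int tx_count[NUM_PORTS];
static int invalid_tx;

static void reset_counters(void) {
    for (int i = 0; i < NUM_PORTS; i++) {
        tx_count[i] = 0;
    }
    invalid_tx = 0;
}

static void send_single_packet(int pkt_id, int port) {
    (void)pkt_id; /* payload is irrelevant for this sketch */
    if (port < 0 || port >= NUM_PORTS) {
        invalid_tx++; /* the real code has no such guard, per the report */
        return;
    }
    tx_count[port]++;
}

/* Suspected buggy behavior: every copy is sent to egress_port,
 * which is still the symbolic broadcast port at this point. */
static void broadcast_buggy(int pkt_id, int ingress_port, int egress_port) {
    assert(ingress_port >= 0 && ingress_port < NUM_PORTS);
    for (int portidx = 0; portidx < NUM_PORTS; portidx++) {
        if (portidx == ingress_port) {
            continue; /* assumption: do not echo to the sender */
        }
        send_single_packet(pkt_id, egress_port);
    }
}

/* Variant with the change described in the report: pass portidx. */
static void broadcast_fixed(int pkt_id, int ingress_port) {
    assert(ingress_port >= 0 && ingress_port < NUM_PORTS);
    for (int portidx = 0; portidx < NUM_PORTS; portidx++) {
        if (portidx == ingress_port) {
            continue; /* assumption: do not echo to the sender */
        }
        send_single_packet(pkt_id, portidx);
    }
}
```

In this sketch, calling `broadcast_buggy(1, 0, BROADCAST_PORT)` after `reset_counters()` leaves every `tx_count` entry at zero and records three invalid transmissions, which mirrors the "no packets emitted" symptom. Calling `broadcast_fixed(1, 0)` queues exactly one copy on each of ports 1 to 3.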