NetFPGA / P4-NetFPGA-public

P4-NetFPGA wiki

Switch Calculator LOOKUP not working #70

Open nithishkgnani opened 2 years ago

nithishkgnani commented 2 years ago

I have done the basic OS setup following the official GitHub repository (P4-NetFPGA). I tried to compile and run the three assignments available in the tutorials. Assignments 2 and 3 (TCP monitor and INT) haven't yielded the expected results: the Python test scripts don't print the expected output.

Assignment 1 (Switch Calculator) supports five operations: ADD, SUBTRACT, ADD_REG, SET_REG and LOOKUP. I am able to get every operation working except LOOKUP. It is supposed to look up the given key in a table on the switch and return the result, but it always returns 0. I have included the switch_calc_tester output at the bottom.
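For reference, the expected behaviour of the five operations can be written down as a small Python model. This is only a sketch, not the P4 source: the opcode numbers are the OP_CODE values printed by switch_calc_tester, while the 32-bit field width, the register count, the read-only ADD_REG, and the "miss returns 0" behaviour are assumptions inferred from the tester output. `SwitchCalcModel` and its table contents are hypothetical names used for illustration.

```python
# Opcodes as printed in the OP_CODE field by switch_calc_tester.
ADD, SUBTRACT, LOOKUP, ADD_REG, SET_REG = 0, 1, 2, 3, 4

MASK32 = 0xFFFFFFFF  # assumption: operands and result are 32-bit fields


class SwitchCalcModel:
    """Software stand-in for the switch_calc data plane (hypothetical)."""

    def __init__(self, table=None, num_regs=16):
        # table holds the key -> value entries that the host would normally
        # install. An empty table models a switch whose entries were never
        # written.
        self.table = dict(table or {})
        self.regs = [0] * num_regs  # num_regs is an assumed size

    def run(self, op1, opcode, op2):
        if opcode == ADD:
            return (op1 + op2) & MASK32
        if opcode == SUBTRACT:
            return (op1 - op2) & MASK32
        if opcode == LOOKUP:
            # Assumed miss behaviour: the result field stays 0.
            return self.table.get(op1, 0)
        if opcode == ADD_REG:
            # Assumed to read regs[op1] without modifying it.
            return (self.regs[op1] + op2) & MASK32
        if opcode == SET_REG:
            self.regs[op1] = op2 & MASK32
            return 0
        raise ValueError(f"unknown opcode: {opcode}")


empty = SwitchCalcModel()                # no table entries installed
filled = SwitchCalcModel(table={1: 10})  # hypothetical entry
print(empty.run(1, LOOKUP, 1), filled.run(1, LOOKUP, 1))
```

With an empty table the model answers every LOOKUP with 0, which matches the symptom: arithmetic and register operations work, and only the operation that depends on host-installed entries fails.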

I am able to compile my P4 programs and download the bit file into the board, but I do not know how to troubleshoot/debug the issue. While following the steps in the tutorial, the simulations pass in every assignment. I have also referred to and tried out the steps written in Debugging P4 Programs under the Workflow-Overview page.

I need help with debugging the P4-NetFPGA-SUME workflow in both simulations and on the hardware during runtime.

switch_calc_tester output:

root@xxxxxx:~/Documents/P4-NetFPGA/contrib-projects/sume-sdnet-switch/projects/switch_calc/sw/hw_test_tool# ./switch_calc_tester.py 
WARNING: No route found for IPv6 destination :: (no default route?)
tcpdump: listening on eth1, link-type EN10MB (Ethernet), capture size 262144 bytes
The HW testing tool for the switch_calc design
 type help to see all commands
testing> run_test 5 + 9
Begin emission:
Finished to send 1 packets.
*
Received 1 packets, got 1 answers, remaining 0 packets
Sent pkt: 
------------------------------------------------------------------------------------------
|  ETHERNET  | OP1:5          OP_CODE:0          OP2:9          RESULT:0                 |
------------------------------------------------------------------------------------------

Received pkt: 
------------------------------------------------------------------------------------------
|  ETHERNET  | OP1:5          OP_CODE:0          OP2:9          RESULT:14                |
------------------------------------------------------------------------------------------

testing> run_test 64 - 30
Begin emission:
Finished to send 1 packets.
*
Received 1 packets, got 1 answers, remaining 0 packets
Sent pkt: 
------------------------------------------------------------------------------------------
|  ETHERNET  | OP1:64         OP_CODE:1          OP2:30         RESULT:0                 |
------------------------------------------------------------------------------------------

Received pkt: 
------------------------------------------------------------------------------------------
|  ETHERNET  | OP1:64         OP_CODE:1          OP2:30         RESULT:34                |
------------------------------------------------------------------------------------------

testing> run_test 1 SET_REG 25
Begin emission:
Finished to send 1 packets.
*
Received 1 packets, got 1 answers, remaining 0 packets
Sent pkt: 
------------------------------------------------------------------------------------------
|  ETHERNET  | OP1:1          OP_CODE:4          OP2:25         RESULT:0                 |
------------------------------------------------------------------------------------------

Received pkt: 
------------------------------------------------------------------------------------------
|  ETHERNET  | OP1:1          OP_CODE:4          OP2:25         RESULT:0                 |
------------------------------------------------------------------------------------------

testing> run_test 1 ADD_REG 50
Begin emission:
Finished to send 1 packets.
*
Received 1 packets, got 1 answers, remaining 0 packets
Sent pkt: 
------------------------------------------------------------------------------------------
|  ETHERNET  | OP1:1          OP_CODE:3          OP2:50         RESULT:0                 |
------------------------------------------------------------------------------------------

Received pkt: 
------------------------------------------------------------------------------------------
|  ETHERNET  | OP1:1          OP_CODE:3          OP2:50         RESULT:75                |
------------------------------------------------------------------------------------------

testing> run_test 1 LOOKUP 1
Begin emission:
Finished to send 1 packets.
*
Received 1 packets, got 1 answers, remaining 0 packets
Sent pkt: 
------------------------------------------------------------------------------------------
|  ETHERNET  | OP1:1          OP_CODE:2          OP2:1          RESULT:0                 |
------------------------------------------------------------------------------------------

Received pkt: 
------------------------------------------------------------------------------------------
|  ETHERNET  | OP1:1          OP_CODE:2          OP2:1          RESULT:0                 |
------------------------------------------------------------------------------------------

testing> run_test 2 LOOKUP 1
Begin emission:
Finished to send 1 packets.
*
Received 1 packets, got 1 answers, remaining 0 packets
Sent pkt: 
------------------------------------------------------------------------------------------
|  ETHERNET  | OP1:2          OP_CODE:2          OP2:1          RESULT:0                 |
------------------------------------------------------------------------------------------

Received pkt: 
------------------------------------------------------------------------------------------
|  ETHERNET  | OP1:2          OP_CODE:2          OP2:1          RESULT:0                 |
------------------------------------------------------------------------------------------

testing> 

MarioPatetta commented 2 years ago

This looks similar to the problem I'm having.

As far as I understand, some motherboards are simply incompatible with the SUME board, and any communication over PCIe will fail. In this case, the lookup operation returns 0 because the script used to program the FPGA cannot populate the LUT (the FPGA is programmed over USB, but the LUT is filled by commands issued through PCIe).

I am currently working on an extern function LUT that can be filled directly by the P4 program in order to work around this issue. In the meantime, you can have a look here for more information about motherboard compatibility.
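One quick sanity check is whether the board shows up on the PCIe bus at all. The sketch below is not part of the P4-NetFPGA repository; it only scans Linux sysfs for devices carrying the Xilinx PCI vendor ID (0x10ee). The function name `find_xilinx_pci_devices` is hypothetical. Note that a listed device does not prove PCIe transfers work, but an empty list strongly suggests the host never enumerated the board.

```python
from pathlib import Path

XILINX_VENDOR_ID = 0x10EE  # PCI vendor ID assigned to Xilinx


def find_xilinx_pci_devices(sysfs_root="/sys/bus/pci/devices"):
    """Return the PCI addresses of all devices whose vendor ID is Xilinx."""
    root = Path(sysfs_root)
    found = []
    if not root.is_dir():
        return found  # not Linux, or sysfs is not mounted
    for dev in sorted(root.iterdir()):
        try:
            # The vendor file holds a hex string such as "0x10ee".
            vendor = int((dev / "vendor").read_text().strip(), 16)
        except (OSError, ValueError):
            continue  # skip entries that cannot be read or parsed
        if vendor == XILINX_VENDOR_ID:
            found.append(dev.name)
    return found


if __name__ == "__main__":
    devices = find_xilinx_pci_devices()
    if devices:
        print("Xilinx PCIe device(s) found:", ", ".join(devices))
    else:
        print("No Xilinx PCIe device enumerated")
```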

salvatorg commented 2 years ago

If you have identified that the PCIe/driver is the problem, then I would suggest checking for the latest version of the sume_driver that is compatible with your current kernel version.

There is a workaround (kudos to @marcinwoj) for this if you are using a kernel >= 5.6.0.
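To see whether the running kernel falls in that range, a small sketch like the following can help. It parses the release string returned by `platform.release()`; the function name `kernel_needs_txqueue_arg` is hypothetical, and the comparison mirrors the 5.6.0 threshold mentioned here. Distribution kernels sometimes backport API changes, so treat the result as a hint rather than proof.

```python
import platform
import re


def kernel_needs_txqueue_arg(release=None):
    """Return True if the kernel release is at least 5.6.0."""
    release = release or platform.release()
    # Release strings look like "5.4.0-42-generic"; keep the numeric prefix.
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"cannot parse kernel release: {release!r}")
    major, minor, patch = (int(g) if g else 0 for g in match.groups())
    return (major, minor, patch) >= (5, 6, 0)


if __name__ == "__main__":
    print(platform.release(), "->", kernel_needs_txqueue_arg())
```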

You'll have to patch $SUME_FOLDER/lib/sw/std/driver/sume_riffa_v1_0_0/sume_riffa.c (line 847):

/* Callback when the TX side went on watchdog time vacation. */
static void
#if LINUX_VERSION_CODE >= KERNEL_VERSION(5,6,0)
sume_tx_timeout(struct net_device *netdev, unsigned int txqueue)
#else
sume_tx_timeout(struct net_device *netdev)
#endif