Closed GraysonWu closed 2 years ago
Thank you for the report. I see the issue. This is definitely an internal bug in Open vSwitch. Open vSwitch, like OpenFlow 1.1 and later, distinguishes between "instructions" and "actions", but OVS tends to treat the difference internally as mostly an inconvenience and tries to make them as similar as it can. In this case, there's a failure to deal with the GOTO_TABLE instruction properly. I'll see if I can fix that.
In the meantime, you can avoid the problem here by using the "resubmit" OVS extension action instead of the OF goto_table instruction. Some people might complain that they don't want to use OF extensions, but "pause/resume" are already a major extension, so I doubt it will be a problem here.
I sent out a proper fix: https://mail.openvswitch.org/pipermail/ovs-dev/2021-July/385200.html
@blp Thank you so much for starting this fix so quickly. Until it lands, I'll see if resubmit could be a workaround in our case.
My env
My goal
I want to use the controller action with the pause flag to pause the pipeline. Later, the controller sends the continuation back to the OVS to resume the pipeline.
What I have done
I used the NXAST_RAW_CONTROLLER2 message within the flow mod message to add a flow with a controller action with the pause flag on the OVS. This part works well, as I can see it is added successfully:
And then I sent an NXT_SET_PACKET_IN_FORMAT message from the controller to the OVS to set the packet_in format to OFPUTIL_PACKET_IN_NXT2. After this setup, I generated traffic that hit the flow above. The controller received the packet_in2 message successfully. But when the controller sent an NXT_RESUME message back, I hit the issue.
Issue I met
After I sent the NXT_RESUME message to the OVS, I received an OpenFlow 1.3 error: OFPBAC_BAD_ARGUMENT. The OVS log:
But since I didn't change anything in the NXPINT_CONTINUATION that I received, why is the action inside NXPINT_CONTINUATION.NXCPT_ACTIONS invalid?
I checked the packet_in2 that the controller received from the OVS:
Based on my understanding, the NXPINT_CONTINUATION part inside the packet_in2 message above is these bytes:
Details of these bytes:
NXPINT_CONTINUATION:
00 08 00 40 00 00 00 00
(type: 0x0008, length: 0x0040, pad[4])
NXCPT_BRIDGE:
80 00 00 14 36 d5 68 3d
87 42 dd 11 a3 a8 f2 88
0a ae 84 b3 00 00 00 00
(type: 0x8000, length: 0x0014, bridge: uuid, pad[4])
NXCPT_CONNTRACKED:
80 03 00 04 00 00 00 00
(type: 0x8003, length: 0x0004, pad[4])
NXCPT_ACTIONS:
80 06 00 10 00 00 00 00
00 01 00 08 69 00 00 00
(type: 0x8006, length: 0x0010, pad[4], actions[])
NXCPT_ODP_PORT:
80 08 00 08 00 00 00 04
(type: 0x8008, length: 0x0008, odp_port: 4)
So in the NXCPT_ACTIONS, there is only one action: 00 01 00 08 69 00 00 00. And I guess the encoding and decoding of this part is where the problem lies.
OVS encodes actions into bytes
It seems that when the OVS sends the packet_in2 to the controller, it encodes goto_table:105, the action after pause in my flow above, into NXCPT_ACTIONS. The structure of goto_table and the type of instruction are:
https://github.com/openvswitch/ovs/blob/master/include/openflow/openflow-1.1.h#L249
https://github.com/openvswitch/ovs/blob/master/include/openflow/openflow-1.1.h#L273
Thus, if we encode goto_table:105 into bytes, it will be 00 01 00 08 69 00 00 00 (type: 0x0001, len: 0x0008, table_id: 0x69 = 105, pad[3]), which is exactly the same as what I received.
OVS decodes bytes into actions
When the controller sends the NXPINT_CONTINUATION back to the OVS, OVS should decode 00 01 00 08 69 00 00 00 into an action. But while decoding the 00 01, it seems that the OVS treats it as OFPACT_SET_VLAN_VID instead of the goto_table instruction. The structure of ofpact_vlan_vid is:
https://github.com/openvswitch/ovs/blob/master/include/openvswitch/ofp-actions.h#L414
Then 69 00 will be treated as vlan_vid. An error will be thrown from here:
https://github.com/openvswitch/ovs/blob/master/lib/ofp-actions.c#L1566
The resume message I sent to the OVS is:
As you can see, I set the OpenFlow version to 0x04, which means OpenFlow 1.3: https://github.com/openvswitch/ovs/blob/master/include/openflow/openflow-common.h#L76
I'm not sure if I misunderstood anything. I'm stuck here. How can I resume the pipeline in my situation?
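To make the suspected mismatch concrete, here is a minimal Python sketch (not OVS code) that packs goto_table:105 with the instruction layout described above and then reads the same eight bytes back with the layout of an OpenFlow 1.1 set-VLAN-VID action. The function names are hypothetical, and the 12-bit range check is only an assumption about why the decoder would reject the value.

```python
import struct

OFPIT11_GOTO_TABLE = 1    # instruction type in OpenFlow 1.1+
OFPAT11_SET_VLAN_VID = 1  # action type in OpenFlow 1.1 (same numeric value)


def encode_goto_table(table_id: int) -> bytes:
    """Pack type(2) len(2) table_id(1) pad(3) in network byte order."""
    return struct.pack("!HHB3x", OFPIT11_GOTO_TABLE, 8, table_id)


def decode_as_vlan_vid_action(raw: bytes):
    """Read the same 8 bytes as type(2) len(2) vlan_vid(2) pad(2)."""
    return struct.unpack("!HHH2x", raw)


raw = encode_goto_table(105)
act_type, act_len, vlan_vid = decode_as_vlan_vid_action(raw)

# Assumption: a VLAN ID must fit in 12 bits, so 0x6900 would be rejected.
vid_is_valid = vlan_vid <= 0x0FFF

print(raw.hex(" "))  # same bytes as in NXCPT_ACTIONS
print(act_type == OFPAT11_SET_VLAN_VID, hex(vlan_vid), vid_is_valid)
```

Under these assumptions, the table ID byte 0x69 ends up as the high byte of a 16-bit vlan_vid field, which would be consistent with the OFPBAC_BAD_ARGUMENT error.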
BTW, what props should be included in an NXT_RESUME? I currently just added all the props that I received from the packet_in2 message except NXPINT_USERDATA.
Thank you!!
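For reference, the property breakdown given earlier in this report can be checked mechanically. The sketch below is a hypothetical Python helper, not part of OVS or any controller library. It walks the type/length properties in the NXPINT_CONTINUATION bytes quoted above, assuming each property is padded to an 8-byte boundary and that the length field excludes that padding. Only the property names that appear in this report are mapped.

```python
import struct

# Property types as listed in the breakdown in this report.
NXCPT_NAMES = {
    0x8000: "NXCPT_BRIDGE",
    0x8003: "NXCPT_CONNTRACKED",
    0x8006: "NXCPT_ACTIONS",
    0x8008: "NXCPT_ODP_PORT",
}

CONTINUATION = bytes.fromhex("""
    00 08 00 40 00 00 00 00
    80 00 00 14 36 d5 68 3d
    87 42 dd 11 a3 a8 f2 88
    0a ae 84 b3 00 00 00 00
    80 03 00 04 00 00 00 00
    80 06 00 10 00 00 00 00
    00 01 00 08 69 00 00 00
    80 08 00 08 00 00 00 04
""")


def walk_props(buf: bytes):
    """Return (type, body) pairs; bodies exclude the 4-byte type/length header."""
    props, off = [], 0
    while off < len(buf):
        ptype, plen = struct.unpack_from("!HH", buf, off)
        if plen < 4 or off + plen > len(buf):
            raise ValueError(f"bad property length {plen} at offset {off}")
        props.append((ptype, buf[off + 4:off + plen]))
        off += (plen + 7) // 8 * 8  # assumed 8-byte alignment
    return props


# The outer property is NXPINT_CONTINUATION; its body starts with pad[4].
(outer_type, outer_body), = walk_props(CONTINUATION)
nested = walk_props(outer_body[4:])

for ptype, body in nested:
    print(NXCPT_NAMES.get(ptype, hex(ptype)), body.hex(" "))

# NXCPT_ACTIONS body is pad[4] followed by the encoded actions.
actions = dict(nested)[0x8006][4:]
odp_port = struct.unpack("!I", dict(nested)[0x8008])[0]
```

Here `actions` holds the eight bytes 00 01 00 08 69 00 00 00 discussed above, and `odp_port` is 4.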