p4lang / behavioral-model

The reference P4 software switch

thrift.transport.TTransport.TTransportException: TSocket read 0 bytes #433

Closed chenxiang2019 closed 7 years ago

chenxiang2019 commented 7 years ago

Hi all.

My environment: Ubuntu 14.04 64-bit, with the latest versions of bmv2 and p4c-bm installed. I use the simple_switch target for my program.

I wrote a simple P4-14 program today and want to enable ARP broadcast in it. The program has some problems and does not work correctly, so I added a counter to check whether ARP packets are being processed.

I handle incoming ARP packets by broadcasting them.

P4-14 program (skeleton):

#include "includes/header.p4"
#include "includes/parser.p4"

// debug counter

counter debug {
    type : packets;
    static : arp;
    instance_count : 10;
}

// actions

action _drop() {
    drop();
}

action _nop() {
}

action ipv4_forward(macAddr, port) {
    modify_field(standard_metadata.egress_spec, port);
    modify_field(ethernet.srcAddr, ethernet.dstAddr);
    modify_field(ethernet.dstAddr, macAddr);
    modify_field(ipv4.ttl, ipv4.ttl-1);
}

action broadcast() {
    modify_field(intrinsic_metadata.mcast_grp, 1);
    count(debug, 0);
}

// table used to forward arp packets

table arp {
    actions {_nop; broadcast;}
}

// table used to match tcp flows

table tcpMatch {
    reads {
        ipv4.srcAddr : exact;
        ipv4.dstAddr : exact;
        tcp.srcPort  : exact;
        tcp.dstPort  : exact;    
    }
    actions {
        _nop; _drop; 
        ipv4_forward;
    }
}

// table used to match udp flows

table udpMatch {
    reads {
        ipv4.srcAddr : exact;
        ipv4.dstAddr : exact;
        udp.srcPort  : exact;
        udp.dstPort  : exact;    
    }
    actions {
        _nop; _drop; 
        ipv4_forward;
    }
}

// control flow

control ingress {
    if (ethernet.etherType == 0x0806) {
        apply(arp);
    }

    if (ipv4.protocol == IP_PROTOCOLS_TCP) {
        apply(tcpMatch);
    } else if (ipv4.protocol == IP_PROTOCOLS_UDP) {
        apply(udpMatch);
    }
}

control egress {
}

The applied control rules:

table_set_default arp broadcast
table_set_default tcpMatch _nop
table_set_default udpMatch _nop

Then I start Mininet to simulate a topology with 4 hosts and 1 switch. I start a ping between h1 and h2 and use simple_switch_CLI to check the number of ARP packets processed.

$ ./simple_switch_CLI --thrift-port 22222

RuntimeCmd: counter_read debug 0

And it generates this exception:

Traceback (most recent call last):
  File "./simple_switch_CLI", line 31, in <module>
    sswitch_CLI.main()
  File "/usr/local/lib/python2.7/dist-packages/sswitch_CLI.py", line 93, in main
    SimpleSwitchAPI(args.pre, standard_client, mc_client, sswitch_client).cmdloop()
  File "/usr/lib/python2.7/cmd.py", line 142, in cmdloop
    stop = self.onecmd(line)
  File "/usr/lib/python2.7/cmd.py", line 221, in onecmd
    return func(arg)
  File "/usr/local/lib/python2.7/dist-packages/runtime_CLI.py", line 585, in handle
    return f(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/runtime_CLI.py", line 1808, in do_counter_read
    value = self.client.bm_counter_read(0, counter_name, index)
  File "/usr/local/lib/python2.7/dist-packages/bm_runtime/standard/Standard.py", line 1881, in bm_counter_read
    return self.recv_bm_counter_read()
  File "/usr/local/lib/python2.7/dist-packages/bm_runtime/standard/Standard.py", line 1895, in recv_bm_counter_read
    (fname, mtype, rseqid) = iprot.readMessageBegin()
  File "build/bdist.linux-x86_64/egg/thrift/protocol/TProtocolDecorator.py", line 32, in <lambda>
  File "build/bdist.linux-x86_64/egg/thrift/protocol/TProtocolDecorator.py", line 39, in _wrap
  File "build/bdist.linux-x86_64/egg/thrift/protocol/TBinaryProtocol.py", line 126, in readMessageBegin
  File "build/bdist.linux-x86_64/egg/thrift/protocol/TBinaryProtocol.py", line 206, in readI32
  File "build/bdist.linux-x86_64/egg/thrift/transport/TTransport.py", line 58, in readAll
  File "build/bdist.linux-x86_64/egg/thrift/transport/TTransport.py", line 159, in read
  File "build/bdist.linux-x86_64/egg/thrift/transport/TSocket.py", line 120, in read
thrift.transport.TTransport.TTransportException: TSocket read 0 bytes

I have no idea what causes this. Could you help me with it? Thank you very much!

antoninbas commented 7 years ago

Please provide the entire P4 program, so I can try to reproduce the issue.

chenxiang2019 commented 7 years ago

@antoninbas Hi, Antonin. I have published my entire P4 program here, together with the full steps I used to reproduce this problem. Thanks for your help!

chenxiang2019 commented 7 years ago

BTW, I am trying to use table_set_timeout to control the switch behaviour at runtime, and I noticed that the minimum timeout it can set is 1 ms. I have these questions:

  1. Is there a way to set a more precise timeout interval (e.g. 1 ns)?
  2. I checked the table information and found that the timed-out entries have not been deleted. This puzzles me: as I understand it, these entries should no longer take effect and should be deleted as soon as possible.
  3. Is the timeout function only supported in bmv2? In other words, if I had a physical P4 switch, would it support timeout?

Thank you very much!

antoninbas commented 7 years ago

Regarding your error: the switch is crashing because intrinsic_metadata.egress_rid is not defined, and this field is needed for multicast. Please see https://github.com/p4lang/behavioral-model/blob/master/docs/simple_switch.md#intrinsic_metadata-header. I recommend that you define all intrinsic metadata fields to avoid such issues. Note that if you want to broadcast a packet, you will need to configure the multicast group properly using the runtime CLI.
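
As a rough illustration of the multicast group configuration mentioned above, the following Python sketch only builds the runtime CLI lines; it does not talk to the switch. The helper name `broadcast_group_commands` is hypothetical. The port numbers 1 to 4 (one per Mininet host), the replication id 0, and the node handle 0 (assumed to be the handle that `mc_node_create` returns for the first node) are all assumptions to check against your own setup. Group id 1 matches the value written to intrinsic_metadata.mcast_grp in the broadcast action.

```python
def broadcast_group_commands(group_id, ports, rid=0, node_handle=0):
    """Build runtime CLI lines for one multicast group covering `ports`.

    Assumption: `node_handle` is the handle printed by mc_node_create;
    0 is only a guess for the first node created on a fresh switch.
    """
    port_list = " ".join(str(p) for p in ports)
    return [
        "mc_mgrp_create %d" % group_id,                       # create the group
        "mc_node_create %d %s" % (rid, port_list),            # node with rid and ports
        "mc_node_associate %d %d" % (group_id, node_handle),  # attach node to group
    ]


if __name__ == "__main__":
    # Assumed topology: 4 hosts attached to switch ports 1..4.
    for line in broadcast_group_commands(1, [1, 2, 3, 4]):
        print(line)
```

The printed lines could then be typed into simple_switch_CLI or placed in a commands file. Note that a group covering every port also replicates the packet back to the port it arrived on, so the egress pipeline may need to drop that copy.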

antoninbas commented 7 years ago

Regarding your follow-up questions:

  1. Not on bmv2; it wouldn't mean much on this software switch anyway.
  2. Entry timeout just means that a notification will be sent if the entry hasn't been hit in a while; it doesn't mean that the entry will be automatically deleted. In the P4 world, the dataplane doesn't update match-action tables (add entry / delete entry) on its own. It needs the intervention of the control-plane. This thread talks about bmv2 notifications: https://github.com/p4lang/behavioral-model/issues/211.
  3. Hardware targets that claim to support P4_14 should indeed support timeout. This is the case for the Barefoot Networks' Tofino switch. However, note that the interface that you use to configure entry timeout (runtime CLI) and receive notifications (nanomsg) is bmv2-specific in this case.
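
To make point 2 concrete, here is a toy Python model of the division of labour described above. It is not bmv2 code: the class `ToyTable` and its methods are hypothetical, timestamps are plain integers in milliseconds, and the notification is modelled as a returned list rather than a nanomsg message. The point it tries to show is that the idle check only reports entries, and removal happens only when the control plane explicitly asks for it (in bmv2 that would be something like a table_delete issued from the runtime CLI).

```python
class ToyTable(object):
    """Toy match-action table with per-entry idle timeouts (hypothetical model)."""

    def __init__(self):
        # handle -> {"timeout_ms": int, "last_hit_ms": int}
        self.entries = {}

    def add(self, handle, timeout_ms, now_ms):
        self.entries[handle] = {"timeout_ms": timeout_ms, "last_hit_ms": now_ms}

    def hit(self, handle, now_ms):
        # A packet matched this entry, so its idle timer restarts.
        self.entries[handle]["last_hit_ms"] = now_ms

    def idle_handles(self, now_ms):
        """Data-plane side: report idle entries, but never remove them."""
        return sorted(h for h, e in self.entries.items()
                      if now_ms - e["last_hit_ms"] >= e["timeout_ms"])

    def delete(self, handle):
        """Control-plane side: explicit removal of one entry."""
        del self.entries[handle]


table = ToyTable()
table.add(handle=0, timeout_ms=1000, now_ms=0)
table.add(handle=1, timeout_ms=1000, now_ms=0)
table.hit(1, now_ms=900)                    # entry 1 was used recently

notified = table.idle_handles(now_ms=1500)  # only entry 0 has been idle long enough
still_present = sorted(table.entries)       # both entries are still installed

for handle in notified:                     # the control plane reacts to the notification
    table.delete(handle)
remaining = sorted(table.entries)
```

In this model, `still_present` contains both handles even though entry 0 has timed out, which mirrors what the table dump showed; only after the explicit delete does `remaining` shrink.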

chenxiang2019 commented 7 years ago

@antoninbas Thank you! Your answers perfectly resolve my problems.