google / mobly-snippet-lib

Mobly Snippet Lib is a library for triggering device-side code from host-side Mobly tests.

Rewrite RPC stack to be more reliable #77

Open kdart opened 7 years ago

kdart commented 7 years ago

I'm seeing too many of these:

Details: No response from server.
Stack trace:
Traceback (most recent call last):
  File "mobly/base_test.py", line 356, in exec_one_test
    test_method()
  ...
  File "mobly/controllers/android_device_lib/jsonrpc_client_base.py", line 285, in rpc_call
    return self._rpc(name, *args)
  File "mobly/controllers/android_device_lib/jsonrpc_client_base.py", line 265, in _rpc
    raise ProtocolError(ProtocolError.NO_RESPONSE_FROM_SERVER)
ProtocolError: No response from server.

JSON-RPC is not a good protocol. Something better, probably making use of protobuf, would be a good start. The Python-side implementation also needs improvement, since it doesn't handle partial reads when there is too long a delay between message fragments.
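For illustration, a receive loop that tolerates a reply arriving in delayed fragments could look roughly like the sketch below. The names (read_response, MessageTimeout) and the newline framing are assumptions for this example, not the actual jsonrpc_client_base code; the point is that the deadline applies to the whole message rather than to each individual read.

import socket
import time


class MessageTimeout(Exception):
    """Raised when a complete message does not arrive before the deadline."""


def read_response(sock, deadline_s=30.0, terminator=b'\n'):
    """Reads from sock until terminator, tolerating arbitrary chunking and delays."""
    buf = bytearray()
    deadline = time.monotonic() + deadline_s
    while terminator not in buf:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise MessageTimeout('No complete response within %.1fs' % deadline_s)
        sock.settimeout(remaining)
        try:
            chunk = sock.recv(4096)
        except socket.timeout:
            raise MessageTimeout('No complete response within %.1fs' % deadline_s)
        if not chunk:
            raise ConnectionError('Socket closed by peer mid-message')
        buf.extend(chunk)
    line, _, _ = bytes(buf).partition(terminator)
    return line.decode('utf-8')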

kdart commented 7 years ago

I'm quite sure the reliability problems are not due to the adb proxy and transport layer.

Two simultaneous UDP iperf sessions didn't show any errors.

⇒ iperf -c localhost -p 7654 -d -i 10 -t 60 -u -b 16M
------------------------------------------------------------
Server listening on UDP port 7654
Receiving 1470 byte datagrams
UDP buffer size:  208 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to localhost, UDP port 7654
Sending 1470 byte datagrams
UDP buffer size:  208 KByte (default)
------------------------------------------------------------
[  5] local 127.0.0.1 port 40791 connected with 127.0.0.1 port 7654
[  4] local 127.0.0.1 port 7654 connected with 127.0.0.1 port 40791
[ ID] Interval       Transfer     Bandwidth
[  5]  0.0-10.0 sec  19.1 MBytes  16.0 Mbits/sec
[  4]  0.0-10.0 sec  19.1 MBytes  16.0 Mbits/sec   0.018 ms    0/13605 (0%)
[  5] 10.0-20.0 sec  19.1 MBytes  16.0 Mbits/sec
[  4] 10.0-20.0 sec  19.1 MBytes  16.0 Mbits/sec   0.008 ms    0/13606 (0%)
[  5] 20.0-30.0 sec  19.1 MBytes  16.0 Mbits/sec
[  4] 20.0-30.0 sec  19.1 MBytes  16.0 Mbits/sec   0.011 ms    0/13605 (0%)
[  5] 30.0-40.0 sec  19.1 MBytes  16.0 Mbits/sec
[  4] 30.0-40.0 sec  19.1 MBytes  16.0 Mbits/sec   0.013 ms    0/13605 (0%)
[  5] 40.0-50.0 sec  19.1 MBytes  16.0 Mbits/sec
[  4] 40.0-50.0 sec  19.1 MBytes  16.0 Mbits/sec   0.016 ms    0/13606 (0%)
[  5] 50.0-60.0 sec  19.1 MBytes  16.0 Mbits/sec
[  5]  0.0-60.0 sec   114 MBytes  16.0 Mbits/sec
[  5] Sent 81634 datagrams
[  4] 50.0-60.0 sec  19.1 MBytes  16.0 Mbits/sec   0.013 ms    0/13605 (0%)
[  4]  0.0-60.0 sec   114 MBytes  16.0 Mbits/sec   0.015 ms    0/81633 (0%)
[  4]  0.0-60.0 sec  1 datagrams received out-of-order
[  5] Server Report:
[  5]  0.0-60.0 sec   114 MBytes  16.0 Mbits/sec   0.014 ms    0/81633 (0%)
[  5]  0.0-60.0 sec  1 datagrams received out-of-order

Other terminal:

⇒ iperf -c localhost -p 7654 -d -i 10 -t 60 -u -b 16M 
------------------------------------------------------------
Server listening on UDP port 7654
Receiving 1470 byte datagrams
UDP buffer size:  208 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to localhost, UDP port 7654
Sending 1470 byte datagrams
UDP buffer size:  208 KByte (default)
------------------------------------------------------------
[  4] local 127.0.0.1 port 41052 connected with 127.0.0.1 port 7654
[  3] local 127.0.0.1 port 7654 connected with 127.0.0.1 port 41052
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  19.1 MBytes  16.0 Mbits/sec
[  3]  0.0-10.0 sec  19.1 MBytes  16.0 Mbits/sec   0.011 ms    0/13605 (0%)
[  4] 10.0-20.0 sec  19.1 MBytes  16.0 Mbits/sec
[  3] 10.0-20.0 sec  19.1 MBytes  16.0 Mbits/sec   0.012 ms    0/13605 (0%)
[  4] 20.0-30.0 sec  19.1 MBytes  16.0 Mbits/sec
[  3] 20.0-30.0 sec  19.1 MBytes  16.0 Mbits/sec   0.020 ms    0/13606 (0%)
[  4] 30.0-40.0 sec  19.1 MBytes  16.0 Mbits/sec
[  3] 30.0-40.0 sec  19.1 MBytes  16.0 Mbits/sec   0.003 ms    0/13605 (0%)
[  4] 40.0-50.0 sec  19.1 MBytes  16.0 Mbits/sec
[  3] 40.0-50.0 sec  19.1 MBytes  16.0 Mbits/sec   0.011 ms    0/13606 (0%)
[  4] 50.0-60.0 sec  19.1 MBytes  16.0 Mbits/sec
[  4]  0.0-60.0 sec   114 MBytes  16.0 Mbits/sec
[  4] Sent 81634 datagrams
[  3] 50.0-60.0 sec  19.1 MBytes  16.0 Mbits/sec   0.029 ms    0/13605 (0%)
[  3]  0.0-60.0 sec   114 MBytes  16.0 Mbits/sec   0.028 ms    0/81633 (0%)
[  3]  0.0-60.0 sec  1 datagrams received out-of-order
[  4] Server Report:
[  4]  0.0-60.0 sec   114 MBytes  16.0 Mbits/sec   0.027 ms    0/81633 (0%)
[  4]  0.0-60.0 sec  1 datagrams received out-of-order

xpconanfan commented 7 years ago

Two things:

  1. Use TCP.
  2. Does iperf reopen the socket on every send?
kdart commented 7 years ago

  1. I did use TCP as well, but didn't show it here. TCP retransmission masks dropped-packet problems, so UDP is the better choice for this test.

  2. No, I'm pretty sure it reuses the same socket for this test.

xpconanfan commented 7 years ago

Oh, you ran two simple parallel iperf streams... That's not an accurate simulation of what the snippet does...
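For contrast, the snippet traffic pattern is closer to small newline-terminated JSON requests and short replies over a persistent adb-forwarded TCP socket, with idle gaps between calls. A rough approximation of that pattern (the helper name and port handling are illustrative, not the real client API):

import json
import socket


def run_rpc_pattern(host_port, calls):
    """Sends small JSON-RPC-style requests and waits for each short reply."""
    sock = socket.create_connection(('localhost', host_port))
    reader = sock.makefile('rb')
    try:
        for uid, (method, args) in enumerate(calls):
            request = json.dumps({'id': uid, 'method': method, 'params': list(args)})
            sock.sendall(request.encode('utf-8') + b'\n')
            # One short reply per request; the connection then sits idle
            # until the next call, unlike iperf's continuous bulk stream.
            print(json.loads(reader.readline().decode('utf-8')))
    finally:
        reader.close()
        sock.close()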

kdart commented 7 years ago

Exactly. ;-) This test mostly verifies basic channel reliability, eliminating that as a possible cause.

xpconanfan commented 7 years ago

Running in parallel briefly is not a problem; otherwise your tests would never have worked at all. The issue is that they are flaky.

Also, you are using the same port. We use two separate adb forwarding ports...
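For reference, something along these lines reproduces that layout: two independent host-side forwards, one socket per forward. The serial and port numbers are illustrative (they mirror the FA71A0303542 entries in the forward list below), and the helper is hypothetical, not the snippet client's actual setup code.

import socket
import subprocess

DEVICE_SERIAL = 'FA71A0303542'  # illustrative; mirrors the forward list below
DEVICE_PORT = 8080              # assumed device-side snippet server port


def forward_and_connect(host_port):
    """Sets up one adb forward and opens one TCP connection through it."""
    subprocess.check_call([
        'adb', '-s', DEVICE_SERIAL, 'forward',
        'tcp:%d' % host_port, 'tcp:%d' % DEVICE_PORT])
    return socket.create_connection(('localhost', host_port))


# Two separate forwards, two separate sockets, mimicking the snippet client.
conn_a = forward_and_connect(18998)
conn_b = forward_and_connect(20271)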

kdart commented 7 years ago

OK, ten simultaneous streams over ten different ports (ten forwards) for 30 minutes show no performance problems.

Device prep:

# for port in $(seq 7650 7659) ; do iperf3 -4 -s -p $port -D ; done

Forwards:

⇒ adb forward --list
FA71A0303542 tcp:18998 tcp:8080
FA71A0303542 tcp:20271 tcp:8080
HT73Y0201123 tcp:7654 tcp:7654
HT73Y0201123 tcp:7650 tcp:7650
HT73Y0201123 tcp:7651 tcp:7651
HT73Y0201123 tcp:7652 tcp:7652
HT73Y0201123 tcp:7653 tcp:7653
HT73Y0201123 tcp:7655 tcp:7655
HT73Y0201123 tcp:7656 tcp:7656
HT73Y0201123 tcp:7657 tcp:7657
HT73Y0201123 tcp:7658 tcp:7658
HT73Y0201123 tcp:7659 tcp:7659

Host side (client):

$ for port in $(seq 7650 7659) ; do (iperf3 -4 -c localhost -p $port --logfile /tmp/iperf3-c-${port}-l -t 1800) & done

Summary:

⇒ grep sender /tmp/iperf3-c-765?-l
/tmp/iperf3-c-7650-l:[  6]   0.00-1800.00 sec  7.26 GBytes  34.6 Mbits/sec    1             sender
/tmp/iperf3-c-7651-l:[  6]   0.00-1800.00 sec  7.26 GBytes  34.6 Mbits/sec    1             sender
/tmp/iperf3-c-7652-l:[  6]   0.00-1800.00 sec  7.26 GBytes  34.6 Mbits/sec    1             sender
/tmp/iperf3-c-7653-l:[  6]   0.00-1800.00 sec  7.26 GBytes  34.6 Mbits/sec    0             sender
/tmp/iperf3-c-7654-l:[  6]   0.00-1800.00 sec  7.26 GBytes  34.6 Mbits/sec    1             sender
/tmp/iperf3-c-7655-l:[  6]   0.00-1800.00 sec  7.26 GBytes  34.6 Mbits/sec    0             sender
/tmp/iperf3-c-7656-l:[  6]   0.00-1800.00 sec  7.26 GBytes  34.6 Mbits/sec    2             sender
/tmp/iperf3-c-7657-l:[  6]   0.00-1800.00 sec  7.26 GBytes  34.7 Mbits/sec    1             sender
/tmp/iperf3-c-7658-l:[  6]   0.00-1800.00 sec  7.26 GBytes  34.6 Mbits/sec    1             sender
/tmp/iperf3-c-7659-l:[  6]   0.00-1800.00 sec  7.26 GBytes  34.6 Mbits/sec    0             sender

Worst-case TCP retransmission was 2 packets.

So it looks to me like the adb transport performs well and is not the problem, especially given that snippet traffic is usually quite small.

Those retransmissions cause a slight delay in message reception, and it's the host-side client implementation that can't tolerate that delay.
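Put differently, a read pattern like the sketch below (hypothetical, not the actual jsonrpc_client_base code) turns a small retransmission delay into a "No response from server" failure: one recv with a fixed timeout, expecting the whole reply in a single chunk. A deadline-based loop like the earlier sketch avoids both failure modes.

import socket


def fragile_rpc_read(sock, timeout_s=10):
    """Fragile pattern: one recv with a fixed timeout, whole reply expected at once."""
    sock.settimeout(timeout_s)
    try:
        # If a retransmission delays the reply past timeout_s, this raises
        # and the call gets reported as "No response from server".
        data = sock.recv(4096)
    except socket.timeout:
        raise RuntimeError('No response from server.')
    if not data:
        raise RuntimeError('Server closed the connection.')
    # If the reply was fragmented, only the first chunk is returned here and
    # the JSON decode downstream fails instead.
    return data.decode('utf-8')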

xpconanfan commented 7 years ago

Cool. Good to know! Now let's see how we can fix our stuff! :D