Closed wilsonNg007 closed 2 years ago
@simonxiaoss , @santoshx - any idea?
I tried epoll option (-e) and did not specify particular port (no -p, so just use the default), and still seeing the same error/symptom.
As another experiment, I tried no-sync option (-N), which effectively eliminates that sync connection socket from the setup. However, the receiver side does not show any real-time throughput; only the sender shows the real-time throughput. When I exit ntttcp, the receiver does not provide any summary output at the end; only the sender side provides summary output at the end. Using console log (-o) and xml file (-x) options did not log anything to the files either. So that means we cannot tell for sure if the receiver did receive the packets as well as other important metrics like throughput rate, cycles/bytes, etc.
@LiliDeng , could you pls help take a look at my issue? I want to understand what the errors means and the impact, and what causes the error (if it's due to a potential tool's defect or if it's due to the VM network), and if there is any temporary workaround to avoid. Thanks.
You mentioned: ""error: cannot read data from socket: 3", and also reports 0 Mbps real-time tput. But the sender still keeps sending UDP traffic" Simon: It is a simple syscall "recvfrom" returned 0 bytes; so ntttcp was surprised that it did not receive any data from the socket then report this error message. Yes you also saw it from "0 Mbps".
You mentioned: "It appears the receiver host Ethernet (Task manager on the right side of attached screenshot) is still continuously receiving udp packets," Simon: Windows Task manager should get the packet numbers from the network driver. There are other layers between socket and driver. You might need to chase down to see where does the packets are dropped.
Thanks @simonxiaoss
UDP is an unreliable protocol, so I'm wondering if this is an expected behavior or not?
I noticed on the receiver size, the real-time tput rate fluctuates more frequently on a wide range over time (as opposed to the sender).
I did another experiment which reduced the transfer pkt size to 512B (from previous 1KB), and still the receiver periodically reports the same errors.
I queried the device file on the receiver (attached output below) and it doesn't have any dropped or error packets (same for the sender side). So it appears VM kernel didn't drop UDP packets. Looks like the Host is the suspect.
windows_x86_64:/data/local/tmp/test $ cat /proc/net/dev Inter-|Receive |Transmit face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed lo: 5813 55 0 0 0 0 0 0 5813 55 0 0 0 0 0 0 bond0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 dummy0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 tunl0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 gre0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 gretap0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 erspan0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ip_vti0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ip6_vti0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 sit0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ip6tnl0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ip6gre0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 eth0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 eth1: 164351009144 232460617 0 0 0 0 0 20794 1132368 2537 0 0 0 0 0 0 bond_wifi: 161110334364 232460953 0 0 0 0 0 20974 1132368 2537 0 0 0 0 0 0 wlan0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 eth2: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 eth3: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 eth4: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 eth5: 31132 336 0 0 0 0 0 180 0 0 0 0 0 0 0 0 eth6: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
This is root-caused to an issue on the host/vm side.
I'm using nttttcp to send UDP traffic from one VM to another (remote host) VM. I'm keeping the transfer size to 1024B which is below 1 MTU size (1280B) to avoid any fragmentation. Below are the cmd lines setup. However, after a while, the receiver periodically reports "error: cannot read data from socket: 3", and also reports 0 Mbps real-time tput. But the sender still keeps sending UDP traffic (see attached screenshot).
There are no other applications running inside both VM except ntttcp processes. The UDP packets are transferred between port 8011 on both ends.
Questions: So what are port 60558 (socket 3) on sender and port 8010 (socket 6) on receiver used for in terms of what data it's trying to communicate?
Is the "0 Mbps" real-time tput reported on the receiver side bogus? It appears the receiver host Ethernet (Task manager on the right side of attached screenshot) is still continuously receiving udp packets, which will get steered into the VM. How can I debug this to find out what's wrong with ntttcp?
Thanks.
Ntttcp Receiver
windows_x86_64:/data/local/tmp/test $ ./ntttcp -r -m 1,*,10.231.111.24 -V -p 8011 -t 86400 -u -b 1024 NTTTCP for Linux 1.4.0
** receiver role ports: 1 cpu affinity: server address: 10.231.111.24 domain: IPv4 protocol: UDP server port starting at: 8011 receiver socket buffer (bytes): 1024 test warm-up (sec): no test duration (sec): 86400 test cool-down (sec): no show system tcp retransmit: no quiet mode: disabled verbose mode: enabled
23:49:48 DBG : user limits for maximum number of open files: soft: 32768; hard: 32768 23:49:48 DBG : Interface:[lo] Address: 127.0.0.1 23:49:48 DBG : Interface:[wlan0] Address: 10.231.111.24 23:49:48 INFO: 2 threads created 23:49:48 DBG : ntttcp server is listening on 10.231.111.24:8010 23:49:54 DBG : Sync connection: 10.231.111.166:60558 --> local:8010 [socket 6] 23:49:54 INFO: Network activity progressing...
Ntttcp Sender
1|windows_x86_64:/data/local/tmp/test $ ./ntttcp -s -m 1,*,10.231.111.24 -V -p 8011 -t 86400 -u -n 1 -b 1024 NTTTCP for Linux 1.4.0
** sender role connections: 1 X 1 X 1 cpu affinity: server address: 10.231.111.24 domain: IPv4 protocol: UDP server port starting at: 8011 sender socket buffer (bytes): 1024 test warm-up (sec): no test duration (sec): 86400 test cool-down (sec): no show system tcp retransmit: no quiet mode: disabled verbose mode: enabled
23:49:52 DBG : user limits for maximum number of open files: soft: 32768; hard: 32768 23:49:52 DBG : Sync connection: local:60558 [socket:3] --> 10.231.111.24:8010 23:49:52 INFO: 1 threads created 23:49:52 DBG : Running UDP stream: local:0 [socket:4] --> 10.231.111.24:8011 23:49:52 INFO: 1 connections created in 2086 microseconds 23:49:52 INFO: Network activity progressing...