Open gfr10598 opened 7 years ago
#standardSQL
SELECT
  count(test_id) AS cnt,
  connection_spec.client_af,
  web100_log_entry.connection_spec.local_af,
  web100_log_entry.snap.LocalAddressType,
  length(NET.SAFE_IP_FROM_STRING(web100_log_entry.snap.RemAddress)) AS remAddrLen,
  length(NET.SAFE_IP_FROM_STRING(connection_spec.client_ip)) AS clientIpLen,
  anomalies.no_meta
FROM mlab-sandbox.mlab_sandbox.ndt
WHERE _partitiontime >= timestamp('2017-10-01')
  AND parse_time >= timestamp('2017-10-01 00:00:00 UTC')
GROUP BY
  web100_log_entry.snap.LocalAddressType,
  connection_spec.client_af,
  web100_log_entry.connection_spec.local_af,
  remAddrLen,
  clientIpLen,
  no_meta
ORDER BY
  web100_log_entry.snap.LocalAddressType,
  connection_spec.client_af,
  web100_log_entry.connection_spec.local_af,
  remAddrLen,
  clientIpLen,
  no_meta
Matt points out that this likely means that the control connection is one AF, and the test connection is a different AF, so this is not inconsistent. But it IS interesting.
While investigating a different issue I noticed some rows with inconsistent connection_spec_client_af and connection_spec_server_af. Selected fields from one such row:

| # | Field | Value |
| --- | --- | --- |
| 1 | web100_log_entry_connection_spec_local_af | 0 |
| 2 | web100_log_entry_connection_spec_local_ip | 2001:5a0:3801::11 |
| 4 | web100_log_entry_connection_spec_remote_ip | 2601:586:201:294:9d8c:e16c:c75c:285a |
| 38 | web100_log_entry_snap_LocalAddress | 2001:5a0:3801::11 |
| 39 | web100_log_entry_snap_LocalAddressType | 2 |
| 73 | web100_log_entry_snap_RemAddress | 2601:586:201:294:9d8c:e16c:c75c:285a |
| 120 | test_id | 2017/08/19/20170819T01:46:52.221342000Z_2601:586:201:294:9d8c:e16c:c75c:285a:54362.c2s_snaplog.gz |
| 129 | connection_spec_client_af | 2 |
| 132 | connection_spec_client_hostname | c-73-179-194-213.hsd1.fl.comcast.net |
| 133 | connection_spec_client_ip | 73.179.194.213 |
| 138 | connection_spec_server_af | 0 |
| 139 | connection_spec_server_hostname | mlab1.mia03.measurement-lab.org |
| 140 | connection_spec_server_ip | 2001:5a0:3801::11 |
Unfortunately, DNS does not currently have mappings for the name or addresses. It appears that NDT somehow captured the hostname, did a lookup, and then chose the IPv4 address rather than the IPv6 address. This could be due to using different protocols for the measurement and control connections, or merely to back-to-back reverse and forward DNS lookups. Note that all af, Type, and length fields should agree, because you cannot use different protocols for the client and server within a single connection. (But the fields use different encodings.)
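To make the mismatch concrete, here is a minimal Python sketch that compares address families by packed length (4 bytes for IPv4, 16 for IPv6), using only the IP strings from the sample row above. The helper name `packed_len` and the `row` dict are made up for illustration. The sketch deliberately ignores the af and Type fields, since their numeric encodings differ.

```python
import ipaddress
from typing import Optional


def packed_len(addr: str) -> Optional[int]:
    """Return 4 for IPv4, 16 for IPv6, or None if addr does not parse."""
    try:
        return len(ipaddress.ip_address(addr).packed)
    except ValueError:
        return None


# IP string fields copied from the sample row above.
row = {
    "connection_spec_client_ip": "73.179.194.213",
    "connection_spec_server_ip": "2001:5a0:3801::11",
    "web100_log_entry_snap_RemAddress": "2601:586:201:294:9d8c:e16c:c75c:285a",
    "web100_log_entry_snap_LocalAddress": "2001:5a0:3801::11",
}

lengths = {field: packed_len(value) for field, value in row.items()}

# Client side: connection_spec value vs. the Web100 snapshot value.
client_mismatch = (
    lengths["connection_spec_client_ip"]
    != lengths["web100_log_entry_snap_RemAddress"]
)
# Server side: same comparison for the local address.
server_mismatch = (
    lengths["connection_spec_server_ip"]
    != lengths["web100_log_entry_snap_LocalAddress"]
)

print(lengths)
print("client mismatch:", client_mismatch)
print("server mismatch:", server_mismatch)
```

For this row, the client-side fields disagree (IPv4 in connection_spec, IPv6 in the snapshot), while the server-side fields are both IPv6.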
In data from the first two weeks of October 2017, we have the following inconsistent IPv4/IPv6 info:
Rows 4 and 6 have conflicting address types in the client_ip and RemAddress fields. One generally comes from the .meta file, and the other from Web100 data.