m-lab / etl

M-Lab ingestion pipeline
Apache License 2.0
22 stars 7 forks source link

NDT5 download a.minRTT is suspect #1094

Open mattmathis opened 2 years ago

mattmathis commented 2 years ago

Dave Clark reported that a.minRTT does not agree with other sources of RTT data. In particular _internal202205.raw.S2C.TCPInfo.MinRTT is often smaller.

# MinRTT test query

SELECT 
  a.MinRTT,
  _internal202205.raw.S2C.TCPInfo.MinRTT / 1000.0 AS TiMinRTT,
FROM `mlab-collaboration.mm_preproduction.extended_ndt5_downloads`
WHERE date = '2022-06-1'
LIMIT 1000

I did not check other tables.

Note that https://github.com/m-lab/etl/issues/945 applies

stephen-soltesz commented 2 years ago

There are indeed multiple RTT values. The second one below is what's being used in the a.MinRTT calculation: https://github.com/m-lab/etl/blob/master/parser/ndt5_result.go#L171

I'm puzzled why there is no fraction from these..

SELECT 
  a.MinRTT,
  _internal202205.raw.S2C.TCPInfo.MinRTT / 1000.0 AS TiMinRTT,
  _internal202205.raw.S2C.MinRTT / 1000.0 / 1000.0 AS TiMinRTT2,

FROM `mlab-collaboration.mm_preproduction.extended_ndt5_downloads`
WHERE date = '2022-06-01'
LIMIT 1000
stephen-soltesz commented 2 years ago

Looks like raw.S2C.MinRTT is the synthetic "minrtt" calculated and sent to the client to emulate the original web100 minrtt behavior...

https://github.com/m-lab/ndt-server/blob/3fc9f3d218c98e919798f58738734eff67b4aed3/ndt5/web100/web100_linux.go#L20-L35

stephen-soltesz commented 2 years ago

Hmm, was S2C.TCPInfo.MinRTT always there? it may not have been in earlier versions of ndt5...

stephen-soltesz commented 2 years ago

So, rather than using the synthetic minrtt unconditionally, we could:

Or, only use TCPInfo when available, and mark a.MinRTT undefined otherwise until it can be fixed in the join.

mattmathis commented 2 years ago

The kernel has similar logic: if BBR is present the kernel reports its value for MirRTT.

stephen-soltesz commented 2 years ago

This morning we agreed that:

stephen-soltesz commented 7 months ago

The summary MinRTT was calculated incorrectly from microseconds to seconds (rather than milliseconds). This is likely due to confusion from the mixture of sources and types in the NDT5 result structure, including:

This bug affects all parsed data from June 18 2020 to present until the fix is deployed and the historical data reprocessed.

SELECT
  a.MinRTT,
  raw.S2C.MinRTT,
  raw.S2C.TCPInfo.MinRTT
FROM `mlab-sandbox.ndt.ndt5`
WHERE date = "2023-11-20" and raw.S2C is not null
LIMIT 1000
Row MinRTT (s not ms) MinRTT_1 (ns) MinRTT_2 (us)
1 0.041616 41000000 41617
2 0.041675 41000000 41675
3 0.041571 41000000 41571
4 0.041344 41000000 41344
5 0.041371 41000000 41371
6 ...
stephen-soltesz commented 7 months ago

After https://github.com/m-lab/etl/pull/1129 is merged: