network-quality / goresponsiveness

A draft-ietf-ippm-responsiveness client in Go.
GNU General Public License v2.0

Question regarding "Quality Attenuation Statistics" #55

Closed moeller0 closed 1 year ago

moeller0 commented 1 year ago
goresponsiveness: quality-attenuation
07-01-2023 09:21:11 UTC Go Responsiveness to mensura.cdn-apple.com:443...
Quality Attenuation Statistics:
Number of losses: 0
Number of samples: 517
Loss: 0.000000
Min: 0.018014
Max: 1.317574
Mean: 0.237797 
Variance: 0.076088
Standard Deviation: 0.275840
PDV(90): 0.640972
PDV(99): 1.101985
P(90): 0.658986
P(99): 1.119999
RPM:   175 (P90)
RPM:   566 (Double-Sided 10% Trimmed Mean)
Download:  78.750 Mbps (  9.844 MBps), using 26 parallel connections.
Upload:    30.499 Mbps (  3.812 MBps), using 26 parallel connections.
Extended Statistics:
    Maximum Segment Size: 1208
    Total Bytes Retransmitted: 15080
    Retransmission Ratio: 1.05%
    Total Bytes Reordered: 161149279
    Average RTT: 39

real    0m26.547s
user    0m2.325s
sys 0m3.965s

Here the Quality Attenuation Statistics report zero losses, yet the TCP_info gives Total Bytes Retransmitted: 15080 and Retransmission Ratio: 1.05%, indicating that these two numbers might not cover the same set of samples (or that one of them is incorrect). Could this be made more explicit in the output, assuming my first interpretation is correct?

Also, it looks like the current draft still recommends a default trimmed mean of 95%, which would mean (Double-Sided 5% Trimmed Mean) if I understand the nomenclature correctly. I do think that, unlike the draft recommendation, a double-sided trimmed mean is the better approach, since only cutting off the right tail will artificially decrease the reported statistic (I accept that reaction time distributions will likely have a positive skew, so removing the lowest 5% of samples will have less effect than removing the highest 5%, but it still feels considerably cleaner to use a double-sided trimmed mean*). So my preference would actually be to change the internet draft recommendation. That said, the draft and goresponsiveness should IMHO not disagree with each other for longer periods of time, so which side should be adjusted?

*) My rule of thumb for such things is: "If I had to describe this in a paper, what kind of method section rationale would I prefer to write, one that is obviously unbiased or one where I would need to first convince myself and then the reviewer/reader the operation is free of side effects".

hawkinsw commented 1 year ago

@moeller0 Thank you, as usual, for the issue. I believe that some of this is addressed in the ietf02 branch that we are close to merging (which will bring goresponsiveness more in line with the RFC). That said, I will make sure that all your comments are addressed.

As for the attenuation statistics, I will have to rope in @bjornite because that is his area of expertise.

Thank you, again!

bjornite commented 1 year ago

Good point, thanks. The loss percentage is calculated based on the latency measurements, which in their current form do not actually measure loss.

I will submit a pull request removing the printing of loss percentage, at least until we implement a way to accurately measure it.

moeller0 commented 1 year ago

Question:
´´´
fmt.Printf(
    `Number of losses: %d
Number of samples: %d
Min: %.6f s
Max: %.6f s
Mean: %.6f s
´´´
The initial 'Number of losses:' is still reported (as zero). I do not claim I fully understand this, but I guess this should also be dropped from the report, no?

Here is the start of the output from a pull a few minutes ago (20230721 around noon central European daylight savings time):
´´´
pulling the current main branch from https://github.com/network-quality/goresponsiveness
remote: Enumerating objects: 6, done.
remote: Counting objects: 100% (6/6), done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 4 (delta 2), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (4/4), 1.27 KiB | 35.00 KiB/s, done.
From https://github.com/network-quality/goresponsiveness
4a3b5c9..af7dd5f main -> origin/main
Updating 4a3b5c9..af7dd5f
Fast-forward
networkQuality.go | 2 --
1 file changed, 2 deletions(-)
getting go dependencies?
building current version
test-run the current gorresponsiveness version
goresponsiveness: --quality-attenuation --relative-rpm
07-21-2023 10:08:40 UTC Go Responsiveness to mensura.cdn-apple.com:443...
Baseline RPM: 3109 (P90)
Baseline RPM: 3361 (Single-Sided 5% Trimmed Mean)
Download: 91.979 Mbps ( 11.497 MBps), using 9 parallel connections.
Extended Statistics:
    Maximum Path MTU: 1492
    Maximum Send MSS: 1208
    Maximum Recv MSS: 1208
    Total Retransmissions: 0
    Total Reorderings: 27
    Average RTT: 42228.444444444445

Quality Attenuation Statistics:
Number of losses: 0
Number of samples: 1001
Min: 0.098912 s
Max: 0.767254 s
Mean: 0.234958 s
Variance: 0.006878 s
Standard Deviation: 0.082936 s
PDV(90): 0.246270 s
PDV(99): 0.464447 s
P(90): 0.345182 s
P(99): 0.563360 s
RPM: 255
Gaming QoO: 0
´´´

As I mentioned 'Number of losses: 0' is still reported...

Next question, @bjornite what is 'Gaming QoO' and is it expected to be zero?

bjornite commented 1 year ago

I agree, thanks for catching that. It makes sense to remove the loss count as well.

The QoO score is calculated according to this ID: https://datatracker.ietf.org/doc/draft-olden-ippm-qoo/

TL;DR: It's a linear score between "perfect" and "useless", where both "perfect" and "useless" are defined in terms of latency percentiles. For example: if perfect p90 = 50 ms and useless p90 = 150 ms, then a p90 of 100 ms would yield a QoO score of 50 (it's on a scale of 0 - 100).

For the Gaming QoO I've used the following requirements:
p50: [30ms, 150ms]
p90: [65ms, 200ms]
p99: [100ms, 250ms]

The QoO calculation finds the linear 0-100 score between these "perfect" and "useless" values for each percentile, and then picks the worst one. Better than or equal to "perfect" gives a QoO of 100, and worse than or equal to "useless" gives a QoO score of 0.

moeller0 commented 1 year ago

OK, so the score of zero denotes 'useless' I guess, given that the reported percentiles seem to be out of bounds. NOTE: if p(50) is taken into the calculation, it might make sense to report it as well (whether you call it p(50) or median will not matter; ATM only the mean is reported). QUESTION: why is the QoO RPM value different from the networkQuality RPM reported later? If this is on purpose, maybe add a qualifier to the output like: QoO RPM