m-lab / etl

M-Lab ingestion pipeline
Apache License 2.0

Higher speeds from 2.0 vs. 1.0 #860

Open laiyi-ohlsen opened 4 years ago

laiyi-ohlsen commented 4 years ago

User reported issue:

For Platform 2.0, the upload speeds are slightly higher, but only by about 10-15%, and they immediately plateau again. Note that the Y axes of the two charts differ!

Note that my download speeds are slightly higher than those in the M-Lab viz on the Web: the viz shows 17.4 on the first of November 2019, while I had about 22.5 for the month of October 2019. The difference may be that I use the maximum speed test per day per client, which would likely lead to a more optimistic number.

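[image: Screen Shot 2020-04-05 at 7 40 01 PM] https://user-images.githubusercontent.com/12607492/78582874-fc383b00-7803-11ea-9765-9c5f9bf7515a.png

[image: Screen Shot 2020-04-05 at 7 41 04 PM] https://user-images.githubusercontent.com/12607492/78582877-fcd0d180-7803-11ea-9cdc-d291794aa8a7.png

[image: Screen Shot 2020-04-05 at 8 29 04 PM] https://user-images.githubusercontent.com/12607492/78582879-fd696800-7803-11ea-8932-249df608439f.png

mlab-code-reviews commented 4 years ago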

Quick note: Ideally we should always prefer plotting with log Y axis, and never average across clients, as that gives far too much weight to fast clients.
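To illustrate the point about averages, here is a minimal sketch with made-up numbers (the speeds and the `summarize` helper are hypothetical, not M-Lab data or code). A single fast client pulls the mean far above what most clients see, while the median barely moves.

```python
from statistics import mean, median

# Hypothetical per-client speeds in Mbit/s (made-up values, not M-Lab data).
speeds = [5.0, 8.0, 10.0, 12.0, 900.0]


def summarize(values):
    """Return (mean, median) for a list of per-client speeds."""
    return mean(values), median(values)


avg, mid = summarize(speeds)
# The one 900 Mbit/s client dominates the mean; the median stays near the typical client.
print(f"mean={avg:.1f} median={mid:.1f}")  # prints "mean=187.0 median=10.0"
```

The same skew is why a log Y axis helps when plotting: it keeps the bulk of slower clients visible next to a few very fast ones.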

-- Greg Russell / Measurement-Lab https://memegen.googleplex.com/4558349824688128

BobBallance commented 4 years ago

I typically take the median across the clients, and the max per day within the client. You can see the mean across clients, but that's not the recommended view within I3.
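The aggregation described above could be sketched roughly as follows. This is an illustration only: the row layout, the sample values, and the function name `daily_median_of_client_max` are assumptions, not the actual query behind the plots.

```python
from collections import defaultdict
from statistics import median

# Hypothetical rows: (client_id, date, speed in Mbit/s). Made-up values.
tests = [
    ("a", "2019-10-01", 10.0),
    ("a", "2019-10-01", 14.0),
    ("b", "2019-10-01", 20.0),
    ("b", "2019-10-01", 18.0),
    ("c", "2019-10-01", 300.0),
    ("a", "2019-10-02", 9.0),
    ("b", "2019-10-02", 25.0),
]


def daily_median_of_client_max(rows):
    """Per date: take each client's maximum speed, then the median across clients."""
    best = defaultdict(dict)  # date -> {client_id: max speed seen that day}
    for client, date, speed in rows:
        prev = best[date].get(client)
        best[date][client] = speed if prev is None else max(prev, speed)
    return {
        date: median(per_client.values())
        for date, per_client in sorted(best.items())
    }


print(daily_median_of_client_max(tests))
# prints {'2019-10-01': 20.0, '2019-10-02': 17.0}
```

Taking the per-client maximum first means a client that runs many tests counts once per day, and the median across clients keeps the 300 Mbit/s client from dominating the daily figure.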

mlab-code-reviews commented 4 years ago

Got it. I didn't know who made the plots, and I like to remind people that averages are very sensitive to how many fast clients are running tests.

gfr10598 commented 4 years ago

This probably belongs in another repo. etl just does the parsing, and this is more associated with either the server or the platform or both.

mattmathis commented 4 years ago

At least one problem is in the parser/views, and legacy issues in the platform itself are not actionable.