m-lab / etl

M-Lab ingestion pipeline

parser: ndt5 and ndt7 short-form protocol in summary records #950

Open stephen-soltesz opened 4 years ago

stephen-soltesz commented 4 years ago

Today it is difficult to discern the protocol used in the unified views for NDT.

ndt5 records the Protocol (WS, WSS, PLAIN) and MessageProtocol (TLV, JSON).

ndt7 records do not state the protocol; it can only be inferred from ServerPort, and only when the standard ports (80, 443) are used.

As a convenience to users, it would help to add a summary field for Protocol that uses abbreviations like: "ndt5+wss", "ndt7+wss", "ndt5+ws", "ndt5+plain", "ndt7+ws", "ndt4+wss", etc.

This way, users can easily aggregate or filter by protocol without ambiguity.
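As a rough illustration of the proposal, here is a minimal Go sketch of how such a summary value could be built. The names `SummaryProtocol` and `NDT7Transport` are hypothetical and are not taken from the etl parser. The port mapping (443 for wss, 80 for ws) is an assumption based on the standard ports mentioned above; non-standard ports yield an empty transport rather than a guess.

```go
package main

import (
	"fmt"
	"strings"
)

// SummaryProtocol is a hypothetical helper that joins an NDT version
// (e.g. "ndt5") and a transport (e.g. "WSS") into the proposed
// short form, e.g. "ndt5+wss".
func SummaryProtocol(version, transport string) string {
	return strings.ToLower(version) + "+" + strings.ToLower(transport)
}

// NDT7Transport is a hypothetical helper that infers the ndt7 transport
// from the server port. Assumption: 443 means wss and 80 means ws.
// Any other port returns "" because the transport cannot be inferred.
func NDT7Transport(serverPort int) string {
	switch serverPort {
	case 443:
		return "wss"
	case 80:
		return "ws"
	default:
		return ""
	}
}

func main() {
	// ndt5 records carry the Protocol field directly.
	fmt.Println(SummaryProtocol("ndt5", "WSS"))
	// ndt7 records need the port-based inference.
	fmt.Println(SummaryProtocol("ndt7", NDT7Transport(443)))
}
```

A real implementation would also need to decide what to emit for ndt7 records on non-standard ports, which this sketch leaves open.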

stephen-soltesz commented 4 years ago

Discussed this convention with @mattmathis yesterday. The primary concern was whether we could still select on portions of the string to distinguish between ws and wss. For example:

SELECT
   name
FROM (
   SELECT "wss" AS name
   UNION ALL
   SELECT "ws" AS name
)
WHERE
   name LIKE 'ws'

returns only "ws".
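The same distinction can be checked outside SQL. The Go sketch below is only an illustration, with the hypothetical helpers `transportOf`, `isPlainWS`, and `mentionsWS`. It shows that anchoring the match on the "+ws" suffix separates ws from wss in the combined strings, while a bare substring match on "ws" does not.

```go
package main

import (
	"fmt"
	"strings"
)

// transportOf is a hypothetical helper that returns the part of a
// summary string after the "+", e.g. "wss" for "ndt7+wss".
// It returns "" when the string has no "+" separator.
func transportOf(summary string) string {
	_, transport, found := strings.Cut(summary, "+")
	if !found {
		return ""
	}
	return transport
}

// isPlainWS reports whether the summary ends in exactly "+ws".
// This is an anchored match, so "ndt5+wss" does not qualify.
func isPlainWS(summary string) bool {
	return strings.HasSuffix(summary, "+ws")
}

// mentionsWS is the ambiguous variant: a substring match on "ws"
// is true for both "+ws" and "+wss" values.
func mentionsWS(summary string) bool {
	return strings.Contains(summary, "ws")
}

func main() {
	values := []string{"ndt5+ws", "ndt5+wss", "ndt7+ws", "ndt7+wss", "ndt5+plain"}
	for _, v := range values {
		fmt.Printf("%s transport=%q anchored=%t substring=%t\n",
			v, transportOf(v), isPlainWS(v), mentionsWS(v))
	}
}
```

The takeaway is that filters on the summary field should anchor on the separator (or compare the full value) rather than search for "ws" anywhere in the string.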