m-lab / data-annotations

For recording human-curated annotations that have known impacts on M-Lab test data.
Apache License 2.0

Some traceroute files are incomplete #31

Open SaiedKazemi opened 2 years ago

SaiedKazemi commented 2 years ago

Some traceroute archives contain files that lack one or more of the four lines that should always be present in a traceroute file. Below is an example of such a file, which is missing the tracelb and cycle-stop lines. Because these files generate a parse error, their partial data is not inserted into BigQuery.

$ gsutil cp gs://archive-measurement-lab/ndt/scamper1/2021/01/26/20210126T203900.906199Z-traceroute-mlab3-atl07-ndt.tgz .
$ tar xzf 20210126T203900.906199Z-traceroute-mlab3-atl07-ndt.tgz
$ jq . < ../traceroutes/2021/01/26/20210126T203831Z_ndt-ds8qp_1611471752_00000000000205B9.jsonl
{
  "UUID": "ndt-ds8qp_1611471752_00000000000205B9",
  "TracerouteCallerVersion": "d6e45f1",
  "CachedResult": true,
  "CachedUUID": "ndt-ds8qp_1611471752_00000000000205AF"
}
{
  "type": "cycle-start",
  "list_name": "/tmp/scamperctrl:23862",
  "id": 1,
  "hostname": "ndt-ds8qp",
  "start_time": 1611693222
}
$

I am filing this issue for future reference, and to suggest removing these files from the archive: since their partial data never reaches BigQuery, they serve no useful purpose.
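
To help locate such files, here is a minimal Python sketch that reports which of the four expected lines are absent from a traceroute file. It is not part of any M-Lab tooling. It assumes the four lines are the metadata line (identified here by its "UUID" key) followed by records whose "type" is "cycle-start", "tracelb", and "cycle-stop"; the names `EXPECTED_KINDS`, `record_kind`, and `missing_kinds` are hypothetical.

```python
import json

# Assumption: a complete file holds one record of each of these kinds.
EXPECTED_KINDS = ["metadata", "cycle-start", "tracelb", "cycle-stop"]


def record_kind(record):
    """Classify one parsed JSONL record (assumed layout, see above)."""
    if not isinstance(record, dict):
        return "unknown"
    if "UUID" in record and "type" not in record:
        return "metadata"
    return record.get("type", "unknown")


def missing_kinds(jsonl_text):
    """Return the expected record kinds that do not appear in the text."""
    seen = set()
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            seen.add(record_kind(json.loads(line)))
        except json.JSONDecodeError:
            # A truncated or malformed line is treated as absent.
            continue
    return [kind for kind in EXPECTED_KINDS if kind not in seen]


# The two records from the example file above, one JSON object per line.
sample = "\n".join([
    '{"UUID": "ndt-ds8qp_1611471752_00000000000205B9", '
    '"TracerouteCallerVersion": "d6e45f1", "CachedResult": true, '
    '"CachedUUID": "ndt-ds8qp_1611471752_00000000000205AF"}',
    '{"type": "cycle-start", "list_name": "/tmp/scamperctrl:23862", '
    '"id": 1, "hostname": "ndt-ds8qp", "start_time": 1611693222}',
])

print(missing_kinds(sample))  # ['tracelb', 'cycle-stop']
```

Running `missing_kinds` over the contents of each extracted .jsonl file would flag the incomplete ones; a file is complete under these assumptions when the returned list is empty.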

SaiedKazemi commented 2 years ago

@stephen-soltesz FYI.