m-lab / traceroute-caller

A sidecar service which runs traceroute after a connection closes
Apache License 2.0

traceroute-caller sometimes calls traceroute twice for a given connection #25

Open pboothe opened 5 years ago

pboothe commented 5 years ago

The following BigQuery query finds 300+ cases where traceroute-caller ran traceroute more than once for a given UUID:

WITH uuids AS (
  SELECT COUNT(*) AS count, uuid, Parseinfo.TaskFileName AS fname
  FROM `mlab-staging.batch.traceroute`
  GROUP BY uuid, Parseinfo.TaskFileName
)
SELECT uuid, fname FROM uuids WHERE count > 1 AND uuid != ""

The repeated calls seem pretty obviously incorrect, and we should fix them. Note that the UUID itself appearing multiple times is expected: it's the same connection (and so the same UUID) causing multiple calls to scamper's traceroute system.
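To see whether the affected connections were traced exactly twice or more often, the query above can be extended into a histogram of call counts. This is only a sketch: it reuses the table and columns from the original query, and the aliases `calls` and `num_groups` are made up for illustration.

```sql
-- Sketch: distribution of how many traceroute rows share a (uuid, task file) pair.
WITH uuids AS (
  SELECT COUNT(*) AS calls, uuid, Parseinfo.TaskFileName AS fname
  FROM `mlab-staging.batch.traceroute`
  WHERE uuid != ""
  GROUP BY uuid, Parseinfo.TaskFileName
)
SELECT calls, COUNT(*) AS num_groups  -- number of (uuid, fname) pairs with this many rows
FROM uuids
WHERE calls > 1
GROUP BY calls
ORDER BY calls
```

If every row of the result has `calls = 2`, the "twice" in the issue title is literal; larger values would suggest the problem can repeat more than once per connection.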

yachang commented 5 years ago

For 3 days:

SELECT COUNT(DISTINCT uuid) AS num
FROM (
  SELECT uuid
  FROM `mlab-staging.batch.traceroute`
  WHERE DATE(_PARTITIONTIME) BETWEEN DATE("2019-08-10") AND DATE("2019-08-12") AND uuid != "" )

returns 1353774

SELECT COUNT(uuid) AS num
FROM (
  SELECT uuid
  FROM `mlab-staging.batch.traceroute`
  WHERE DATE(_PARTITIONTIME) BETWEEN DATE("2019-08-10") AND DATE("2019-08-12") AND uuid != "" )

returns 1353800

The difference is 26 extra rows over 3 days (1353800 rows versus 1353774 distinct UUIDs), which is small enough to lower the priority to P2.
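Both counts and their difference can also be computed in one pass, which makes it easy to re-run the check over other date ranges. A minimal sketch, assuming the same table and partition filter as above; the column aliases are hypothetical.

```sql
-- Sketch: total rows, distinct UUIDs, and the surplus rows in a single query.
SELECT
  COUNT(uuid) AS total_rows,
  COUNT(DISTINCT uuid) AS distinct_uuids,
  COUNT(uuid) - COUNT(DISTINCT uuid) AS extra_rows,
  -- SAFE_DIVIDE returns NULL instead of failing when the window has no rows
  SAFE_DIVIDE(COUNT(uuid) - COUNT(DISTINCT uuid), COUNT(uuid)) AS extra_fraction
FROM `mlab-staging.batch.traceroute`
WHERE DATE(_PARTITIONTIME) BETWEEN DATE("2019-08-10") AND DATE("2019-08-12")
  AND uuid != ""
```

With the numbers reported above, `extra_rows` would be 26 and `extra_fraction` would be 26 / 1353800, or roughly 0.002% of rows.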