Produces metadata locally for every connection on each server.
Apache License 2.0

UUID-Annotator

A system for generating and saving per-connection metadata in real-time on M-Lab's edge systems.

Design

The annotator generates one JSON file per connection, containing the geolocation and network location metadata for the IP addresses in that connection. Eventually it will also add all other annotations concerning the "local environment".

The datatype it generates will be "annotation" and it will generate filenames like:

    /ndt/annotation/2009/03/18/${UUID}.json

where ${UUID} is the actual UUID of the connection under consideration. Filenames will follow both our uniform names best practices and our pusher best practices.

The columns in the JSON file will initially be a subset of our standard columns:

Later versions can (and should!) add columns that include real-time switch counters, local machine load, and other indicators of measurement quality, but v1 will concentrate on location data. Each new column added to the annotator output should be added to our set of standard columns.

The location annotation service will read its data from a MaxMind file stored in a GCS bucket. It will periodically poll (in a memoryless manner) to discover whether the file has changed.

Performance

This service will depend on tcp-info's UUID notification service, but no local service should depend on the annotator. As such, we do not need to worry about the annotator slowing down an integrated service; we only need to worry about the annotator keeping up with the creation rate of TCP connections. We do not anticipate that being too difficult.

Availability

This service is a core service and needs to be highly available, just like tcp-info, packet-headers, traceroute-caller, and DISCO. It represents our one chance to annotate UUIDs with metadata. As such, the health of the experiment service should depend on the health of the UUID annotation service, just like it should depend on the other core services.

Usage

Stand-alone

If only the local ipservice socket is needed to provide annotations for specific IPs, the uuid-annotator may be run in a "stand-alone" mode. This mode does not require the tcp-info -tcpinfo.eventsocket, -siteinfo.url, or -datadir flags.

    docker build -t local-annotator .
    docker run -v $PWD/testdata:/testdata -it local-annotator \
        -ipservice.sock=/local/uuid-annotator.sock \
        -maxmind.url=file:///testdata/GeoLite2-City-real.tar.gz \
        -routeview-v4.url=file:///testdata/RouteViewIPv4.pfx2as.gz \
        -routeview-v6.url=file:///testdata/RouteViewIPv6.pfx2as.gz

Generate Schemas

If using uuid-annotator data as part of the autoloader pipeline, you may generate the data type schemas using the generate-schemas command:

    docker run -v $PWD:/schemas --entrypoint /generate-schemas -it local-annotator \
        -ann2 /schemas/ann2.json -hop2 /schemas/hop2.json