m-lab / uuid-annotator

Produces metadata locally for every connection on each server.
Apache License 2.0

Annotations support data provenance questions #32

Open mattmathis opened 4 years ago

mattmathis commented 4 years ago

It must be possible to distinguish annotations applied at data-collection time from new annotations applied later to old, unannotated data in the ETL pipeline.

Consider including in each annotation the SHA of the MaxMind DB and the timestamp at which the annotation was applied to the data (e.g., approximately measurement time or approximately parse time).
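A minimal Go sketch of what such a provenance record could look like. The names `Provenance`, `HashDB`, and `NewProvenance` are hypothetical and not part of uuid-annotator, and SHA-256 is an assumption, since the proposal only says "SHA".

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"time"
)

// Provenance is a hypothetical record stored alongside each annotation.
type Provenance struct {
	DatasetSHA256 string    // hex SHA-256 of the raw annotation DB file
	AppliedAt     time.Time // when the annotation was applied, in UTC
}

// HashDB returns the hex-encoded SHA-256 digest of the raw database bytes.
func HashDB(db []byte) string {
	sum := sha256.Sum256(db)
	return hex.EncodeToString(sum[:])
}

// NewProvenance records which database was used and when it was applied.
func NewProvenance(db []byte, appliedAt time.Time) Provenance {
	return Provenance{
		DatasetSHA256: HashDB(db),
		AppliedAt:     appliedAt.UTC(),
	}
}

func main() {
	// Stand-in bytes; a real caller would read the MaxMind database file.
	db := []byte("stand-in for the contents of a MaxMind database file")
	p := NewProvenance(db, time.Now())
	fmt.Printf("sha256=%s applied_at=%s\n",
		p.DatasetSHA256, p.AppliedAt.Format(time.RFC3339))
}
```

Comparing `AppliedAt` with the measurement timestamp of the row would then show whether the annotation dates from roughly measurement time or from a later parse.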

pboothe commented 4 years ago

Highest-priority piece: distinguish data derived from the real-time annotator from data derived from the ETL annotator.
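One way to make that distinction explicit is a source tag on every annotation. The sketch below is only illustrative: the `Source` type, its two values, and the `Annotation` struct are hypothetical names, not the existing uuid-annotator schema.

```go
package main

import "fmt"

// Source is a hypothetical tag naming which annotator produced a row.
type Source string

const (
	SourceRealTime Source = "realtime" // annotated on the server around measurement time
	SourceETL      Source = "etl"      // annotated later, while parsing in the ETL pipeline
)

// Annotation is a stand-in for an annotation row; only the fields
// needed for this illustration are shown.
type Annotation struct {
	UUID   string
	Source Source
}

// IsBackfilled reports whether the annotation was added by the ETL
// annotator rather than by the real-time annotator.
func (a Annotation) IsBackfilled() bool {
	return a.Source == SourceETL
}

func main() {
	rows := []Annotation{
		{UUID: "example-uuid-1", Source: SourceRealTime},
		{UUID: "example-uuid-2", Source: SourceETL},
	}
	for _, r := range rows {
		fmt.Printf("%s source=%s backfilled=%t\n", r.UUID, r.Source, r.IsBackfilled())
	}
}
```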

mattmathis commented 2 years ago

Better phrasing: We need to archive the date (or version) of the raw annotation databases (MaxMind, etc.) independently of the date of the row. This is necessary to study the stability of address ownership and assignments.