openpowerquality / opq

Master repository for all OPQ services
http://openpowerquality.org

Update emilia's mongo to support new location representation #103

Closed philipmjohnson closed 6 years ago

philipmjohnson commented 6 years ago

Please do work for this task in a branch called issue-103.

Update emilia's Mongo database to associate location slugs with Trend and Event data.

Update all existing data to provide a location slug, and change processing to associate the location slug for all future data.

anthonyjchriste commented 6 years ago

I can do this as soon as @sergey-negrashov starts saving the new location slugs in the DB. That way, when we convert all the existing data, there is no gap between existing data and new data where the slug doesn't exist.

I have a couple of questions:

1) For existing data, it seems that we should only add a slug when the "legacy location" data (i.e., the "locations" field in "opq_boxes") is known, correct? By that, I mean that we have trends and events in time ranges that have no legacy location associated with them. It seems to me that it would be academically dishonest to associate a location slug with a document when we have no data showing that the document came from a particular location.

2) Are we removing the "locations" field from "opq_boxes", or will we simply store location slugs and their start times here moving forward? It seems to me that it would be nice to have a history of the box's location over time without having to query trends or events for that information.

3) Where is the proper place to store the zip code field? Should that maybe be added to the locations collection?
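
To make question 1 concrete, here is a minimal sketch of a backfill that only attaches a slug when the legacy history actually covers the document's timestamp. It is not the actual migration code. It assumes the legacy history can be flattened to a sorted list of `(start_time_ms, slug)` pairs, that each entry stays valid until the next one starts, and that documents carry a `timestamp_ms` field and receive the slug in a `location` field; all of those names are hypothetical.

```python
from bisect import bisect_right


def slug_at(history, timestamp_ms):
    """Return the slug in effect at timestamp_ms, or None if unknown.

    history is a list of (start_time_ms, slug) pairs sorted by start time
    (an assumed shape, not the real opq_boxes schema).
    """
    starts = [start for start, _ in history]
    index = bisect_right(starts, timestamp_ms)
    if index == 0:
        # The document predates every known location: nothing supports a slug.
        return None
    return history[index - 1][1]


def backfill(doc, history, time_field="timestamp_ms"):
    """Return a copy of doc with a location slug, only when one is supported."""
    slug = slug_at(history, doc[time_field])
    if slug is None:
        return dict(doc)  # leave the document without a slug
    return {**doc, "location": slug}
```

Documents that fall before the first known location come back unchanged, so no slug is ever invented for them.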

philipmjohnson commented 6 years ago

Here are my thoughts:

First, the new data model for OPQ_Boxes has a location_archive field:

https://openpowerquality.netlify.com/docs/datamodel.html#opqboxes

This is where we can store legacy location information.

Second, yes, we are removing the locations field. Instead, we'll represent the current location via two fields: location and location_start_time_ms. Then, we'll use the location_archive field for legacy location data. This seems much faster and better than the old way of looking at the last element of an array to find the current location. Here's the location data model description:

https://openpowerquality.netlify.com/docs/datamodel.html#locations
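
As a rough illustration of the move from the old `locations` array to the new fields, the sketch below converts one box document held as a plain dict. The fields `location`, `location_start_time_ms`, and `location_archive` come from the data model above; the shape of the legacy entries (`slug` and `start_time_ms` keys) and of the archive entries is an assumption made only for this example.

```python
from copy import deepcopy


def migrate_box(box):
    """Return a copy of an opq_boxes document using the new location fields.

    Assumes each legacy entry looks like {"slug": ..., "start_time_ms": ...};
    the real legacy format may differ.
    """
    migrated = deepcopy(box)
    legacy = sorted(migrated.pop("locations", []),
                    key=lambda entry: entry["start_time_ms"])
    if legacy:
        # The most recent legacy entry becomes the current location.
        current = legacy[-1]
        migrated["location"] = current["slug"]
        migrated["location_start_time_ms"] = current["start_time_ms"]
    # Every earlier entry is kept as history (archive entry shape is assumed).
    migrated["location_archive"] = [
        {"location": entry["slug"],
         "location_start_time_ms": entry["start_time_ms"]}
        for entry in legacy[:-1]
    ]
    return migrated
```

With this layout the current location is a direct field read rather than a lookup of the last array element.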

Third, I am proposing that we no longer explicitly represent zip codes. They are kind of artificial. Instead, I am proposing that all location definitions must have an array containing [longitude, latitude]. If you want zip codes, you can either create a "region" manually, or use some crazy mongo plugin to map from [longitude, latitude] to zip codes. I think we can use the "region" entity to do zip codes, towns, neighborhoods, anything we like.

Here's the region data model description:

https://openpowerquality.netlify.com/docs/datamodel.html#regions

philipmjohnson commented 6 years ago

  1. Mongo will be updated with historical info at the same time that oplogs are put online.
  2. Locations are not associated with Measurements, but only with Trends, Anomalies, etc.
  3. Access the DB to find the current location of the OPQ Box each minute when you create the Trend.
  4. Add Locations to Anomalies.
  5. Where the heck is Watanabe Hall?
  6. Philip: fix the bug in unplugged.
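
Item 3 above could look roughly like the following sketch, which is not the actual trend-creation code. `find_box` is a hypothetical callable that returns the box document for a box ID (or `None`), and the `box_id`, `timestamp_ms`, and trend-level `location` field names are assumptions.

```python
def build_trend(box_id, timestamp_ms, stats, find_box):
    """Build a trend document, tagging it with the box's current location.

    find_box is a hypothetical lookup: box_id -> box document or None.
    """
    trend = {"box_id": box_id, "timestamp_ms": timestamp_ms, **stats}
    box = find_box(box_id)  # looked up on every call, i.e. once per minute
    if box and box.get("location"):
        trend["location"] = box["location"]
    return trend
```

With pymongo, `find_box` might be something like `lambda box_id: db.opq_boxes.find_one({"box_id": box_id})`, assuming a `box_id` key on the box documents. Looking the box up each time means a box that moves is picked up by the next trend without restarting anything.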

philipmjohnson commented 6 years ago

There remain some old measurements, events, and trends with the old location representation. But who cares. We're done.