Closed jadamcrain closed 6 years ago
We have decided to start the implementation using PostgreSQL numeric type as the value column. It is able to represent a signed/unsigned integer, a floating point value, and a enum (once converted to an unsigned integer). Here's the SQL script to create the table:
CREATE TABLE data(
message_uuid uuid not null,
timestamp timestamptz not null,
device_uuid uuid not null,
tagname varchar(1024) not null,
value numeric not null,
PRIMARY KEY (message_uuid, device_uuid, tagname)
);
It will probably need some indexing. TimescaleDB has some recommendations here: https://docs.timescale.com/v0.9/using-timescaledb/schema-management#indexing
We have a schema strawman from Duke now that differs, so we can close this issue.
@emgre is investigating a good schema for timescale. This ticket is just to document our discussion and his findings.
We're hoping to have a single table for all measurement values. There will be the following (and possibly more) columns:
[timestamp] [tag name] [device mRID] [message mRID] [ ... measurement columns]
timestamp is the message timestamp. It is always located near the top of protobuf message inside a MessageInfo object.
tag name will be formed by walking the hierarchy names from the root of message down to the field of interest. These are guaranteed to be unique for a particular message type. We can leverage the generated C++ message visitors both for determining the tag names and deciding what fields we want to extract from each message.
device mRID is a UUID included in each message that identifies the originating device. It's on the ConductingEquipment class.
message mRID is the unique identifier for the message itself and is different for each message. It is also found on the "MessageInfo" class.
Finally we need to decide how we store the various measurement/control values themselves. There are various strategies we need to explore, including: optional strongly typed columns (bool, int, float, etc) vs a single string column with enum descriminator. What are the pros/cons of each approach?