esnet / pond

Immutable timeseries data structures built with Typescript
http://software.es.net/pond
Other
204 stars 37 forks source link

Strong-type data points, avoid magic strings #166

Open omidkrad opened 4 years ago

omidkrad commented 4 years ago

It is great that this project is being converted to TypeScript. I wanted to suggest to use generics for strong-typing data points in time series, and avoid magic strings as much as possible. I created this commit to show how the interfaces could be updated to do that (it does not include the implementation):

Commit: Add generic type for data points

With that, as an example, avg function could be updated to support this signature:

avg(key: (data: TData) => number, filter?: any): number;

and be used like below:

import { timeSeries } from "pondjs";

interface SensorDataPoint {
    time: number;
    sensor: number;
    status: string;
}

const series = timeSeries<SensorDataPoint>({
    name: "sensor_data",
    columns: ["time", "sensor", "status"], // checked, non-existent keys like "status1" will give error
    points: [
        [1400425947000, 3, "ok"],
        [1400425948000, 8, "high"],
        [1400425949000, 2, "low"],
        [1400425950000, 3, "ok"],
    ]
});

// by field
series.avg(d => d.sensor);   // 4

// calculation
series.avg(d => d.sensor + d.sensor2 * 100);

// field path
series.avg(d => d.sensor.diningRoom);

I believe function signatures that receive magic strings should be deprecated in favor of this new signature.

pjm17971 commented 4 years ago

Firstly, thank you for taking the time to think about this and write this up. I like this suggestion as I've been contemplating removing the fieldSpec type references myself, both because of typing issues and also the complexity it scatters though the codebase. I'm in favor of doing something along these lines.

The two options I see to address the typing and complexity issues are:

I've started a branch to explore option (a) as this is a less radical change. When I ported Pond to Typescript originally that was the obvious thought (while I was generically typing the key, it made sense to generically type the data), but there was some obstacles that where too much of a lift at the time in terms of what that meant across the code. One such obstacle was how avg() type operations would work, operations that essentially used get() with a fieldSpec would work, and your suggestion for those is elegant. So let's try that.

There was a couple of other concerns that still would need to be solved or at least consider the trade off:

Anyway, I'm mostly replying to thank you for taking the time to make this suggestion. I'll continue to look into this.

pjm17971 commented 4 years ago

Kind of a long way to go, but got tests for this basic functionality working with Events and Collections:

interface Sensor {
    value: number;
    status: string;
}

const c = collection<Time, Sensor>(
    Immutable.List([
        event(time("2015-04-22T03:30:00Z"), { value: 32, status: "ok" }),
        event(time("2015-04-22T02:30:00Z"), { value: 42, status: "fail" })
    ])
);

c.avg(d => d.value);

// 37

c.toJSON(({ value, status }) => ({ value, ok: status === "ok" ? true : false }))

// [
//   { time: 1429673400000, data: { value: 32, ok: true } },
//   { time: 1429669800000, data: { value: 42, ok: false } }
// ]
omidkrad commented 4 years ago

That's great. I'm not very familiar with Pond but when I was reviewing it, I liked the project but thought the API needed that kind of improvement. It's great to see my suggestion was received well.