Open AndrewSisley opened 2 years ago
It's unlikely that I would work on this in the coming 2 months, but I would enjoy and recommend keeping track of progress and clarifications directly on the issue. Very cool
Really surprised and encouraged by the interest shown in this idea, particularly from the more business-orientated people, thank you all!
There is a fairly simple/bare-bones (Rust) array-based (local) database implementation here if anyone is interested and doesn't mind the shameful self-promotion. The concept is quite simple and is naturally quite performant. The tricky part will very likely be the commit-related stuff (and making sure that doesn't trash the performance).
Also making a note that DJ knows someone with a very fun sounding background in this area, and it would probably be worth trying to check in with them if/when this idea starts to get serious.
Machine learning!
Late night thought: circle buffers could be implemented pretty easily and still fit within the gql spec. It would just be an inline array type with a @circle-buffer directive (or similar); the internal start index would be hidden, and the array would be returned as a normal array when queried (from start index to start index - 1, wrapping around).
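A minimal Rust sketch of that idea, assuming a fixed capacity and overwrite-oldest behaviour. The `CircleBuffer` type and its method names are hypothetical and not part of defra; the point is only that the start index stays internal and readers see a plain oldest-first array.

```rust
/// Fixed-capacity circular buffer (hypothetical type, for illustration only).
/// `start` is the hidden index of the oldest item once the buffer is full.
#[derive(Debug)]
pub struct CircleBuffer<T> {
    items: Vec<T>,
    start: usize,
    capacity: usize,
}

impl<T: Clone> CircleBuffer<T> {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        CircleBuffer {
            items: Vec::with_capacity(capacity),
            start: 0,
            capacity,
        }
    }

    /// Append a value; once full, overwrite the oldest item and advance `start`.
    pub fn push(&mut self, value: T) {
        if self.items.len() < self.capacity {
            self.items.push(value);
        } else {
            self.items[self.start] = value;
            self.start = (self.start + 1) % self.capacity;
        }
    }

    /// Return the contents as a normal array, oldest first
    /// (from `start` to the end, then from 0 to `start - 1`).
    pub fn to_vec(&self) -> Vec<T> {
        let mut out = Vec::with_capacity(self.items.len());
        out.extend_from_slice(&self.items[self.start..]);
        out.extend_from_slice(&self.items[..self.start]);
        out
    }
}
```

Pushing 1 through 5 into a buffer of size 3 leaves 3, 4, 5 visible, in that order, with no start index exposed to the caller.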
Map-like types could perhaps be defined as arrays of a key-value-pair struct, and could accept input parameters (e.g. WindSpeed5m(start: 10:00, end: 17:00)) or return the entire collection. Could also be made to work with the @circle-buffer directive.
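Rough sketch of the map-like lookup in Rust. Everything here is an assumption for illustration: the `KeyValue` struct, keys stored as minutes since midnight, and a half-open range (start inclusive, end exclusive). With no parameters the whole collection comes back.

```rust
/// Hypothetical key-value pair; the key is minutes since midnight (assumption).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct KeyValue {
    pub key: u32,
    pub value: f64,
}

/// Select items whose key falls in [start, end).
/// A missing bound means "unbounded" on that side, so passing
/// `None, None` returns the entire collection.
pub fn select(items: &[KeyValue], start: Option<u32>, end: Option<u32>) -> Vec<KeyValue> {
    items
        .iter()
        .copied()
        .filter(|kv| start.map_or(true, |s| kv.key >= s))
        .filter(|kv| end.map_or(true, |e| kv.key < e))
        .collect()
}
```

This version scans linearly, so it makes no assumption about item order; an ordered array could use a binary search instead.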
Could also have an @ordered directive, sorting items on insert.
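Sort-on-insert could be as small as this Rust sketch. `insert_ordered` is a hypothetical helper, and placing equal keys after existing ones is an assumption.

```rust
/// Insert `(key, value)` into a vector that is already sorted by key,
/// keeping it sorted. Equal keys are placed after existing ones (assumption).
pub fn insert_ordered<K: Ord, V>(items: &mut Vec<(K, V)>, key: K, value: V) {
    // Binary search for the first position whose key is greater than `key`.
    let idx = items.partition_point(|(k, _)| *k <= key);
    items.insert(idx, (key, value));
}
```

Finding the position is O(log n), but `Vec::insert` still shifts the tail, so an insert in the middle is O(n); appending in key order (the common case for time series) avoids the shift.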
And a @unique directive perhaps (might be enough to keep this implicit when using a key-value struct?).
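One possible reading of uniqueness, sketched in Rust: on a key-ordered array, a write to an existing key replaces the value instead of adding a second entry. `upsert_unique` is a hypothetical helper, and overwrite-on-duplicate is an assumption; rejecting the write would be an equally valid choice.

```rust
/// Insert or replace by key in a vector sorted by key.
/// Returns the previous value if the key already existed.
pub fn upsert_unique<K: Ord, V>(items: &mut Vec<(K, V)>, key: K, value: V) -> Option<V> {
    match items.binary_search_by(|(k, _)| k.cmp(&key)) {
        // Key present: swap in the new value, hand back the old one.
        Ok(i) => Some(std::mem::replace(&mut items[i].1, value)),
        // Key absent: `i` is the position that keeps the vector sorted.
        Err(i) => {
            items.insert(i, (key, value));
            None
        }
    }
}
```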
These directives could then be combined to create circle-buffered maps of key-ordered items that can be efficiently queried by range. E.g.
type turbine {
  Name: String
  Windspeed10minLast24Hrs: [TimeFloat] @ordered @circle-buffer(size: 144) # 144 x 10 min = 1 day
}

query {
  turbine {
    Name
    # Take the last hour only; a `last` keyword (and similar) could be an added feature for all arrays
    lastHour: Windspeed10minLast24Hrs(last: 6)
    # Take the values for the interval during which the turbine speed was throttled
    batCurtailment: Windspeed10minLast24Hrs(start: 19:00, end: 21:00)
    # Calculate the average windspeed during the curtailment period
    batCurtailmentAvg: _avg(Windspeed10minLast24Hrs: {start: 19:00, end: 21:00})
  }
}
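To make the three query fields concrete, here is a Rust sketch of what they might evaluate to over a key-ordered array. The `TimeFloat` fields, the minutes-since-midnight key, the half-open range, and the function names are all assumptions; it also assumes the samples cover a single day so the keys are monotonic.

```rust
/// Hypothetical layout for the TimeFloat pair: time as minutes since
/// midnight plus a reading (both fields are assumptions).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct TimeFloat {
    pub minute: u32,
    pub value: f64,
}

/// `last: n` - the final `n` items (or all of them if fewer exist).
pub fn last(items: &[TimeFloat], n: usize) -> &[TimeFloat] {
    &items[items.len().saturating_sub(n)..]
}

/// `start`/`end` - items with start <= minute < end.
/// Relies on `items` being sorted by `minute`, so two binary searches suffice.
pub fn range(items: &[TimeFloat], start: u32, end: u32) -> &[TimeFloat] {
    let lo = items.partition_point(|p| p.minute < start);
    let hi = items.partition_point(|p| p.minute < end).max(lo);
    &items[lo..hi]
}

/// `_avg` - mean of the values, or None for an empty selection.
pub fn avg(items: &[TimeFloat]) -> Option<f64> {
    if items.is_empty() {
        return None;
    }
    Some(items.iter().map(|p| p.value).sum::<f64>() / items.len() as f64)
}
```

With 144 ten-minute samples, `last(&day, 6)` covers the final hour and `avg(range(&day, 19 * 60, 21 * 60))` averages the 12 samples in the curtailment window.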
Inspired by mention of SunSpec stuff, this task should only be worked on in free time, and mostly for fun. Try not to waste other people's time with this (for now at least). Task partly created so I don't forget about this thought. I am also highly likely to hit a number of blockers that we plan on having anyway in defra (which should not be rushed in as part of this).
It could be super fun to have defra support N-dimensional array data types, even more fun if they can be accessed via human-readable keys (e.g. time), even more fun if they can act as N-length buffers, even more fun if the buffers can grow/shrink auto-magically based on certain conditions (e.g. shrink post-sync), and if they can support aggregates (potentially eagerly evaluated).
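As a sketch of what an eagerly evaluated aggregate over an N-length buffer might mean: keep a running sum that is updated on every write, so reading the average does not need a scan. `RunningAvg` is a hypothetical type, not anything in defra, and this only covers the one-dimensional case.

```rust
use std::collections::VecDeque;

/// Fixed-length window that maintains its sum eagerly (hypothetical type).
pub struct RunningAvg {
    window: VecDeque<f64>,
    capacity: usize,
    sum: f64,
}

impl RunningAvg {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        RunningAvg {
            window: VecDeque::with_capacity(capacity),
            capacity,
            sum: 0.0,
        }
    }

    /// Add a value, evicting the oldest one when the window is full.
    /// The sum is adjusted here, at write time, rather than at query time.
    pub fn push(&mut self, value: f64) {
        if self.window.len() == self.capacity {
            if let Some(old) = self.window.pop_front() {
                self.sum -= old;
            }
        }
        self.window.push_back(value);
        self.sum += value;
    }

    /// O(1) read of the current average; None while the window is empty.
    pub fn avg(&self) -> Option<f64> {
        if self.window.is_empty() {
            None
        } else {
            Some(self.sum / self.window.len() as f64)
        }
    }
}
```

One trade-off to keep in mind: repeatedly adding and subtracting floats can accumulate rounding error, so a real implementation might recompute the sum periodically.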
Look at supporting medium/long-term storage, and real-time transmission buffers.
Might need to have a more serious look at pointproof stuff etc., as it could be very interesting to try and do this efficiently whilst still maintaining correct versioning guarantees (it would be obscenely expensive to track a commit per item update).