Problem Description
Time series are core data for the aggregation domain.
In this feature, we must design and implement a solution for publishing time series to the aggregation domain, so that whenever time series enter the time-series domain they can be used in a calculation job.
The following flow must be supported when this feature is finished:
Benefit Hypothesis
Time series are the essential input to the aggregation and wholesale jobs. Without them we could not perform settlement.
Acceptance criteria and must-have scope
[ ] Given a meter data responsible submits a CIM-based RSM-012, when DataHub completes an aggregation (D03) process, then an RSM-014 is available to the meter data responsible through the message hub.
[ ] Given an aggregation process has completed, when examining the generated RSM-014, then energy sums per grid area are available in the message.
[ ] The design must support the future performance requirement of 2.5 million time series/hour with 24/96 positions each.
[ ] Given metering points have status connected, when a job is triggered, then they are included.
[ ] Given multiple time series are submitted for the same period for the same metering point, then the most recent time series is used in the aggregation job (see the deduplication sketch after this list).
[ ] The described flow is working on B-002.
[x] We should be able to receive bundled time series messages.
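A minimal PySpark sketch of the "most recent wins" rule above. The column names (metering_point_id, period_start, registration_time) and the Delta path are assumptions for illustration, not the actual time series schema:

```python
# Sketch only: keep the most recently registered time series per metering
# point and period. Column names and path are placeholder assumptions.
from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

time_series = spark.read.format("delta").load("/mnt/delta/timeseries_received")

most_recent_first = Window.partitionBy("metering_point_id", "period_start") \
    .orderBy(F.col("registration_time").desc())

deduplicated = (
    time_series
    .withColumn("row_num", F.row_number().over(most_recent_first))
    .where(F.col("row_num") == 1)  # the latest submission wins
    .drop("row_num")
)
```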
Out of scope
Business validation of time series
Integration events from the metering point domain to the time-series domain
Balance fixing process and locking of periods
Tech note:
[ ] Triggering of the process is done through Postman or a similar tool.
We need to be able to get notified when a streaming job is running or has stopped (surveillance / health / ops).
What do we do if a streaming job fails and stops?
Retry logic is applicable for streaming jobs - LKI 25-01-2022
What happens to received time series that have been added to the event hub?
Events remain available on the event hub until they are consumed by the streaming job, with a retention of up to seven days if the Standard tier is selected for Event Hubs, reference link. - LKI 25-01-2022
Another option, allowing for almost indefinite event retention, is to use Azure Event Hubs Capture (link1, link2), which essentially consumes events from the event hub and stores them in a storage account or data lake in Avro format. - LKI 25-01-2022
If opting for Azure Event Hubs Capture, an option is to use Databricks Auto Loader to stream events from the data lake into the delta lake whenever new files are detected. This can be done using Avro as the file format. Read more on how to configure Auto Loader. - LKI 25-01-2022
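A rough sketch of what such an Auto Loader stream could look like, assuming a Databricks runtime where Auto Loader supports Avro schema inference; all paths are placeholders, and the `Body` column comes from the Event Hubs Capture Avro schema:

```python
# Sketch only: use Databricks Auto Loader to pick up Event Hubs Capture Avro
# files and append them to a Delta table. All paths are placeholder assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

capture_path = "abfss://capture@<storage-account>.dfs.core.windows.net/timeseries/"

raw_events = (
    spark.readStream
    .format("cloudFiles")                                   # Auto Loader
    .option("cloudFiles.format", "avro")                    # Capture writes Avro files
    .option("cloudFiles.schemaLocation", "/mnt/checkpoints/capture-schema")
    .load(capture_path)
    # Event Hubs Capture stores the original event payload as bytes in "Body".
    .select(F.col("Body").cast("string").alias("payload"), "EnqueuedTimeUtc")
)

(raw_events.writeStream
    .format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/capture-ingest")
    .start("/mnt/delta/timeseries_raw"))
```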
How can we restart the streaming job and receive the queued-up time series from the event hub?
It is possible to specify a checkpoint location for a streaming job, which holds information on which events have been processed. - LKI 25-01-2022
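A minimal sketch of how the checkpoint makes restarts safe, assuming the job reads directly from Event Hubs through the azure-event-hubs-spark connector; the connection string and paths are placeholders. On restart with the same checkpoint location, Structured Streaming resumes from the last committed offsets, so queued-up events are picked up without gaps:

```python
# Sketch only: a streaming job that records its progress in a checkpoint folder.
# Restarting the job with the same checkpointLocation resumes from the last
# committed Event Hub offsets. Connection string and paths are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext

connection_string = "Endpoint=sb://<namespace>.servicebus.windows.net/;...;EntityPath=timeseries"
eh_conf = {
    # The connector expects the connection string to be encrypted.
    "eventhubs.connectionString":
        sc._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(connection_string),
}

events = spark.readStream.format("eventhubs").options(**eh_conf).load()

(events.writeStream
    .format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/eventhub-ingest")  # offsets live here
    .start("/mnt/delta/timeseries_events"))
```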
Performance tests to document how much data we can handle (metrics need to be defined).
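As a rough baseline for those metrics, derived from the stated performance requirement rather than from measurements: 2.5 million time series/hour × 96 positions ≈ 240 million positions/hour, i.e. roughly 66,700 positions/second, or about 700 time series/second sustained into the streaming job.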
Non-Functional Requirements
[ ] All inbound and outbound messages are logged
[ ] Developers and business are confident that the described flow can work in actor test.
Stakeholders
Khatozen
Irene
Volt
Note:
Actor register / Authorization (Not in scope)
Performance (the design must support the future performance requirement of 2.5 million time series/hour with 24/96 positions each)