Energinet-DataHub / ARCHIVED-geh-charges

Apache License 2.0

Spike: Propose a solution for persisting and delivering charge history data to frontend #1841

Closed prtandrup closed 1 year ago

prtandrup commented 1 year ago

As a Volt Developer, I want an overview of the technology we need for persisting and delivering charge history data (information and prices), so that it's clear to everyone and we can start delivering business value.

With this spike, we want to dive into the different options and decide on the technology stack we need to persist and deliver charge history data to the frontend.

Considerations

Acceptance criteria

Work-in-progress

Requirements for historic charge data

Technology choice

HenrikSommer commented 1 year ago

In the spring of 2021, another spike with a similar purpose was carried out. That spike was dedicated to shedding light on the possibilities and limitations of the built-in Change Data Capture (CDC) feature of SQL Server: "Spike: Fetch historical data from SQL using CDC". The conclusion then was that CDC did not fulfil our requirements, mainly due to the lack of support for refactoring: "How model changes impact CDC".

More relevant information regarding CDC:

Now, system-versioned temporal tables (Temporal Tables) could be a more suitable solution to the requirements we have identified so far. This blog post gives valuable insights into the possibilities and limitations of Temporal Tables, including a comparison with CDC, which shows that Temporal Tables are clearly better suited to our use. It does not, however, show how Temporal Tables cope with model refactorings.

HenrikSommer commented 1 year ago

More related work: issue #628: Enabler: Prepare for quering historical data (CQRS refactoring)

PRs related to issue #628, including the PR where the old charge history model was removed

x-platformcoder commented 1 year ago

Some SQL to demonstrate the use of Temporal Tables. The script creates a table, makes several changes to the data model and performs CRUD operations on the table:

x-platformcoder commented 1 year ago

IF (EXISTS (SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'Employee'))
BEGIN
    -- DROP TABLE WITH VERSIONING
    ALTER TABLE [dbo].[Employee] SET (SYSTEM_VERSIONING = OFF)
    DROP TABLE [dbo].[Employee]
    DROP TABLE [dbo].[EmployeeHistory]
END
GO

-- CREATE TABLE WITH VERSIONING
CREATE TABLE dbo.Employee
(
    [EmployeeID] int NOT NULL PRIMARY KEY CLUSTERED
  , [Name] nvarchar(100) NOT NULL
  , [Position] varchar(100) NOT NULL
  , [Department] varchar(100) NOT NULL
  , [Address] nvarchar(1024) NOT NULL
  , [AnnualSalary] decimal (10,2) NOT NULL
  , [ValidFrom] datetime2 GENERATED ALWAYS AS ROW START HIDDEN
  , [ValidTo] datetime2 GENERATED ALWAYS AS ROW END HIDDEN
  , PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.EmployeeHistory));

-- ADD A 'NOT NULL' COLUMN
ALTER TABLE dbo.Employee
    ADD NewColumn varchar(100) NOT NULL CONSTRAINT DF_NewColumn DEFAULT 'default value'

-- INSERT AND UPDATE SOME DATA
INSERT INTO Employee (EmployeeId, Name, Position, Department, Address, AnnualSalary)
VALUES (0, 'Jan', 'Developer', 'Datahub', 'Vesterballevej 4', 1000)

UPDATE Employee SET AnnualSalary = 2000

UPDATE Employee SET AnnualSalary = 3000

UPDATE Employee SET Name = 'Idiot@work' WHERE EmployeeID = 0

-- CREATE SOME COLUMNS
ALTER TABLE dbo.Employee ADD ToBeDeleted varchar(100)

ALTER TABLE dbo.Employee ADD NewColumn2 varchar(100)

-- POPULATE WITH DATA
INSERT INTO Employee (EmployeeId, Name, Position, Department, Address, AnnualSalary, ToBeDeleted, NewColumn2)
VALUES (1, 'Jan Duelund', 'Developer', 'Datahub', 'Vesterballevej', 1500, 'Initial data to be deleted', 'Initial data')
UPDATE Employee SET NewColumn2 = 'Updated data', ToBeDeleted = 'Delete me' WHERE EmployeeId = 1

-- ALTER COLUMN TO 'NOT NULL'
ALTER TABLE dbo.Employee SET (SYSTEM_VERSIONING = OFF)

UPDATE EmployeeHistory SET NewColumn2 = '' WHERE NewColumn2 IS NULL
UPDATE Employee SET NewColumn2 = '' WHERE NewColumn2 IS NULL

ALTER TABLE dbo.Employee SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.EmployeeHistory));

ALTER TABLE dbo.Employee ALTER COLUMN NewColumn2 varchar(100) NOT NULL
GO

-- SHOW CURRENT DATA (BEFORE DROPPING COLUMN)
SELECT * FROM Employee
SELECT * FROM EmployeeHistory

-- DROP A COLUMN
ALTER TABLE dbo.Employee SET (SYSTEM_VERSIONING = OFF)
ALTER TABLE dbo.EmployeeHistory DROP COLUMN ToBeDeleted
ALTER TABLE dbo.Employee DROP COLUMN ToBeDeleted
ALTER TABLE dbo.Employee SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.EmployeeHistory));
GO

-- TRY TO MAKE AN UPDATE WITHOUT CHANGES (NO HISTORICAL ENTRY IS MADE)
UPDATE Employee SET Position = 'Developer' WHERE EmployeeId = 1

-- SHOW CURRENT DATA
SELECT * FROM Employee
SELECT * FROM EmployeeHistory

-- DELETE A ROW
DELETE FROM Employee WHERE EmployeeId = 1

-- SHOW CURRENT DATA
SELECT * FROM Employee

-- SHOW DATA FOR A SPECIFIC TIME RANGE USING FOR SYSTEM_TIME
SELECT *
FROM Employee
FOR SYSTEM_TIME BETWEEN '2019-01-01 00:00:00.0000000' AND '2024-12-31 00:00:00.00'
ORDER BY ValidFrom;
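Besides BETWEEN, the FOR SYSTEM_TIME clause also supports AS OF (the row version valid at one instant) and ALL (current and historical rows together). The queries below are a minimal sketch against the Employee table from the script above. The timestamp is an arbitrary example value, and the period columns have to be listed explicitly because they are declared HIDDEN and are therefore not returned by SELECT *. SQL Server records the period columns in UTC.

```sql
-- Row version that was valid at a single point in time (example timestamp, UTC)
SELECT EmployeeID, Name, AnnualSalary, ValidFrom, ValidTo
FROM dbo.Employee
FOR SYSTEM_TIME AS OF '2022-06-01T12:00:00'
WHERE EmployeeID = 0;

-- Every version of the row, current and historical, oldest first
SELECT EmployeeID, Name, AnnualSalary, ValidFrom, ValidTo
FROM dbo.Employee
FOR SYSTEM_TIME ALL
WHERE EmployeeID = 0
ORDER BY ValidFrom;
```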

x-platformcoder commented 1 year ago

Temporal history is only recorded when an actual change has been made. If history is needed for every update, an extra rowversion column ("RowVersionId") can be added to the table. SQL Server updates it automatically on every update, without any further code, and this triggers a history entry. (Add the rowversion column to the "create table" statement above and everything is logged/recorded.)

-- CREATE TABLE WITH VERSIONING AND ROWVERSION
CREATE TABLE dbo.Employee
(
    [EmployeeID] int NOT NULL PRIMARY KEY CLUSTERED
  , [Name] nvarchar(100) NOT NULL
  , [Position] varchar(100) NOT NULL
  , [Department] varchar(100) NOT NULL
  , [Address] nvarchar(1024) NOT NULL
  , [AnnualSalary] decimal (10,2) NOT NULL
  , [RowVersionId] rowversion NOT NULL
  , [ValidFrom] datetime2 GENERATED ALWAYS AS ROW START HIDDEN
  , [ValidTo] datetime2 GENERATED ALWAYS AS ROW END HIDDEN
  , PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.EmployeeHistory));

HenrikSommer commented 1 year ago

Basic knowledge of Cosmos DB

HenrikSommer commented 1 year ago

I did some spike work with Temporal Tables in the Charges solution. The main conclusion is that they are possible to use, but would interfere with our solution design in an unwanted manner:

All in all, my recommendation is to not use Temporal Tables, and instead go with a custom solution based on Service Bus, a history function and custom history table(s).
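To make the custom history table idea a little more concrete, here is a rough sketch of what such a table could look like. Everything in it is an assumption for illustration only: the table name dbo.ChargeHistory, its columns and the sample values are hypothetical and not taken from the Charges solution. The idea sketched is an append-only table where the history function inserts one row per received message, and readers pick the newest row at or before a given time.

```sql
-- Hypothetical append-only history table; all names and columns are illustrative
CREATE TABLE dbo.ChargeHistory
(
    [Id] bigint IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED
  , [ChargeId] uniqueidentifier NOT NULL   -- assumed identifier of the charge
  , [Name] nvarchar(200) NOT NULL
  , [Description] nvarchar(2048) NOT NULL
  , [RecordedAt] datetime2 NOT NULL
        CONSTRAINT DF_ChargeHistory_RecordedAt DEFAULT SYSUTCDATETIME()
);
GO

DECLARE @ChargeId uniqueidentifier = NEWID();

-- What a history function might run for each message it receives:
-- rows are only ever inserted, never updated or deleted
INSERT INTO dbo.ChargeHistory (ChargeId, Name, Description)
VALUES (@ChargeId, N'Example charge', N'First version');

INSERT INTO dbo.ChargeHistory (ChargeId, Name, Description)
VALUES (@ChargeId, N'Example charge', N'Second version');

-- State of the charge as known at a given time: newest row at or before @AsOf
DECLARE @AsOf datetime2 = SYSUTCDATETIME();

SELECT TOP (1) ChargeId, Name, Description, RecordedAt
FROM dbo.ChargeHistory
WHERE ChargeId = @ChargeId AND RecordedAt <= @AsOf
ORDER BY RecordedAt DESC, Id DESC;
```

Because the table is an ordinary one owned by the solution, schema changes are plain migrations with no need to switch system versioning off and on.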

prtandrup commented 1 year ago

In my opinion, the ACs are fulfilled. The work continues with #1846.