dwyl / postgres-history-spike

📜 our quest to transparently store the history of all records when an update is made
GNU General Public License v2.0
5 stars 0 forks source link

SPIKE: Automatically insert records in address_history table #1

Open nelsonic opened 5 years ago

nelsonic commented 5 years ago

The objective of this quest is to run a script in any project that uses PostgreSQL that will:

3 & 5 are related.

@RobStallion opened StackOverflow Question: https://stackoverflow.com/questions/56295703/how-to-store-table-history-in-postgresql Sadly nobody responded ... it may be worth re-asking it on https://dba.stackexchange.com ... it's also worth breaking it up into 2 (or more) separate questions. 💭

This quest is similar to https://github.com/dwyl/ecto-postgres-pubsub-spike/issues/1 But the key distinction is that the previous spike had:

Relevant SO question: https://stackoverflow.com/questions/38954139/implementing-history-of-postgresql-table

nelsonic commented 5 years ago
SELECT column_name, data_type, character_maximum_length
FROM INFORMATION_SCHEMA.COLUMNS 
WHERE table_catalog = 'cs_guide_dev' AND table_name = 'drinks' 
column_name data_type character_maximum_length
id bigint
name character varying 255
brand_id bigint
abv double precision
weighting integer
description text
entry_id character varying 255
deleted boolean
ingredients text
inserted_at timestamp without time zone
updated_at timestamp without time zone
nelsonic commented 5 years ago

Searched for: "postgres create table based on another table" Found: https://dba.stackexchange.com/questions/207097/create-table-like-another-table-but-with-additional-columns Example given:

CREATE TABLE old_table_name (
  id serial,
  my_data text,
  primary key (id)
);

CREATE TABLE new_table_name ( 
  new_col1 integer, 
  new_col2 text,
  like old_table_name including all
);

Adaptation:

CREATE TABLE address (
  id serial,
  address text,
  city text,
  postcode text,
  primary key (id)
);

CREATE TABLE IF NOT EXISTS address_history ( 
  _id serial, 
  inserted_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  like address
  primary key (_id)
);

This works for creating the table if it does not already exist. But it does not work for mirroring the two tables after the fact. We need some code that checks if the two tables match

nelsonic commented 5 years ago

Get list of tables in database:

SELECT * FROM INFORMATION_SCHEMA.tables
WHERE table_catalog = 'cs_guide_dev'

Or restricting the columns to just the ones we care about:

SELECT table_catalog, table_schema, table_name 
FROM information_schema.tables
WHERE table_catalog = 'cs_guide_dev'
AND table_schema = 'public'
table_catalog table_schema table_name
cs_guide_dev public discount_codes
cs_guide_dev public drink_images
cs_guide_dev public brands
cs_guide_dev public drink_styles
cs_guide_dev public drink_types_drink_styles
cs_guide_dev public schema_migrations
cs_guide_dev public search_log
cs_guide_dev public sponsor
cs_guide_dev public static_pages
cs_guide_dev public venues_venue_types
cs_guide_dev public drinks
cs_guide_dev public drink_types
cs_guide_dev public drinks_drink_styles
cs_guide_dev public drinks_drink_types
cs_guide_dev public venue_images
cs_guide_dev public venues_drinks
cs_guide_dev public users
cs_guide_dev public venue_types
cs_guide_dev public brand_images
cs_guide_dev public venues
cs_guide_dev public venues_users
nelsonic commented 5 years ago

Trying to assign the result of this query to a variable. Following: https://stackoverflow.com/questions/12328198/store-query-result-in-a-variable-using-in-pl-pgsql

name := (
  SELECT table_name 
  FROM information_schema.tables
  WHERE table_catalog = 'cs_guide_dev'
  AND table_schema = 'public'
);
raise notice 'Tables: %', name;

Sadly that results in an error:

ERROR:  syntax error at or near "tables"
LINE 1: name := (
        ^

Tried debugging it for a few minutes and got nothing ...

name := (SELECT d.name from drinks d where t.id = 1);

Still fails ... 😞

Need to figure out how to assign a variable in Postgres ... #HelpWanted

RobStallion commented 5 years ago

https://stackoverflow.com/questions/15714342/iterating-through-postgresql-records-how-to-reference-data-from-next-row

@nelsonic Take a look at this answer. Loops through the results of a select query

nelsonic commented 5 years ago
DO $$
BEGIN
   FOR counter IN 1..5 LOOP
   RAISE NOTICE 'Counter: %', counter;
   END LOOP;
END; $$

Output:

NOTICE:  Counter: 1
NOTICE:  Counter: 2
NOTICE:  Counter: 3
NOTICE:  Counter: 4
NOTICE:  Counter: 5
DO
DO $$

DECLARE
  tables RECORD;

BEGIN
FOR tables IN
  (SELECT table_name 
  FROM information_schema.tables
  WHERE table_catalog = 'cs_guide_dev'
  AND table_schema = 'public')
LOOP
  RAISE NOTICE 'Table: %', tables.table_name;
END LOOP;

END; $$
NOTICE:  Table: discount_codes
NOTICE:  Table: drink_images
NOTICE:  Table: brands
NOTICE:  Table: drink_styles
NOTICE:  Table: drink_types_drink_styles
NOTICE:  Table: schema_migrations
NOTICE:  Table: search_log
NOTICE:  Table: sponsor
NOTICE:  Table: static_pages
NOTICE:  Table: venues_venue_types
NOTICE:  Table: drinks
NOTICE:  Table: drink_types
NOTICE:  Table: drinks_drink_styles
NOTICE:  Table: drinks_drink_types
NOTICE:  Table: venue_images
NOTICE:  Table: venues_drinks
NOTICE:  Table: users
NOTICE:  Table: venue_types
NOTICE:  Table: brand_images
NOTICE:  Table: venues
NOTICE:  Table: venues_users
DO
RobStallion commented 5 years ago

https://stackoverflow.com/questions/36959/how-do-you-use-script-variables-in-psql

nelsonic commented 5 years ago
DO $$

DECLARE
  tables RECORD;
  table_name TEXT;
  new_table TEXT;

BEGIN
FOR tables IN
  (SELECT t.table_name 
  FROM information_schema.tables t
  WHERE t.table_catalog = 'hits_dev'
  AND t.table_schema = 'public')
LOOP
  table_name := tables.table_name; 
  new_table := table_name || '_history';

  RAISE NOTICE 'Table: %, %', table_name, new_table;
  END LOOP;
END; $$

That works and outputs the following:

NOTICE:  Table: schema_migrations, schema_migrations_history
NOTICE:  Table: users, users_history
NOTICE:  Table: repositories, repositories_history
NOTICE:  Table: useragents, useragents_history
NOTICE:  Table: address, address_history
NOTICE:  Table: hits, hits_history
DO

Now we want to add the CREATE statement:

    CREATE TABLE IF NOT EXISTS new_table ( 
      _id serial, 
      inserted_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
      like table_name
  );
RobStallion commented 5 years ago
DO $$

DECLARE
  tables RECORD;
  table_name TEXT;
  new_table TEXT;

BEGIN
FOR tables IN
  (SELECT t.table_name 
  FROM information_schema.tables t
  WHERE t.table_catalog = 'app_dev'
  AND t.table_schema = 'public'
  AND t.table_name not like '%_history'
  AND t.table_name not like 'schema_migrations'
  )
LOOP
  table_name := tables.table_name; 
  new_table := table_name || '_history';

  EXECUTE format('CREATE TABLE IF NOT EXISTS %I (_id serial, like %I)', new_table, table_name);

  END LOOP;
END; $$
nelsonic commented 5 years ago

Make the database name a variable:

DO $$

DECLARE
  db_name TEXT := 'hits_dev';
  tables RECORD;
  table_name TEXT;
  new_table TEXT;

BEGIN

FOR tables IN
  (SELECT t.table_name 
  FROM information_schema.tables t
  WHERE t.table_catalog = format('%I', db_name)
  AND t.table_schema = 'public'
  AND t.table_name not like '%_history'
  AND t.table_name not like 'schema_migrations'
  )
LOOP
  table_name := tables.table_name; 
  new_table := table_name || '_history';

  EXECUTE format('CREATE TABLE IF NOT EXISTS %I (_id serial, like %I)', new_table, table_name);

  END LOOP;
END; $$
nelsonic commented 5 years ago

Make it a function which can be invoked with a db_name argument:

CREATE OR REPLACE FUNCTION create_history(db_name TEXT)

RETURNS boolean 

AS $func$

DECLARE
  tables RECORD;
  table_name TEXT;
  new_table TEXT;

BEGIN

FOR tables IN
  (SELECT t.table_name 
  FROM information_schema.tables t
  WHERE t.table_catalog = format('%I', db_name)
  AND t.table_schema = 'public'
  AND t.table_name not like '%_history'
  AND t.table_name not like 'schema_migrations'
  )
LOOP
  table_name := tables.table_name; 
  new_table := table_name || '_history';

  EXECUTE format('CREATE TABLE IF NOT EXISTS %I (_id serial, like %I)', new_table, table_name);

  END LOOP;

  RETURN true;  -- boolean!
END; $func$

LANGUAGE plpgsql;

Thanks to: https://dba.stackexchange.com/questions/159424/how-to-use-function-parameters-in-dynamic-sql-with-execute

SELECT create_history('hits_dev');

image

We probably don't need this to be a function, but I wanted to check if it was possible. 👍

nelsonic commented 5 years ago

NEXT: Create the Trigger to insert into _history

https://stackoverflow.com/questions/38954139/implementing-history-of-postgresql-table

CREATE TABLE "ps_counters"
(
    "psid" integer NOT NULL,
    "counter" integer NOT NULL,
   "color" TEXT
);

CREATE TABLE "ps_counters_history"
(
  "id" serial PRIMARY KEY,
  "psid" integer NOT NULL,
  "counter" integer NOT NULL,
  "color" TEXT
);

CREATE OR REPLACE FUNCTION ps_counters_history_trigger()
  RETURNS trigger AS
$BODY$
  DECLARE
    table_name text;
  BEGIN
    table_name := 'ps_counters_history_' || to_char(CURRENT_DATE, 'yyyy_mm');
    IF NOT EXISTS (SELECT 1 FROM pg_class WHERE relname = table_name)
    THEN
      EXECUTE 'CREATE TABLE IF NOT EXISTS ' || table_name ||
              ' () INHERITS (ps_counters_history);';
    END IF;
    EXECUTE 'INSERT INTO ' || table_name ||
            '(psid, counter) VALUES ($1.psid, $1.counter);' USING NEW;
    RETURN NEW;
  END
$BODY$
  LANGUAGE plpgsql;

CREATE TRIGGER ps_counters_history_trigger
AFTER INSERT OR UPDATE ON ps_counters FOR EACH ROW
EXECUTE PROCEDURE ps_counters_history_trigger();

Now insert new data into ps_counters and then check ps_counters_history:

insert into ps_counters(psid, counter) VALUES (1, 2)

We want to transform:

 '(psid, counter) VALUES ($1.psid, $1.counter);' USING NEW;

Into a SELECT that gets all the column names

DO $$

DECLARE
   i RECORD;
   str text := '';
BEGIN

FOR i IN
  SELECT column_name
  FROM INFORMATION_SCHEMA.COLUMNS 
  WHERE table_catalog = 'hits_dev' AND table_name = 'ps_counters' 
LOOP

  IF str = '' THEN
    str := i.column_name;
  ELSE
    str :=  str || ',' || i.column_name;
  END IF;

END LOOP;

  RAISE NOTICE 'str: %', str;

END; $$

image

Make it a function:

DROP FUNCTION get_column_names(text,text);

CREATE OR REPLACE FUNCTION get_column_names(db_name TEXT, t_name TEXT)
RETURNS TEXT

AS $func$

DECLARE
   i RECORD;
   str text := '';
BEGIN

FOR i IN
  SELECT column_name
  FROM INFORMATION_SCHEMA.COLUMNS 
  WHERE table_catalog = format('%I', db_name) 
  AND table_name = format('%I', t_name) 
LOOP

  IF str = '' THEN
    str := i.column_name;
  ELSE
    str :=  str || ',' || i.column_name;
  END IF;

END LOOP;

  RAISE NOTICE 'str: %', str;

  RETURN '(' || str || ')';
END; $func$

LANGUAGE plpgsql;

Now try to invoke the get_column_names function:

SELECT get_column_names('hits_dev', 'ps_counters');

image

Try debugging this: image

🤣 Good thing we are pairing on this ...

DROP TRIGGER ps_counters_history_trigger ON ps_counters;

CREATE OR REPLACE FUNCTION ps_counters_history_trigger()
  RETURNS trigger AS
$BODY$
  DECLARE
    table_name text;
  BEGIN
    table_name := 'ps_counters_history_' || to_char(CURRENT_DATE, 'yyyy_mm');
    IF NOT EXISTS (SELECT 1 FROM pg_class WHERE relname = table_name)
    THEN
      EXECUTE 'CREATE TABLE IF NOT EXISTS ' || table_name ||
              ' () INHERITS (ps_counters_history);';
    END IF;
    EXECUTE 'INSERT INTO ' || table_name ||
            '(' || (SELECT get_column_names('hits_dev', 'ps_counters'))  || ')' || 
            'VALUES ($1.psid, $1.counter, $1.color);' USING NEW;
    RETURN NEW;
  END
$BODY$
  LANGUAGE plpgsql;

CREATE TRIGGER ps_counters_history_trigger
AFTER INSERT OR UPDATE ON ps_counters FOR EACH ROW
EXECUTE PROCEDURE ps_counters_history_trigger();

NEXT: replace VALUES with a similar function

the line

 'VALUES ($1.psid, $1.counter, $1.color);' USING NEW;

Needs to be replaced by a similar function.

nelsonic commented 5 years ago

This is why we write our queries in Issue comments before transferring them to Postico: image

DROP FUNCTION get_column_names_values(text,text);

CREATE OR REPLACE FUNCTION get_column_names_values(db_name TEXT, t_name TEXT)
RETURNS TEXT

AS $func$

DECLARE
   i RECORD;
   str text := '';
BEGIN

FOR i IN
  SELECT column_name
  FROM INFORMATION_SCHEMA.COLUMNS 
  WHERE table_catalog = format('%I', db_name) 
  AND table_name = format('%I', t_name) 
LOOP

  IF str = '' THEN
    str := '$1.' || i.column_name;
  ELSE
    str :=  str || ', $1.' || i.column_name;
  END IF;

END LOOP;

  RAISE NOTICE 'str: %', str;

  RETURN '(' || str || ')';
END; $func$

LANGUAGE plpgsql;
DROP TRIGGER ps_counters_history_trigger ON ps_counters;

CREATE OR REPLACE FUNCTION ps_counters_history_trigger()
  RETURNS trigger AS
$BODY$
  DECLARE
    table_name text;
  BEGIN
    table_name := 'ps_counters_history_' || to_char(CURRENT_DATE, 'yyyy_mm');
    IF NOT EXISTS (SELECT 1 FROM pg_class WHERE relname = table_name)
    THEN
      EXECUTE 'CREATE TABLE IF NOT EXISTS ' || table_name ||
              ' () INHERITS (ps_counters_history);';
    END IF;
    EXECUTE 'INSERT INTO ' || table_name ||
            (SELECT get_column_names('hits_dev', 'ps_counters'))  || 
            'VALUES' || (SELECT get_column_names_values('hits_dev', 'ps_counters')) || ';' USING NEW;
    RETURN NEW;
  END
$BODY$
  LANGUAGE plpgsql;

CREATE TRIGGER ps_counters_history_trigger
AFTER INSERT OR UPDATE ON ps_counters FOR EACH ROW
EXECUTE PROCEDURE ps_counters_history_trigger();
nelsonic commented 5 years ago

NEXT: Put It All Together! 💥

See: #2

nelsonic commented 5 years ago
EXECUTE format('CREATE TABLE IF NOT EXISTS %I
      (_id serial PRIMARY KEY, like %I)', new_table, table_name);