supabase / supa_audit

Generic Table Auditing
Apache License 2.0
649 stars 38 forks source link

Docs for common use cases #14

Open idmitrievsky opened 2 years ago

idmitrievsky commented 2 years ago

Hey, I find this extension really useful 🙏

I am not sure about how some of the use cases should be approached with such a table.

Getting the history of a particular row

If I I want to get the history of a particular row from the primary table, should I query on the primary key inside of JSONB record?

Getting the most recent version of a particular row

I think that if I can get the history then the row with the latest ts is the most recent. So no question here 😄

Getting the previous record from the most recent record

It got a little bit confusing here. I thought that record_id would be unique for every record (we could hash record contents and ts to achieve uniqueness). With that in mind we could find the previous record by looking at old_record_id of the record (a linked list essentially).

But that is not how it works 🤔 The old_record_id and record_id are always the same, because they're based on the primary keys which don't change for the tracked record. I am not sure I understand the purpose of having to identical ids for each record... And I have hard time understanding the purpose of record_id itself – I don't know record_id beforehand to select based on its value. I have to query JSONB for the primary key that I know, and while at that I can just select all rows that match (and they will all have the same record_id).

I think I'm missing something that prevents me from using this extension correctly, so I would really appreciate your answer!

olirice commented 2 years ago

I am not sure about how some of the use cases should be approached with such a table.

Good point. Yeah, the README is pretty lacking for how to query the data.

I'll answer your specific Qs here and have made a note to update the README once I have a chance to bring in a couple of helper utilities to make it easier to query.

Getting the history of a particular row

Given a setup like

  create table public.members(
      id int primary key,
      name text not null
  );

  select audit.enable_tracking('public.members');

  insert into public.members(id, name)
  values (1, 'foo');

  update public.members
      set name = 'bar'
      where id = 1;

you could query for the history of public.members for the row with id=1 using

  select
      rv.* 
  from
      public.members mem
      join audit.record_version rv
          on audit.to_record_id(
              'public.members'::regclass,
              audit.primary_key_columns('public.members'::regclass),
              to_json(mem)::jsonb
          ) = rv.record_id
  where
      mem.id = 1;
  order by
      rv.ts asc

returns

 id |              record_id               |            old_record_id             |   op   |                 ts                 | table_oid | table_schema | table_name |          record          |        old_record        
----+--------------------------------------+--------------------------------------+--------+------------------------------------+-----------+--------------+------------+--------------------------+--------------------------
  2 | 2f3ff181-f226-5566-8edc-fb11e907c122 |                                      | INSERT | Thu Mar 10 21:41:48.88712 2022 PST |     16525 | public       | members    | {"id": 1, "name": "foo"} | 
  3 | 2f3ff181-f226-5566-8edc-fb11e907c122 | 2f3ff181-f226-5566-8edc-fb11e907c122 | UPDATE | Thu Mar 10 21:41:48.88712 2022 PST |     16525 | public       | members    | {"id": 1, "name": "bar"} | {"id": 1, "name": "foo"}
(2 rows)

I thought that record_id would be unique for every record

record_id is unique for each combination of "table" and "table-pkey"

old_record_id and record_id are always the same

Almost always 😄. Theres nothing that stops a user from updating a records primary key. In that situation, they would differ.

With that in mind we could find the previous record by looking at old_record_id of the record (a linked list essentially).

Good analogy. You could do that, but queries would be slower and more difficult to write (recursive CTE)

idmitrievsky commented 2 years ago

Thanks for the examples, it's a lot more clear now 🙏

I use Postgres in pretty vanilla apps, so I haven't even thought about changing primary keys 🤯 So my point about being able to trace a changing record through its primary key inside of JSONB becomes invalid at this point and that's when old_record_id comes in.

As for your example query – do I understand correctly that it will return a portion of the history only with the same primary key? And from there I can get the rest by following old_record_id?

And regarding "getting the previous record from the most recent record" the answer is to just use ts and get the first older record with the same record_id?

It's great that you have some querying helpers in mind – looking forward to the updates!

olirice commented 2 years ago

do I understand correctly that it will return a portion of the history only with the same primary key? And from there I can get the rest by following old_record_id?

yes, that's right

And regarding "getting the previous record from the most recent record" the answer is to just use ts and get the first older record with the same record_id?

yep. the ts is the timestamp of the commit so it will be linear unless the same record is edited multiple times in a single transaction

idmitrievsky commented 2 years ago

I also want to let you know about the case when the row is deleted. After that the query you suggested:

  select
      rv.* 
  from
      public.members mem
      join audit.record_version rv
          on audit.to_record_id(
              'public.members'::regclass,
              audit.primary_key_columns('public.members'::regclass),
              to_json(mem)::jsonb
          ) = rv.record_id
  where
      mem.id = 1;
  order by
      rv.ts asc

no longer works, because there is no row in members table to join against and no row to pass into to_json(...)::jsonb to evaluate audit.to_record_id. That's very logical, of course, but at this point the only way to find a specific audit row is to query the record contents.

This can be left as is or maybe it makes sense to always create indexes into record primary keys within enable_tracking – I'm not sure.

olirice commented 2 years ago

thanks, if you need that right away, you can do it like this

  select
      rv.* 
  from
      audit.record_version rv
  where
      rv.record_id = audit.to_record_id(
          'public.members'::regclass,
          '["id"]'::jsonb,
          '{"id": 1}'::jsonb
      )
  order by
      rv.ts asc
github-actions[bot] commented 2 years ago

This issue is stale because it has been open for 30 days with no activity.

MichaelDBA commented 2 years ago

yes, would be great to add some of this kind of knowledge to the docs

overlinejota commented 2 years ago

select rv.* from audit.record_version rv where rv.record_id = audit.to_record_id( 'public.members'::regclass, ('{"id"}'), '{"id": 1}'::jsonb ) order by rv.ts asc;

-- OR

select rv.* from audit.record_version rv where rv.record_id = audit.to_record_id( 'public.members'::regclass, audit.primary_key_columns('public.members'::regclass), '{"id": 1}'::jsonb ) order by rv.ts asc;

Work in my PostgreSQL 14

bobtnt commented 1 year ago

None of the above queries would show DELETE. The script below will show DELETE, INSERT, UPDATE.

select rv.* from audit.record_version rv where rv.old_record_id = audit.to_record_id( 'public.account'::regclass, ('{"id"}'), '{"id": 1}'::jsonb ) or rv.record_id = audit.to_record_id( 'public.account'::regclass, ('{"id"}'), '{"id": 1}'::jsonb ) order by rv.ts asc