Police-Data-Accessibility-Project / data-sources-app

An API and UI for using and maintaining the Data Sources database
MIT License

Update schema for `agencies` table #426

Closed maxachis closed 2 weeks ago

maxachis commented 3 months ago

Context

The agencies table should likely be updated to reflect:

  1. The fact that we no longer use Airtable
  2. A variety of best practices for database design

Requirements

Possible columns to convert

Columns to rename

Columns to replace:

Columns to alter:

Columns to remove without replacement

Tests

Docs

Open questions

maxachis commented 3 months ago

@josh-chamberlain Let me know if these schema updates make sense and/or what modifications you'd like to see.

maxachis commented 3 months ago

@josh-chamberlain Additionally, these changes would introduce one breaking change, although it is an odd one.

If you'll recall, in the primary branch we created an endpoint for caching URL search results. The middleware for that endpoint writes to a database table called agency_url_search_cache, which has agencies.airtable_uid as a foreign key. If we replace that foreign key with one pointing to agencies.id, the single script that uses airtable_uid will break until it is modified to reference id.

Note that this is not a problem right away, because these changes are all for the v2 version, and the HomepageSearchCache endpoint exists only in v1. Once we eventually migrate all the code to v2, though, it will become an issue. So we could make the schema change now and open a separate issue to update that script later (and to migrate HomepageSearchCache to v2).
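To make the dependency described above concrete, here is a minimal sketch of how one might list every table whose foreign key still points at agencies.airtable_uid before switching to id. It runs against an in-memory SQLite database rather than the project's PostgreSQL instance, so it only illustrates the idea; against PostgreSQL the same question would be asked of the system catalogs. The function name `tables_referencing`, the sample schema, and the cache column name `agency_airtable_uid` are assumptions made for illustration and do not come from the codebase.

```python
import sqlite3


def tables_referencing(conn, parent_table, parent_column):
    """Return sorted (child_table, child_column) pairs whose foreign key
    points at parent_table.parent_column."""
    found = []
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    ]
    for table in tables:
        # PRAGMA arguments cannot be bound as parameters, so the table name
        # (read from sqlite_master above) is quoted into the statement.
        for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            # Row layout: id, seq, table, from, to, on_update, on_delete, match
            if fk[2].lower() == parent_table.lower() and fk[4] == parent_column:
                found.append((table, fk[3]))
    return sorted(found)


def make_sample_db():
    """Build a tiny stand-in schema. Column names in the cache table are
    hypothetical; only the table names come from the discussion."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE state_names (state_iso TEXT PRIMARY KEY);
        CREATE TABLE agencies (airtable_uid TEXT PRIMARY KEY);
        CREATE TABLE agency_source_link (
            agency_described_linked_uid TEXT REFERENCES agencies (airtable_uid)
        );
        CREATE TABLE agency_url_search_cache (
            agency_airtable_uid TEXT REFERENCES agencies (airtable_uid)
        );
        """
    )
    return conn


if __name__ == "__main__":
    for child in tables_referencing(make_sample_db(), "agencies", "airtable_uid"):
        print(child)
```

Every pair this returns is a column that would need the same add, backfill, and swap treatment before the old key can go away.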

josh-chamberlain commented 3 months ago

@maxachis I edited your initial comment slightly to keep the "last modified" concept. I think this is pretty firmly part of #32 because schema is supposed to flow downstream from Airtable, would you agree? Is there a pressing reason to make these changes before that issue?

maxachis commented 3 months ago

@josh-chamberlain

> @maxachis I edited your initial comment slightly to keep the "last modified" concept. I think this is pretty firmly part of #32 because schema is supposed to flow downstream from Airtable, would you agree?

Potentially! Honestly, it may make sense for the Airtable replacement and the schema design to inform each other. We may not be able to get a good sense of what the database should look like until we know how it works with Retool; conversely, how we design Retool may be informed by the schema changes.

maxachis commented 3 months ago

As a note for future Max, here are some of the tentative scripts I would use in this schema update (and which would likely inform other schema updates):


-- Make state_iso a foreign key
ALTER TABLE public.agencies
ADD CONSTRAINT agencies_state_iso_fkey FOREIGN KEY (state_iso)
REFERENCES public.state_names (state_iso) MATCH SIMPLE;

-- Add the new autogenerated `id` column (SERIAL also fills in existing rows)
ALTER TABLE public.agencies
ADD COLUMN id SERIAL;

-- A unique constraint is required before other tables can reference `id`
ALTER TABLE public.agencies
ADD CONSTRAINT agencies_id_unique UNIQUE (id);

-- TODO: Update other foreign keys that reference `airtable_uid` to reference `id`

-- agency_source_link: add the new column, backfill it, then swap the key
ALTER TABLE public.agency_source_link
ADD COLUMN agency_id integer;

UPDATE public.agency_source_link AS link
SET agency_id = a.id
FROM public.agencies AS a
WHERE link.agency_described_linked_uid = a.airtable_uid;

ALTER TABLE public.agency_source_link
ADD CONSTRAINT agency_source_link_agency_id_fkey FOREIGN KEY (agency_id)
REFERENCES public.agencies (id) MATCH SIMPLE;

ALTER TABLE public.agency_source_link
DROP COLUMN agency_described_linked_uid;

-- agency_url_search_cache

-- TBD

-- THEN drop primary key (only once nothing references `airtable_uid` anymore)

ALTER TABLE public.agencies
DROP CONSTRAINT agencies_pkey;
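
Since the agency_source_link script drops agency_described_linked_uid right after the backfill, it may be worth rehearsing the backfill and counting unmatched rows first. The following is a rough sketch of that rehearsal against an in-memory SQLite database, not the production PostgreSQL one, so the UPDATE uses a portable correlated subquery in place of `UPDATE ... FROM`. The helper names, the `link_id` column, and the sample `rec...` values are hypothetical.

```python
import sqlite3


def backfill_agency_id(conn):
    """Add agency_id to agency_source_link, backfill it from agencies, and
    return the number of link rows that matched no agency."""
    conn.execute("ALTER TABLE agency_source_link ADD COLUMN agency_id INTEGER")
    conn.execute(
        """
        UPDATE agency_source_link
        SET agency_id = (
            SELECT a.id
            FROM agencies a
            WHERE a.airtable_uid = agency_source_link.agency_described_linked_uid
        )
        """
    )
    (unmatched,) = conn.execute(
        "SELECT COUNT(*) FROM agency_source_link WHERE agency_id IS NULL"
    ).fetchone()
    return unmatched


def make_sample_db():
    """Create stand-in tables; the third link row deliberately has no agency."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE agencies (id INTEGER UNIQUE, airtable_uid TEXT PRIMARY KEY);
        CREATE TABLE agency_source_link (
            link_id INTEGER PRIMARY KEY,  -- hypothetical column, for ordering
            agency_described_linked_uid TEXT
        );
        INSERT INTO agencies VALUES (1, 'recAAA'), (2, 'recBBB');
        INSERT INTO agency_source_link (agency_described_linked_uid)
        VALUES ('recAAA'), ('recBBB'), ('recZZZ');
        """
    )
    return conn


if __name__ == "__main__":
    unmatched = backfill_agency_id(make_sample_db())
    print(f"link rows without a matching agency: {unmatched}")
```

A non-zero count means some links would lose their agency reference once the old column is dropped, so those rows would need attention before running the DROP COLUMN step.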

maxachis commented 2 weeks ago

We ended up already taking care of a lot of this in other issues, so this one can be counted as completed!