Police-Data-Accessibility-Project / data-sources-app

An API and UI for using and maintaining the Data Sources database
MIT License
4 stars 5 forks source link

Create views for complex queries #410

Closed maxachis closed 1 week ago

maxachis commented 3 months ago

Context

Some complex queries (looking at you, Quick Search logic) include a wide number of joins and other complex logic, which are sometimes repeated multiple times in the backend code. For example:

QUICK_SEARCH_SQL = """
    SELECT
        data_sources.airtable_uid,
        data_sources.name AS data_source_name,
        data_sources.description,
        data_sources.record_type,
        data_sources.source_url,
        data_sources.record_format,
        data_sources.coverage_start,
        data_sources.coverage_end,
        data_sources.agency_supplied,
        agencies.name AS agency_name,
        agencies.municipality,
        agencies.state_iso
    FROM
        agency_source_link
    INNER JOIN
        data_sources ON agency_source_link.airtable_uid = data_sources.airtable_uid
    INNER JOIN
        agencies ON agency_source_link.agency_described_linked_uid = agencies.airtable_uid
    INNER JOIN
        state_names ON agencies.state_iso = state_names.state_iso
    WHERE
        (data_sources.name ILIKE '%{0}%' OR data_sources.description ILIKE '%{0}%' OR data_sources.record_type ILIKE '%{0}%' OR data_sources.tags ILIKE '%{0}%') 
        AND (agencies.county_name ILIKE '%{1}%' OR substr(agencies.county_name,3,length(agencies.county_name)-4) || ' County' ILIKE '%{1}%' 
            OR agencies.state_iso ILIKE '%{1}%' OR agencies.municipality ILIKE '%{1}%' OR agencies.agency_type ILIKE '%{1}%' OR agencies.jurisdiction_type ILIKE '%{1}%' 
            OR agencies.name ILIKE '%{1}%' OR state_names.state_name ILIKE '%{1}%')
        AND data_sources.approval_status = 'approved'
        AND data_sources.url_status not in ('broken', 'none found')

"""

It may be useful to migrate some of this code to the database, and then have simpler queries in the app that simply query the relevant view.

Requirements

Tests

Docs

Open questions

josh-chamberlain commented 3 months ago

quick search is more or less deprecated in v2, but there may be other examples that would benefit. I'm all in favor of using views to normalize how we grab information from the database, and keep it DRY; however, this is partially how I conceive of the database client—isn't it partially to prevent repeating ourselves on the back end? Is this just a different thing?

maxachis commented 3 months ago

@josh-chamberlain A valid point! And to some degree I concede it's a matter of preference, and we can certainly consolidate some of the backend logic without resorting to views. That being said, I think there are a few things which favor views:

  1. If we want to extend our dynamic role-based column access logic to complex joins of multiple tables, then we can consolidate those joins into a view which we can then define variable access permissions for. Indeed, much of my new backend logic is designed to accommodate variable column access for tables or views.
  2. Views will allow us to more neatly organize some common queries. To give an example, if we keep the query in the database, it will exist as a string constant alongside other data in one of several modules. By contrast, if we keep the query in a view, we can easily peruse and run them in PgAdmin.
  3. As more of the db client logic migrates to dynamic query generation (and possibly SQL Alchemy), it's easier to run and design such methods if they are performed on simple relations (including views), rather than complex joins of multiple relations.

Granted, the flip-side of this is that queries are somewhat less visible -- you'll need to be able to view them in the database rather than in a repo, and version control suffers as a result. So that's worth taking into account as well.

josh-chamberlain commented 3 months ago

Gotcha, thanks @maxachis

  1. nice. complex joins are probably the golden-est use of this. we love a view for that.
  2. true!
  3. true!

Just to help keep my mental model accurate, I made an ADR here talking about the differences. Someday, we may want a little doc somewhere for explaining which views we're using + for what.

maxachis commented 1 week ago

I'm gonna go ahead and marked this one closed, as while there are 1-2 queries which could be viewified, by and large, complex logic is increasingly present in the database!

josh-chamberlain commented 1 week ago

@maxachis does this imply that SQLAlchemy obviates the need for views, or something else?

maxachis commented 1 week ago

@josh-chamberlain This means that we have, over the course of making various refactors and in some cases SQLAlchemy changes, added various database views to the logic, so we've already resolved this in the background! Not everything is viewified, but by and large complex logic is moved to views.