PostgREST / postgrest

REST API for any Postgres database
https://postgrest.org
MIT License
23.42k stars 1.03k forks source link

Embedding views with sub-selects #1608

Closed wolfgangwalther closed 3 years ago

wolfgangwalther commented 4 years ago

When embedding views, the following breaks the detection algorithm:

-- this works
CREATE VIEW german_films AS
SELECT *,
       (SELECT 'hello') AS world
  FROM films
 WHERE language='de';

-- while this does not work
CREATE VIEW german_films AS
SELECT *,
       COALESCE((SELECT 'hello'), 'bye') AS world
  FROM films
 WHERE language='de';

What happens here is:

My current solution is to wrap another SELECT around the COALESCE - that works fine.

I have played a bit with changing up the query in allSourceColumns - I think the structure of pg_node_tree does not allow a proper regex solution to really get rid of all sub-selects. However, I was able to apply a couple of transformations to the node tree to bring it to JSON format - which can easily be parsed for the embeddable columns. This was only tested with PG12 so far and also not very extensively, yet. (just noted that it fails on some of the pg_catalog views apparently - so there is definitely some more tweaking needed for the transform_json part)

The query currently looks like this:


with
views as (
select
   n.nspname   as view_schema,
   c.relname   as view_name,
   r.ev_action as view_definition
from pg_class c
join pg_namespace n on n.oid = c.relnamespace
join pg_rewrite r on r.ev_class = c.oid
where c.relkind in ('v', 'm') and n.nspname = 'public' -- replace in haskell with ANY ($1)
),
transform_json as (
   select
     view_schema,
     view_name,
     replace(
      regexp_replace(
       regexp_replace(
        regexp_replace(
         replace(
          replace(
           replace(
            replace(
             view_definition,
            '({', '[{'),
           '})', '}]'),
          '} {', '},{'),
         '"', '\"'),
        '{([A-Z]+)', '{"node":"\1"', 'g'),
       ':([A-Za-z_]+) ([^[{][^:}]*)( |})', ',"\1":"\2"\3', 'g'),
      ':([A-Za-z_]+) ', ',"\1":', 'g'),
     '\ ', ' ')::jsonb AS view_definition
   from views
),
target_lists as (
   select
     view_schema,
     view_name,
     view_definition->0->'targetList' AS target_list
   from transform_json
),
target_entries as (
   select
     view_schema,
     view_name,
     jsonb_array_elements(target_list) as target_entry
   from target_lists
   where jsonb_typeof(target_list) = 'array'
),
results as (
   select
     view_schema,
     view_name,
     target_entry->>'resname' AS view_column_name,
     target_entry->>'resorigtbl' AS resorigtbl,
     target_entry->>'resorigcol' AS resorigcol
   from target_entries
)
select
   sch.nspname as table_schema,
   tbl.relname as table_name,
   col.attname as table_column_name,
   res.view_schema,
   res.view_name,
   res.view_column_name
from results res
join pg_class tbl on tbl.oid::text = res.resorigtbl
join pg_attribute col on col.attrelid = tbl.oid and col.attnum::text = res.resorigcol
join pg_namespace sch on sch.oid = tbl.relnamespace
where resorigtbl <> '0'
order by view_schema, view_name, view_column_name;

In theory this should allow a much more robust parsing of the view definition, but ONLY IF the json transformation works flawlessly.

Just thought I post that here, in case anyone wants to play with this, before I continue to.

monacoremo commented 4 years ago

This looks super interesting, converting to JSON and leveraging the Postgres JSON parser is a very smart approach!

The tricky part in this query is to get the potentially nested TARGETENTRYs right, converting to JSON solves that beautifully. I was playing with a fast but partially incorrect alternative solution here (on my megaquery2 branch). Your approach seems much more promising.

Some notes: