starburstdata / metabase-driver

Starburst Metabase driver
Apache License 2.0
65 stars 10 forks source link

Driver fires queries twice on save #71

Closed paoliniluis closed 1 year ago

paoliniluis commented 1 year ago

To reproduce: 1) Connect Metabase with a postgres db 2 times, first one directly (without Trino in the middle), second one via Trino (as a catalog) 2) Create a SQL question (e.g. with the sample dataset): select from "public"."people" and save it without running 3) now edit the SQL question (e.g. select "public"."people". from "public"."people") and see the difference:

If connecting Metabase with a postgres directly, Metabase will open 2 cursors and fire the query once. This is the log:

2023-02-14 12:59:01.149 UTC [35816] LOG:  execute <unnamed>: SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL READ COMMITTED
2023-02-14 12:59:41.637 UTC [35816] LOG:  execute <unnamed>: SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
2023-02-14 12:59:41.637 UTC [35816] LOG:  execute <unnamed>: BEGIN READ ONLY
2023-02-14 12:59:41.637 UTC [35816] LOG:  execute <unnamed>/C_11: -- Metabase:: userID: 1 queryType: native queryHash: 509edd13745b85340d4631ddc5825aedc2c622f3844690176945c47bb8b2eff8
    select people.* from people
2023-02-14 12:59:41.641 UTC [35816] LOG:  execute S_2: ROLLBACK
2023-02-14 12:59:41.641 UTC [35816] LOG:  execute <unnamed>: SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL READ COMMITTED
2023-02-14 12:59:41.846 UTC [35816] LOG:  execute <unnamed>: SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
2023-02-14 12:59:41.846 UTC [35816] LOG:  execute <unnamed>: BEGIN READ ONLY
2023-02-14 12:59:41.847 UTC [35816] LOG:  execute <unnamed>/C_12: -- Metabase:: userID: 1 queryType: native queryHash: d499de6f09fb9e4aab3045369dca49286961b375cf905f1c69d7ef0a5624fd72
    select people.* from people
2023-02-14 12:59:41.963 UTC [35816] LOG:  execute fetch from <unnamed>/C_12: -- Metabase:: userID: 1 queryType: native queryHash: d499de6f09fb9e4aab3045369dca49286961b375cf905f1c69d7ef0a5624fd72
    select people.* from people
2023-02-14 12:59:42.066 UTC [35816] LOG:  execute fetch from <unnamed>/C_12: -- Metabase:: userID: 1 queryType: native queryHash: d499de6f09fb9e4aab3045369dca49286961b375cf905f1c69d7ef0a5624fd72
    select people.* from people
2023-02-14 12:59:42.167 UTC [35816] LOG:  execute fetch from <unnamed>/C_12: -- Metabase:: userID: 1 queryType: native queryHash: d499de6f09fb9e4aab3045369dca49286961b375cf905f1c69d7ef0a5624fd72
    select people.* from people
2023-02-14 12:59:42.297 UTC [35816] LOG:  execute S_2: ROLLBACK
2023-02-14 12:59:42.297 UTC [35816] LOG:  execute <unnamed>: SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL READ COMMITTED

But if you do the same via Trino, you'll see the following log lines: Step 2: Trino is syncing the table schema, creating the cursor AND running the question, even though in Metabase

2023-02-14 13:12:34.051 UTC [37165] LOG:  execute <unnamed>: SELECT NULL AS TABLE_CAT, n.nspname AS TABLE_SCHEM, c.relname AS TABLE_NAME,  CASE n.nspname ~ '^pg_' OR n.nspname = 'information_schema'  WHEN true THEN CASE  WHEN n.nspname = 'pg_catalog' OR n.nspname = 'information_schema' THEN CASE c.relkind   WHEN 'r' THEN 'SYSTEM TABLE'   WHEN 'v' THEN 'SYSTEM VIEW'   WHEN 'i' THEN 'SYSTEM INDEX'   ELSE NULL   END  WHEN n.nspname = 'pg_toast' THEN CASE c.relkind   WHEN 'r' THEN 'SYSTEM TOAST TABLE'   WHEN 'i' THEN 'SYSTEM TOAST INDEX'   ELSE NULL   END  ELSE CASE c.relkind   WHEN 'r' THEN 'TEMPORARY TABLE'   WHEN 'p' THEN 'TEMPORARY TABLE'   WHEN 'i' THEN 'TEMPORARY INDEX'   WHEN 'S' THEN 'TEMPORARY SEQUENCE'   WHEN 'v' THEN 'TEMPORARY VIEW'   ELSE NULL   END  END  WHEN false THEN CASE c.relkind  WHEN 'r' THEN 'TABLE'  WHEN 'p' THEN 'PARTITIONED TABLE'  WHEN 'i' THEN 'INDEX'  WHEN 'P' then 'PARTITIONED INDEX'  WHEN 'S' THEN 'SEQUENCE'  WHEN 'v' THEN 'VIEW'  WHEN 'c' THEN 'TYPE'  WHEN 'f' THEN 'FOREIGN TABLE'  WHEN 'm' THEN 'MATERIALIZED VIEW'  ELSE NULL  END  ELSE NULL  END  AS TABLE_TYPE, d.description AS REMARKS,  '' as TYPE_CAT, '' as TYPE_SCHEM, '' as TYPE_NAME, '' AS SELF_REFERENCING_COL_NAME, '' AS REF_GENERATION  FROM pg_catalog.pg_namespace n, pg_catalog.pg_class c  LEFT JOIN pg_catalog.pg_description d ON (c.oid = d.objoid AND d.objsubid = 0  and d.classoid = 'pg_class'::regclass)  WHERE c.relnamespace = n.oid  AND n.nspname LIKE 'public' AND c.relname LIKE 'people' AND (false  OR ( c.relkind = 'r' AND n.nspname !~ '^pg_' AND n.nspname <> 'information_schema' )  OR ( c.relkind = 'p' AND n.nspname !~ '^pg_' AND n.nspname <> 'information_schema' )  OR ( c.relkind = 'v' AND n.nspname <> 'pg_catalog' AND n.nspname <> 'information_schema' )  OR ( c.relkind = 'm' )  OR ( c.relkind = 'f' ) )  ORDER BY TABLE_TYPE,TABLE_SCHEM,TABLE_NAME 
2023-02-14 13:12:34.054 UTC [37165] LOG:  execute <unnamed>: SELECT * FROM (SELECT n.nspname,c.relname,a.attname,a.atttypid,a.attnotnull OR (t.typtype = 'd' AND t.typnotnull) AS attnotnull,a.atttypmod,a.attlen,t.typtypmod,row_number() OVER (PARTITION BY a.attrelid ORDER BY a.attnum) AS attnum, nullif(a.attidentity, '') as attidentity,nullif(a.attgenerated, '') as attgenerated,pg_catalog.pg_get_expr(def.adbin, def.adrelid) AS adsrc,dsc.description,t.typbasetype,t.typtype  FROM pg_catalog.pg_namespace n  JOIN pg_catalog.pg_class c ON (c.relnamespace = n.oid)  JOIN pg_catalog.pg_attribute a ON (a.attrelid=c.oid)  JOIN pg_catalog.pg_type t ON (a.atttypid = t.oid)  LEFT JOIN pg_catalog.pg_attrdef def ON (a.attrelid=def.adrelid AND a.attnum = def.adnum)  LEFT JOIN pg_catalog.pg_description dsc ON (c.oid=dsc.objoid AND a.attnum = dsc.objsubid)  LEFT JOIN pg_catalog.pg_class dc ON (dc.oid=dsc.classoid AND dc.relname='pg_class')  LEFT JOIN pg_catalog.pg_namespace dn ON (dc.relnamespace=dn.oid AND dn.nspname='pg_catalog')  WHERE c.relkind in ('r','p','v','f','m') and a.attnum > 0 AND NOT a.attisdropped  AND n.nspname LIKE 'public' AND c.relname LIKE 'people') c WHERE true  ORDER BY nspname,c.relname,attnum 
2023-02-14 13:12:34.062 UTC [37165] LOG:  execute <unnamed>: BEGIN READ ONLY
2023-02-14 13:12:34.062 UTC [37165] LOG:  execute <unnamed>/C_1: SELECT "id", "address", "email", "password", "name", "city", "longitude", "state", "source", "birth_date", "zip", "latitude", "created_at" FROM "public"."people"
2023-02-14 13:12:34.067 UTC [37165] LOG:  execute fetch from <unnamed>/C_1: SELECT "id", "address", "email", "password", "name", "city", "longitude", "state", "source", "birth_date", "zip", "latitude", "created_at" FROM "public"."people"
2023-02-14 13:12:34.072 UTC [37165] LOG:  execute fetch from <unnamed>/C_1: SELECT "id", "address", "email", "password", "name", "city", "longitude", "state", "source", "birth_date", "zip", "latitude", "created_at" FROM "public"."people"

Step 3: Trino syncs again the table schema, now twice, and creates 2 cursors and also fires the query twice

2023-02-14 13:18:21.065 UTC [37412] LOG:  execute <unnamed>: SET extra_float_digits = 3
2023-02-14 13:18:21.065 UTC [37412] LOG:  execute <unnamed>: SET application_name = 'PostgreSQL JDBC Driver'
2023-02-14 13:18:21.067 UTC [37412] LOG:  execute <unnamed>: SELECT NULL AS TABLE_CAT, n.nspname AS TABLE_SCHEM, c.relname AS TABLE_NAME,  CASE n.nspname ~ '^pg_' OR n.nspname = 'information_schema'  WHEN true THEN CASE  WHEN n.nspname = 'pg_catalog' OR n.nspname = 'information_schema' THEN CASE c.relkind   WHEN 'r' THEN 'SYSTEM TABLE'   WHEN 'v' THEN 'SYSTEM VIEW'   WHEN 'i' THEN 'SYSTEM INDEX'   ELSE NULL   END  WHEN n.nspname = 'pg_toast' THEN CASE c.relkind   WHEN 'r' THEN 'SYSTEM TOAST TABLE'   WHEN 'i' THEN 'SYSTEM TOAST INDEX'   ELSE NULL   END  ELSE CASE c.relkind   WHEN 'r' THEN 'TEMPORARY TABLE'   WHEN 'p' THEN 'TEMPORARY TABLE'   WHEN 'i' THEN 'TEMPORARY INDEX'   WHEN 'S' THEN 'TEMPORARY SEQUENCE'   WHEN 'v' THEN 'TEMPORARY VIEW'   ELSE NULL   END  END  WHEN false THEN CASE c.relkind  WHEN 'r' THEN 'TABLE'  WHEN 'p' THEN 'PARTITIONED TABLE'  WHEN 'i' THEN 'INDEX'  WHEN 'P' then 'PARTITIONED INDEX'  WHEN 'S' THEN 'SEQUENCE'  WHEN 'v' THEN 'VIEW'  WHEN 'c' THEN 'TYPE'  WHEN 'f' THEN 'FOREIGN TABLE'  WHEN 'm' THEN 'MATERIALIZED VIEW'  ELSE NULL  END  ELSE NULL  END  AS TABLE_TYPE, d.description AS REMARKS,  '' as TYPE_CAT, '' as TYPE_SCHEM, '' as TYPE_NAME, '' AS SELF_REFERENCING_COL_NAME, '' AS REF_GENERATION  FROM pg_catalog.pg_namespace n, pg_catalog.pg_class c  LEFT JOIN pg_catalog.pg_description d ON (c.oid = d.objoid AND d.objsubid = 0  and d.classoid = 'pg_class'::regclass)  WHERE c.relnamespace = n.oid  AND n.nspname LIKE 'public' AND c.relname LIKE 'people' AND (false  OR ( c.relkind = 'r' AND n.nspname !~ '^pg_' AND n.nspname <> 'information_schema' )  OR ( c.relkind = 'p' AND n.nspname !~ '^pg_' AND n.nspname <> 'information_schema' )  OR ( c.relkind = 'v' AND n.nspname <> 'pg_catalog' AND n.nspname <> 'information_schema' )  OR ( c.relkind = 'm' )  OR ( c.relkind = 'f' ) )  ORDER BY TABLE_TYPE,TABLE_SCHEM,TABLE_NAME 
2023-02-14 13:18:21.069 UTC [37412] LOG:  execute <unnamed>: SELECT * FROM (SELECT n.nspname,c.relname,a.attname,a.atttypid,a.attnotnull OR (t.typtype = 'd' AND t.typnotnull) AS attnotnull,a.atttypmod,a.attlen,t.typtypmod,row_number() OVER (PARTITION BY a.attrelid ORDER BY a.attnum) AS attnum, nullif(a.attidentity, '') as attidentity,nullif(a.attgenerated, '') as attgenerated,pg_catalog.pg_get_expr(def.adbin, def.adrelid) AS adsrc,dsc.description,t.typbasetype,t.typtype  FROM pg_catalog.pg_namespace n  JOIN pg_catalog.pg_class c ON (c.relnamespace = n.oid)  JOIN pg_catalog.pg_attribute a ON (a.attrelid=c.oid)  JOIN pg_catalog.pg_type t ON (a.atttypid = t.oid)  LEFT JOIN pg_catalog.pg_attrdef def ON (a.attrelid=def.adrelid AND a.attnum = def.adnum)  LEFT JOIN pg_catalog.pg_description dsc ON (c.oid=dsc.objoid AND a.attnum = dsc.objsubid)  LEFT JOIN pg_catalog.pg_class dc ON (dc.oid=dsc.classoid AND dc.relname='pg_class')  LEFT JOIN pg_catalog.pg_namespace dn ON (dc.relnamespace=dn.oid AND dn.nspname='pg_catalog')  WHERE c.relkind in ('r','p','v','f','m') and a.attnum > 0 AND NOT a.attisdropped  AND n.nspname LIKE 'public' AND c.relname LIKE 'people') c WHERE true  ORDER BY nspname,c.relname,attnum 
2023-02-14 13:18:21.090 UTC [37412] LOG:  execute <unnamed>: BEGIN READ ONLY
2023-02-14 13:18:21.091 UTC [37412] LOG:  execute <unnamed>/C_1: SELECT "id", "address", "email", "password", "name", "city", "longitude", "state", "source", "birth_date", "zip", "latitude", "created_at" FROM "public"."people"
2023-02-14 13:18:21.096 UTC [37412] LOG:  execute fetch from <unnamed>/C_1: SELECT "id", "address", "email", "password", "name", "city", "longitude", "state", "source", "birth_date", "zip", "latitude", "created_at" FROM "public"."people"
2023-02-14 13:18:21.101 UTC [37412] LOG:  execute fetch from <unnamed>/C_1: SELECT "id", "address", "email", "password", "name", "city", "longitude", "state", "source", "birth_date", "zip", "latitude", "created_at" FROM "public"."people"
2023-02-14 13:18:21.382 UTC [37413] LOG:  execute <unnamed>: SET extra_float_digits = 3
2023-02-14 13:18:21.382 UTC [37413] LOG:  execute <unnamed>: SET application_name = 'PostgreSQL JDBC Driver'
2023-02-14 13:18:21.385 UTC [37413] LOG:  execute <unnamed>: SELECT NULL AS TABLE_CAT, n.nspname AS TABLE_SCHEM, c.relname AS TABLE_NAME,  CASE n.nspname ~ '^pg_' OR n.nspname = 'information_schema'  WHEN true THEN CASE  WHEN n.nspname = 'pg_catalog' OR n.nspname = 'information_schema' THEN CASE c.relkind   WHEN 'r' THEN 'SYSTEM TABLE'   WHEN 'v' THEN 'SYSTEM VIEW'   WHEN 'i' THEN 'SYSTEM INDEX'   ELSE NULL   END  WHEN n.nspname = 'pg_toast' THEN CASE c.relkind   WHEN 'r' THEN 'SYSTEM TOAST TABLE'   WHEN 'i' THEN 'SYSTEM TOAST INDEX'   ELSE NULL   END  ELSE CASE c.relkind   WHEN 'r' THEN 'TEMPORARY TABLE'   WHEN 'p' THEN 'TEMPORARY TABLE'   WHEN 'i' THEN 'TEMPORARY INDEX'   WHEN 'S' THEN 'TEMPORARY SEQUENCE'   WHEN 'v' THEN 'TEMPORARY VIEW'   ELSE NULL   END  END  WHEN false THEN CASE c.relkind  WHEN 'r' THEN 'TABLE'  WHEN 'p' THEN 'PARTITIONED TABLE'  WHEN 'i' THEN 'INDEX'  WHEN 'P' then 'PARTITIONED INDEX'  WHEN 'S' THEN 'SEQUENCE'  WHEN 'v' THEN 'VIEW'  WHEN 'c' THEN 'TYPE'  WHEN 'f' THEN 'FOREIGN TABLE'  WHEN 'm' THEN 'MATERIALIZED VIEW'  ELSE NULL  END  ELSE NULL  END  AS TABLE_TYPE, d.description AS REMARKS,  '' as TYPE_CAT, '' as TYPE_SCHEM, '' as TYPE_NAME, '' AS SELF_REFERENCING_COL_NAME, '' AS REF_GENERATION  FROM pg_catalog.pg_namespace n, pg_catalog.pg_class c  LEFT JOIN pg_catalog.pg_description d ON (c.oid = d.objoid AND d.objsubid = 0  and d.classoid = 'pg_class'::regclass)  WHERE c.relnamespace = n.oid  AND n.nspname LIKE 'public' AND c.relname LIKE 'people' AND (false  OR ( c.relkind = 'r' AND n.nspname !~ '^pg_' AND n.nspname <> 'information_schema' )  OR ( c.relkind = 'p' AND n.nspname !~ '^pg_' AND n.nspname <> 'information_schema' )  OR ( c.relkind = 'v' AND n.nspname <> 'pg_catalog' AND n.nspname <> 'information_schema' )  OR ( c.relkind = 'm' )  OR ( c.relkind = 'f' ) )  ORDER BY TABLE_TYPE,TABLE_SCHEM,TABLE_NAME 
2023-02-14 13:18:21.387 UTC [37413] LOG:  execute <unnamed>: SELECT * FROM (SELECT n.nspname,c.relname,a.attname,a.atttypid,a.attnotnull OR (t.typtype = 'd' AND t.typnotnull) AS attnotnull,a.atttypmod,a.attlen,t.typtypmod,row_number() OVER (PARTITION BY a.attrelid ORDER BY a.attnum) AS attnum, nullif(a.attidentity, '') as attidentity,nullif(a.attgenerated, '') as attgenerated,pg_catalog.pg_get_expr(def.adbin, def.adrelid) AS adsrc,dsc.description,t.typbasetype,t.typtype  FROM pg_catalog.pg_namespace n  JOIN pg_catalog.pg_class c ON (c.relnamespace = n.oid)  JOIN pg_catalog.pg_attribute a ON (a.attrelid=c.oid)  JOIN pg_catalog.pg_type t ON (a.atttypid = t.oid)  LEFT JOIN pg_catalog.pg_attrdef def ON (a.attrelid=def.adrelid AND a.attnum = def.adnum)  LEFT JOIN pg_catalog.pg_description dsc ON (c.oid=dsc.objoid AND a.attnum = dsc.objsubid)  LEFT JOIN pg_catalog.pg_class dc ON (dc.oid=dsc.classoid AND dc.relname='pg_class')  LEFT JOIN pg_catalog.pg_namespace dn ON (dc.relnamespace=dn.oid AND dn.nspname='pg_catalog')  WHERE c.relkind in ('r','p','v','f','m') and a.attnum > 0 AND NOT a.attisdropped  AND n.nspname LIKE 'public' AND c.relname LIKE 'people') c WHERE true  ORDER BY nspname,c.relname,attnum 
2023-02-14 13:18:21.394 UTC [37413] LOG:  execute <unnamed>: BEGIN READ ONLY
2023-02-14 13:18:21.394 UTC [37413] LOG:  execute <unnamed>/C_1: SELECT "id", "address", "email", "password", "name", "city", "longitude", "state", "source", "birth_date", "zip", "latitude", "created_at" FROM "public"."people"
2023-02-14 13:18:21.399 UTC [37413] LOG:  execute fetch from <unnamed>/C_1: SELECT "id", "address", "email", "password", "name", "city", "longitude", "state", "source", "birth_date", "zip", "latitude", "created_at" FROM "public"."people"
2023-02-14 13:18:21.404 UTC [37413] LOG:  execute fetch from <unnamed>/C_1: SELECT "id", "address", "email", "password", "name", "city", "longitude", "state", "source", "birth_date", "zip", "latitude", "created_at" FROM "public"."people"
lpoulain commented 1 year ago

Just to confirm, do you see this only when you save the question or also when you run it?

paoliniluis commented 1 year ago

Just when editing the question

lpoulain commented 1 year ago

It turns out Metabase is calling the Trino query twice. The first time is supposed to be just a cursor, but that feature doesn't exist with Trino.

Because there is no way to differentiate the call which should trigger a query and the call which does not (at least yet), closing this issue.