Closed sgpeter1 closed 1 month ago
That's odd, can you try run the following SQL and just confirm the tables column does not contain multiple tables ?
;WITH schemas AS (
select oid as schema_id, nspname as schema_name
from pg_catalog.pg_namespace
)
,tables AS (
select pgc.oid as table_id
,table_name
,table_type
,table_schema
,obj_description(pgc.oid) as comment
,reltuples as rows
FROM information_schema.tables t
JOIN pg_catalog.pg_class pgc ON t.table_name = pgc.relname
Where table_type = 'BASE TABLE'
)
select schema_id as id, schema_name as name,
array(
select table_name::text
from tables
where tables.table_schema = x.schema_name
order by table_name
) as tables
from schemas as x
order by schema_name;
It's a cut down version of the introspection script that is relevant for the tables, there is no joining so not sure what would cause them to duplicate.
The speed is due to the introspection SQL that is run when you connect to gather all the schema's and data for the database you connect to. It happens on each connect, but the first is the worst, after that it happens in the background and updates once complete.
Yes, there are duplicate identical listings in the result SQL!
So just to confirm, the rows are duplicates ? So the same schema is listed twice with the same data…
Or just within the tables column at the end there is duplicate table names ?
If I just run the tables
CTE, I see this duplication. Despite this, I don't see duplication in other SQL tools. I suppose they must be using a different introspection approach?
Ahh thanks, perfect… it’s the rows column redshift must have multiple rows in that table for each table
Okay there is nothing in the postgres docs about why there would be two rows for each table in there.
I am hoping it's something specific like the index and table share the same name, hence in the first script below I filtered by kind
Can you try these two and let me know if either stop the duplication, and show the right approx counts for the rows (eg not the 0.001
select pgc.oid as table_id
,table_name
,table_type
,table_schema
,obj_description(pgc.oid) as comment
,reltuples as rows
FROM information_schema.tables t
JOIN pg_catalog.pg_class pgc ON t.table_name = pgc.relname and pgc.relkind = 'r'
Where table_type = 'BASE TABLE'
select pgc.oid as table_id
,table_name
,table_type
,table_schema
,obj_description(pgc.oid) as comment
,pgc.reltuples as rows
FROM information_schema.tables t
JOIN
(
select Max(reltuples) as reltuples , oid, relname
from pg_catalog.pg_class
group by oid, relname
) pgc
on t.table_name = pgc.relname
Where table_type = 'BASE TABLE'
Unfortunately, both queries have duplication. Part of the issue is the the table_id
is unique between the duplicates.
Maybe it's a Redshift-specific issue? The Redshift docs have examples of catalog queries that might be helpful. Incidentally, I don't have access to the STV_TBL_PERM
, but I think you might be able to sidestep that table, especially given the note that "Amazon Redshift table names are stored in both PG_TABLES and STV_TBL_PERM; where possible, use PG_TABLES to return Amazon Redshift table names."
Ahh, oid's are duplicated...
Okay one more try.. can you run this:
select oid, relname, relnamespace
from pg_catalog.pg_class
When the tablenames (relname in that query) are duplicated with the different oid's are the relnamespace also different ?
Failing that I can drop the oid, we loose comments for the table though.
Yes, the relnamespace
is different, unfortunately.
Actually that's good news.. and it matches what I just replicated...
I think you might have (or redshift has done it for you) the same table names in different schemas... and this query was not handling that...
Last check... This query does not duplicate them right (or well it will, but they will show up in a different table_schema)
select pgc.oid as table_id
,table_name
,table_type
,table_schema
,obj_description(pgc.oid) as comment
,reltuples as rows
FROM information_schema.tables t
join pg_catalog.pg_namespace n on (table_schema = nspname)
JOIN pg_catalog.pg_class pgc ON n.oid = pgc.relnamespace and t.table_name = pgc.relname and pgc.relkind = 'r'
Where table_type = 'BASE TABLE'
No, the only columns that are different here for the duplicated records are table_id
and rows
.
Okay, can you try Version 1.2.2 and let me know if it solves the duplication ?
Duplication resolved! Thank you.
Thanks for all the testing, and patience!
Not sure what could be causing this, but there are duplicate tables listed with a Redshift connection. Sometimes one with 0 rows, sometimes one with partial rows.
Also it took a very, very long time to load the schema/table explorer.