exyi / pg2parquet

Export PostgreSQL table or query into Parquet file
Apache License 2.0
57 stars 11 forks source link

Type source question #12

Closed mhkeller closed 9 months ago

mhkeller commented 9 months ago

This is an architecture question. What info are you pulling from postgres to determine its postgres type? I'm trying to do some parallel work in my project to better interface with this one and want to be as consistent as possible. At the moment I'm taking the udt_name from the information_schema.columns. What does this pull to determine its postgres type? I went through the source a bit but couldn't quite nail it down.

exyi commented 9 months ago

Sorry I forgot to reply to this. With each query, Postgres sends a header with the column names and types. With the Postgres client I'm using, it's client.prepare("select * from some_table")?.columns().

In the postgres protocol documentation, they call the message RowDescription. I suppose your postgres client library will have some API to get to this information.

However, if you know your query is always "select * from my_table", then using the information_schema.columns should be equivalent (AFAIK)

mhkeller commented 9 months ago

Thanks for the info! My client returns integers, which were a little inconsistent for geometry types. For anyone finding this in the future, calling from the information_schema works fine for tables but not for materialized views. Looks like you have to do a join between pg_attribute, pg_class and pg_namespace tables.

exyi commented 9 months ago

I see, Postgres only sends the OIDs over the wire. These are stable for built-ins but can be anything for user-defined types (geography is extension). It seems that the rust-postgres library queries the pg_type table anytime there it encounters an unknown type OID (source code), I didn't know that before!

mhkeller commented 9 months ago

Ah neat! I should compare their queries with what I’m currently doing to see if there are any advantages to keeping it based on the oid. Always helpful — thanks!