jwills / buenavista

A Postgres Proxy Server in Python
Apache License 2.0
232 stars 24 forks source link

Postgres system tables simulation #22

Open wrb2 opened 1 year ago

wrb2 commented 1 year ago

What is the level of Postgres emulation that Buena Vista aims for?

For example, if I do \d in psql to get list of tables, I get following error:

Catalog Error: Scalar Function with name pg_get_userbyid does not exist!
Did you mean "pg_get_expr"?
LINE 1: ...HEN 'partitioned index' END AS "Type", pg_catalog.PG_GET_USERBYID(c.relowner) ...

That is because the pg_ tables don't exist. DBeaver for example is broken when browsing the database as well, since it relies on them being there:

Screenshot 2023-07-12 at 14 10 46

Would it make sense to add simulation of those tables into Buena Vista or is it out of scope and only the wire protocol is the point?

jwills commented 1 year ago

Ah, great question.

So many of the pg_catalog tables do exist; DuckDB has implementations for many of them here: https://github.com/duckdb/duckdb/blob/master/src/catalog/default/default_views.cpp

The problem, of course, is that it doesn't support all of them, and even to the extent that it does, there are conceptual gaps that are hard to get around (e.g., what roles should DuckDB have?)

I don't think I (or DuckDB) can reasonably implement every thing that every BI/DB query tool needs to work with postgres in all of it's glory and complexity. On the other hand, I do think it might be possible to fill out the Presto/Trino protocol stuff (in https://github.com/jwills/buenavista/tree/main/buenavista/http ) to the point where I could (at least for any tool that knows how to talk to Presto/Trino, which isn't everything but is a lot of things) primarily b/c both the protocol and the metadata catalog are so much simpler. I've been working on some other things (https://github.com/jwills/dbt-duckdb alone keeps me reasonably busy), but when I have some more free time I would like to explore making the http protocol into a first-class thing.

wrb2 commented 1 year ago

Trino protocol might have similar challenge as the JDBC driver relies on views in system.jdbc catalog and schema but at least there's less of them than of the pg_whatever stuff and they seem to be just renames views from information_schema.

jwills commented 1 year ago

yeah exactly, it wasn't too-too hard to write them: https://github.com/jwills/buenavista/blob/main/buenavista/examples/duckdb_http.py

geoHeil commented 11 months ago
Bildschirmfoto 2023-10-10 um 16 46 43

@jwills I think I am observing a smilar issue when trying to use datagrip from Jetbrains - the raw SQL can be executed but DataGrip will not recognize the schema and offer any UI assistance - in fact it only recognizes an unnamed empty schema - even though duckdb itself will not show it

geoHeil commented 11 months ago

@jwills I tried https://github.com/jwills/buenavista/blob/main/buenavista/examples/duckdb_http.py but do not understand how to set it up to retrieve JDBC compliant connections

jwills commented 11 months ago

Hey @geoHeil for the duckdb_http stuff you want to use the presto connector in datagrip

geoHeil commented 11 months ago

This has a very similar issue that a) only unnamed schema and b) no tree drill down work for me

Bildschirmfoto 2023-10-11 um 08 57 17

I have connected with host, port and by filling in any random test user as the user.