JordanMarr / SqlHydra

SqlHydra is a suite of NuGet packages for working with databases in F# including code generation tools and query expressions.
MIT License
212 stars 20 forks source link

Is it possible to support materialized views from Postgres? #84

Closed Janna112358 closed 1 month ago

Janna112358 commented 4 months ago

Hi!

We're using a Postgres database with some views to read from. I would like to make them materialized views instead for performance reasons (the underlying data isn't updated very often). It seems that sqlhydra.cli doesn't generate any types for newly added materialized views. I also tried removing an existing view and adding a materialized view with the same name and columns in the db, and using the already generated types file, but that resulted in runtime errors in the sqhlyra.query code.

Is support for materialized views wanted and is it even possible?

JordanMarr commented 4 months ago

Hi Janna,

If you can query the materialized view metadata, it should be possible.

https://github.com/JordanMarr/SqlHydra/blob/main/src/SqlHydra.Cli/Npgsql/NpgsqlSchemaProvider.fs

Janna112358 commented 4 months ago

I made a little test program and I am able to get the metadata on the materialized views in my local db just fine with:

conn.GetSchema("MaterializedViews")

and with the same mapping as done for the regular views in line 102 I can get the database name (Catalog), schema name and table name as expected.

Not sure what happens if we treat materialized views exactly the same as regular views in the rest of the code. Querying them in plain Postgres does work the same 🤔

I would like to try this out in the actual code, but I would need some help getting started with the dev environment. I'm able to open the dev-container thing, but after that the amount of warnings and errors that come up are a bit overwhelming 😨

JordanMarr commented 4 months ago

I think the dev Container needs some work. Don’t use it. Just do “docker-compose up”. Unfortunately it will take like 10m for the Oracle db to setup the first time, so just walk away and make coffee or something. (It will be super fast after that.)

JordanMarr commented 4 months ago

Any luck?

JordanMarr commented 4 months ago

I was just in the code base today, so I added this for you to try as a beta release: https://github.com/JordanMarr/SqlHydra/discussions/85

Janna112358 commented 4 months ago

The oracle setup was running for about 3 hours on Tuesday and didn't finish 😮 It hadn't printed any new messages for a while when I stopped it. (Then my day ended and I was off yesterday...). I will give it another whirl today in a detached terminal and see what happens.

Thanks for the update! I will try it out and let you know how it goes 😄

Janna112358 commented 4 months ago

I updated sqlhydra.cli to version 2.4.0-beta.1 but there don't seem to be any changes to the generated types file (except for the comment that says what hydra version was used to generate the file).

EDIT: I have a minimal test project to try out the beta release.

No luck with docker compose up so far. This is the last part of the output I get, after that it just seems to stall? image

JordanMarr commented 4 months ago

When I added the "MaterializedViews" query, it did pull back a few materialized views from my posgres AdventureWorks database. However, none of them had a columns, which generated table records with no columns (which will create a build error). So, I added code to ignore any tables/views that had no columns.

Currently, columns are pulled from here and then associated back to the table/view:

let sColumns = conn.GetSchema("Columns", cfg.Filters.TryGetRestrictionsByKey("Columns"))

Perhaps there is a different call to pull back postgres columns?

Re: Docker issues: Since you are not concerned with Oracle (and Oracle is always the problem child in this project), you could try just loading the postgres stuff by itself. I think you can do that with:

docker-compose up --build postgresql

But TBH, you probably can probably just figure out how to pull the materialized view columns using your own db instance.

Janna112358 commented 4 months ago

That's interesting! I'll keep investigating with the columns. It might take a while cause some more urgent things just came up.

RE docker: Let's park it for now. I'll add the Npgsql code to get metadata from materialized views in the same test project so it's visible.

Janna112358 commented 4 months ago

Alright so, the materialized view columns aren't showing up currently, cause they're not part of the postgres "information schema" (no idea why, docs say so). But yes there is another way, I boggled together the following query:

SELECT 
    pg_namespace.nspname AS table_schema,
    pg_class.relname AS table_name, 
    pg_class.relkind, 
    pg_attribute.attname AS column_name,
    pg_attribute.attnum AS ordinal_position,
    pg_type.typname AS data_type,
    pg_attribute.attnotnull AS not_null
FROM pg_class 
INNER JOIN pg_namespace on (pg_class.relnamespace = pg_namespace.oid) 
INNER JOIN pg_attribute on (pg_class.oid = pg_attribute.attrelid)
INNER JOIN pg_type on (pg_attribute.atttypid = pg_type.oid)
WHERE 
    -- get ordinary tables (r), views (v), and materialized views (m)
    relkind in ('r', 'v', 'm') AND
    -- filter out any "weird" columns 
    pg_attribute.attnum >= 1 AND
    -- filter out internal schemas
    pg_namespace.nspname not in ('pg_catalog', 'information_schema')
ORDER BY 
    table_schema, 
    table_name, 
    ordinal_position
;

I've put this in the test project with reading in fsharp, in the same way as is done for the primary keys. Only issue I had was that the pg_catalog doesn't give a relation - at least not one that I can find - between the tables/schemas/columns and the database name. So I left the "TableCatalog" field empty. I think that it should only give you information for the database that's set in the connection, though, so maybe it's not necessary to filter on it?

JordanMarr commented 4 months ago

When i run this query against the sample db, most of the columns are excluded because they have weird data types that aren’t mapped. Can you show me the column results for your materialized views?

Janna112358 commented 4 months ago

image

I was able to map them to strings, bools, and integers. The ordinal_position (pg_attribute.attnum) had to be cast to Int16 and then to int: https://github.com/Janna112358/db-experiments/blob/main/tryNpgsql/Program.fs

JordanMarr commented 4 months ago

It looks like that query uses different data types.

Here are my mappings: https://github.com/JordanMarr/SqlHydra/blob/main/src/SqlHydra.Cli/Npgsql/NpgsqlDataTypes.fs

They are based on the Npgsql documented mappings here: https://www.npgsql.org/doc/types/basic.html

You can see that there is no varchar or int4 listed. image

Janna112358 commented 4 months ago

Ah I see what you mean! Apparently with this query, we get some aliases of the "proper" type names. They're in this table from the Postgres docs.

Could we add the aliases from the Postgres table to your mapping?

JordanMarr commented 4 months ago

Yeah, we can definitely add those now that you found the list.

Janna112358 commented 4 months ago

Hey Jordan, just fyi I'm going on holiday for 2.5 weeks so I won't be able to work on this for a while.

JordanMarr commented 4 months ago

Enjoy your vacation! I will probably group the materialized view column mappings in with some other changes that I am still working on within the next week or two.

JordanMarr commented 4 months ago

v2.4.0-beta.2 is published.

Janna112358 commented 3 months ago

This is awesome! I've updated to 2.4.0-beta.2 in the test project and it totally works 🥳

As expected, types are added for the materialized view, and I am able to query the materialized view with sqlhydra.Query in the same way as a normal view :)

Janna112358 commented 3 months ago

Follow-up: if I use sqlhydra.cli 2.4.0-beta.2 in a dotnet 6.0 project, I don't get the materialized view in my generated file. (I can't spot any other differences). Does that make sense, and could it be supported in net6?

JordanMarr commented 3 months ago

That’s because it uses an older version of Postgres. I can’t remember if there is a good reason for that though.

Janna112358 commented 3 months ago

Ah, well good to know that's explained then :) Updating to dotnet 8 is on my to-do list anyway, so I guess this adds to the incentive 😛