tizoc / ppx_pgsql

Syntax extension for embedded SQL queries using PG'OCaml.
BSD 3-Clause "New" or "Revised" License
52 stars 9 forks source link

ppx_pgsql's nullability heuristic has failed for outer joins #4

Open NightBlues opened 5 years ago

NightBlues commented 5 years ago

"ppx_pgsql's nullability heuristic has failed" for outer joins For example if we have 2 tables:

CREATE TABLE authors (id serial PRIMARY KEY, name varchar(255) NOT NULL);
INSERT INTO authors (id, name) VALUES (1, 'John Doe');
CREATE TABLE books (id serial PRIMARY KEY, title varchar(255) NOT NULL, author int NOT NULL REFERENCES authors(id) ON DELETE CASCADE);

Following query will have type string * string instead of string * (string option):

[%sqlf {|
SELECT
 authors.name,
 books.title
FROM authors
LEFT OUTER JOIN books ON books.author = authors.id
|}]

Because books.title is NOT NULL, but author without any book will cause nullability heuristic has failed May be there is a way to advice nullability for ppx_pgsql?

tizoc commented 5 years ago

Thank you @NightBlues. This is a bit complicated, and I don't have a good solution right now, will have to check during the weekend to see if I can figure it out.

Meanwhile, something that you can do is to create views for such joins, and manually set the nullable property in postgres for each column (or just use the view as is, by default everything is nullable).

Here is the query that I use to make every column on a view non-nullable:

UPDATE pg_attribute SET attnotnull = 't'
 WHERE attrelid IN (
   SELECT oid FROM pg_class
    WHERE relname = 'name_of_view');

You can find documentation on the pg_attribute table here: https://www.postgresql.org/docs/11/catalog-pg-attribute.html

tizoc commented 5 years ago

To add a bit more of information. The reason modifying the contents of that table works is that it is from where ppx_pgsql gets the column type information. When you create a view, it gets a new ID, and all it's columns get new entries in that table. By modifying those entries you control what ppx_pgsql sees when fetching information for the columns on that view.

NightBlues commented 5 years ago

Thank you for workaround:)

NightBlues commented 5 years ago

There is more simple solution:

[%sqlf {|
SELECT
 authors.name,
 coalesce(books.title)
FROM authors
LEFT OUTER JOIN books ON books.author = authors.id
|}]

coalesce returns first non null value or null if all values are null, so we just confuse postgres with this function call:)

tizoc commented 5 years ago

Very interesting find! It didn't occur to me to use coalesce, I would have expected for postgres to infer the type of the call to be the same as the input (so, still nullable).

NightBlues commented 5 years ago

As I understand - the reason is that it becomes a calculated value instead of field of table, so describe returns None here https://github.com/darioteixeira/pgocaml/blob/master/src/PGOCaml_generic.ml#L1504 Also, I've found that pgocaml now has ppx and it has a flag for disabling heuristic (https://github.com/darioteixeira/pgocaml/blob/master/ppx/ppx_pgsql.ml#L159) but I can't use it because its seems to be not compatible with dune (or I don't understand how to use it from dune) :)