Closed breathe closed 1 year ago
One pain point though is that I would like to be able to directly execute my sql queries for ease of testing and prototyping -- and I need some way of supplying appropriate values and value_types to ppgen.arg().
The alternative syntax for params suggested in https://github.com/jschaf/pggen/issues/42 is slowly growing on me.
As for what I personally use: some editors allow you to define a regexp for a customizable parameter syntax. Here's how to do it for IntelliJ-based editors:
Instead of copying queries, you can run pggen queries directly against any IntelliJ datasource by defining a regexp to match pggen.arg('val')
. IntelliJ calls these User Parameters. Typically, these look like ? or :name but IntelliJ is flexible enough to match the pggen syntax.
First, create a new parameter pattern with the regexp.
pggen.arg\('([^']+)'\)
Does that approach work for you?
Does that approach work for you?
I appreciate the workflow info! The jetbrains rewrite rule seems like an 'ok' workaround but that approach preserves a number of pretty annoying papercuts ...
As I consider your suggestion and think about the two arg approach again, I'm even more leaning towards the two arg approach ...
Consider a workflow like this:
CREATE TYPE rental_user AS (
id integer,
first_name text,
last_name text
);
CREATE OR REPLACE FUNCTION pggen.arg(param text, default_type_or_value ANYELEMENT)
RETURNS anyelement
LANGUAGE plpgsql
AS $func$
BEGIN
IF param = 'name' THEN
RETURN 'bob';
ELSIF param = 'age' THEN
RETURN 12;
ELSIF param = 'user' THEN
RETURN ROW(1,
'bob',
'lasty')::rental_user;
ELSIF param = 'ids' THEN
RETURN '{1,2}'::integer[];
ELSIF param = 'sort' THEN
RETURN 'schemaname';
ELSIF param = 'sort_direction' THEN
RETURN 'ASC';
ELSIF param = 'limit' THEN
RETURN 6;
ELSE
RETURN default_type_or_value;
END IF;
END
$func$;
SELECT
pggen.arg('name', NULL::text) AS name,
pggen.arg('age', NULL::integer) AS age,
pggen.arg('user', NULL::rental_user) AS user,
pggen.arg('ids', NULL::integer[]) AS ids,
pggen.arg('sort_direction', NULL::text) AS sort_direction
FROM
pg_tables
ORDER BY
CASE WHEN pggen.arg('sort', NULL::text) = 'schemaname' THEN
schemaname
ELSE
tablename
END
LIMIT pggen.arg('limit', NULL::bigint);
CREATE OR REPLACE FUNCTION pggen.default_user() returns rental_user ...
This approach could maybe also provide a solution for -- https://github.com/jschaf/pggen/issues/85 ... I think it could make sense for the type inference for the two argument ppgen.arg()
to always map the argument to a go nullable type -- and if nil is provided to generate a query that resolves to the default value supplied in second argument ...
eg -- select pggen.arg('foo', 1::integer)
maps foo
to the go type *int
and compiles to the the query select coalesce(:1, 1)
... This would give a solution to nullable types and also provide a solution for defining default values for query arguments (which is also something I want to be able to do in order to construct the most desirable go api for each query I write).
If this approach was embraced, maybe there could even future features added building on this -- maybe a special comment directive for the declaration of pggen.arg
-- not 100% sure what it would do but can think of various ideas -- maybe it would generate a go func that can be used inside a parameterless (on go side) test, or a go type that produces an explain query ... (just spitballing but I know I'd like a feature to generate an 'explain statement' so that I can write tests which assert expected/valid query plans ...)
Funnily enough, I initially prototyped support for pggen.arg with a default value but never exposed it: https://github.com/jschaf/pggen/commit/363cf19563c6f8fcc9cdc59b20b7d27fc81e2f85
The reasons I haven't implemented pggen.arg with a default:
Nil-everything: For the two-arg approach of pggen.arg('limit', NULL::bigint)
, we'd have to make every input arg a non-pointer. Otherwise, for data types like int, there's no way to distinguish between the unset and a zero value. I took inspiration from protobuf where primitive values don't have a nil value, just a zero value.
Parsing: To do it properly would require a better parser. Simple things like pggen.arg('foo', 2)
are easy enough. But it's not clear where to draw the line. Should pggen.arg support any scalar value? pggen.arg('foo', exists(SELECT 1 FROM ...))
. Currently, pggen's "parser" only recognizes strings, semicolons, and comments--which is the minimum syntax needed to identify a complete query.
It turned out to be easy enough to work around in SQL.
SELECT
nullif(pggen.arg('foo'), ''), -- to use a null default value
coalesce(nullif(pggen.arg('foo'), ''), 'default_value) -- to use a default value
Suitable values for potential arguments live outside the codebase
I think using the nullif
trick embeds the default value in the query.
At the design level, I'm wary of expanding Pggen's syntax to support interactive use-case. I'm sympathetic since I've done regexp surgery to convert pggen queries to support interactive queries.
I don't think the following helps the interactive use-case but it gives an idea for pggen's future direction:
Long term, I think the way forward is more user-configurable type-mappers at three levels. At present, type overrides are only supported at the invocation level.
Then a user can use different type mappers to support null or default values. I should start a pggen "v2" tracking ticket for these ideas.
Nil-everything: For the two-arg approach of pggen.arg('limit', NULL::bigint), we'd have to make every input arg a non-pointer. Otherwise, for data types like int, there's no way to distinguish between the unset and a zero value. I took inspiration from protobuf where primitive values don't have a nil value, just a zero value.
Don't you ever run into cases where you have an argument with a zero-value which is inappropriate for the query ...? The text
example you gave seems like one of the few cases where the zero-value is likely to be usable with the intended semantics ... It doesn't seem like there's a way to represent a query with an optional integer parameter at all since 0 would be a very frequently meaningful query argument ... For parameter types like integer[]
-- the inability to distinguish the empty array from null also seems generally annoying/error prone to me ... I think I'd rather make all arguments have pointer type and define my queries to either blowup if provided nil or define the query to supply an appropriate default semantic for any optional parameters when given nil ...
Parsing: To do it properly would require a better parser. Simple things like pggen.arg('foo', 2) are easy enough. But it's not clear where to draw the line. Should pggen.arg support any scalar value? pggen.arg('foo', exists(SELECT 1 FROM ...)). Currently, pggen's "parser" only recognizes strings, semicolons, and comments--which is the minimum syntax needed to identify a complete query.
Maybe I'm wrong here ... but could a simple approach work here? Keep count of the non-string-non-comment-enclosed open/close parenthesis after the ',' until finding a closing paren which represents the end of the ppgen.arg()
expression ...? (and also disallow use of ppgen.arg(
within the default value expression)
Maybe I'm wrong here ... but wouldn't it be correct to just keep count of the non-string-enclosed open/close parenthesis after the ',' until we find the closing paren that represents the end of the ppgen.arg() expression ...? (and also disallow use of ppgen.arg( within the default value expression)
I made a sketch of what I was thinking -- https://github.com/breathe/pggen/commit/fece1febc4ade79bc8333b8e2eb5432d1a1190c6
That implementation puts logic into the tokenizer to scan ppgen.arg(...)
into a new Directive
token type and then modifies the parser to extract the name and default value expression from the Directive
token ...
Maybe an approach along these lines could work without a huge amount of additional complexity ...?
(Note the existing scanner and parser tests pass but I didn't exercise the two-argument version at all ... am just wondering what you think of this approach / whether you would consider accepting a change like this ...)
Thanks for experimenting with the code! Adding a Directive
is a good idea as I'd like to eventually support dynamic predicates: pggen.predicate
.
It seems like the main problem is there's no way to pass null
as an input with pggen.arg
.
If pggen could pass null
:
coalesce(pggen.arg('foo'), 'some_text')
.pggen.arg
with a case statement would work since it could emit null
, and the default value is handled by the coalesce, which lives in the query.Let me know if that matches your understanding. If it does, I'd prefer to figure out how to pass null values via pggen.arg
since it's a more general mechanism and avoids expanding the API surface. A few other feature requests also boil down to supporting more flexible marshaling for pggen.arg
and unmarshaling for the returned column types, including:
2. Your example of defining
pggen.arg
with a case statement would work since it could emitnull
, and the default value is handled by the coalesce, which lives in the query.
But the case statement example only supports string typed arguments. afaict the two arg version is needed for the interactive use case as it's the only way to be able to define a pggen.arg() stub that can return values for arguments with arbitrary types associated with the arg name (?)
I still see value in the two arg version of the directive ... even with the addition of more type customization for argument types -- it would still be nice to have a version of the directive that always infers an optional type ... the two arg version could behave that way ...
pggen.arg('foo') -- keeps current implementation
pggen.arg('foo', null::int) -- infers optional type for 'foo', compiles to 'coalesce(:1, null::int)'
This could be added without breaking backward compatibility I think ... And I don't think this enhancement would add any major surprises to the implementation of any of the proposed new type inference customization features ...?
(even with the ability to customize the inferred go argument types, I'd prefer not writing those in general -- especially if the only reason I have to write them is to get an optional type ...)
I went ahead and wrote a PR to implement ... Let me know what you think if you have a chance?
Thanks!
Closing as I gather there is not interest in supporting the interactive sql use case
I came across pggen today and I'm enjoying it so far - I like static queries and the vision here.
One pain point though is that I would like to be able to directly execute my sql queries for ease of testing and prototyping -- and I need some way of supplying appropriate values and value_types to
ppgen.arg()
.I initially did something like this
The idea there was to define a fake
pggen.arg()
method which returns different string values based on the providedparam
. This gives a workflow for interactively prototyping a query -- and when want to test different query arguments I just change the value in the fake definition ofpggen.arg()
and re-execute the query.There are two problems with that:
I'm wondering if we could add support to pggen for a two argument version of
pggen.arg
-- like below:This would allow for writing queries that are consumable by pggen but which cann still be executed interactively for ease of development and feedback speed.
For the above, when the query is run interactively
pggen.arg(text, integer[])
will be invoked and the integer[] value will be returned, meanwhile the compiled query will take the value of the argument from the corresponding prepared statement placeholder as normal.