Open emmanuelbernard opened 5 years ago
I like this. This also lets us use or
instead of and
, right?
I like this. This also lets us use
or
instead ofand
, right?
yes
I do like how this looks, but I like the separation of predicates more. Separated predicates could still be checked too I believe? Just need to make sure that the root type is made explicit.
The weak aspects of this solution:
null
is not possibleBut don't take this as a blocker; just mentioning the limitations so that we're on the same page on why I prefer the separation of predicates.
What's your alternative syntax or API look like for the examples I gave @Sanne ?
I do think it's a good addition to Panache, but I don't think it removes the need to have a byExample
API.
A byExample
API can be used not only for the use case that are explained in the description, but as a full replacement for the textual query syntax by expressing a query programatively. This is IMHO the best advantage of a query byExample
having the hablity to express a Query in a compile checked and type safe fashion.
When you test everything with mocks, your test didn't hit the database and the query are never rendered. Having a query that is mechanically generated from an object allows to be sure that this query will be semantically exact.
But I'm not 100% fan of some query by example implementation that allows a user to specify any type as long as long as the type have the same attributes as the Entity (this will not longuer by compile checked and type safe) . I can see some advantages (when you have DTO with not all the fields) but I prefere a query by example that uses the same type as the Entity directly.
Something like this could be added to Panache:
BookEntity example = new BookEntity();
example.author="Victor Hugo";
List<BookEntity> victorBooks= BookEntity.findByExample(example).list();
An other more complex solution should be to provide a Criteria API to provides pragramatically query generation so one could write something like :
Criteria criteria = Criteria.eq("author", "Victor Hugo")
.and(Criteria.lte("rating", 10));
List<BookEntity> victorBooks= BookEntity.findByCriteria(criteria).list();
This will cover more use cases but add some more complexity in the implementation sides.
To conclude, I think that if we want to keep Panache simple, we should add a byExample
API like the one I described to add a way to programmatically express a compiled checked and type safe query
What if PanacheQL was compile time checked. It looks like it would alleviate all the concerns you have. CC @gavinking My concern with the example API is that you don't control which field is OK vs not OK and it becomes cumbersome to define what null or default values mean per property (ignore or set)?
Compile checking PanacheQL queries will be a good thing, but it assume it will not really be compile check but build time check (by the deployment module).
So there will be no such check via my IDE ...
Anyway, I still support an additional querying facility (example or criteria or something else) than via a simple String, because some people will prefere it. I think that at some point we may add it to Panache, maybe it's not the time now :)
No we do have annotation processor driven HQL validation we can repurpose for PanacheQL, it will show up in your IDE
I reviewed this issue and my proposal is to implement this and see how much the other approaches are still sorely missed. This would need to go hand in hand with https://github.com/quarkusio/quarkus/issues/5552 to achieve the target objective.
I understand the limitations:
Note that falling back to plain JPA is really easy. I can even see us offering entity.entityManager();
CC @FroMage
So, the proposal is to auto-remove AST nodes of the predicate
type when their LHS or RHS expression
is of type path
with a question mark:
predicate
: LEFT_PAREN predicate RIGHT_PAREN # GroupedPredicate
| predicate OR predicate # OrPredicate
| predicate AND predicate # AndPredicate
| NOT predicate # NegatedPredicate
| expression IS (NOT)? NULL # IsNullPredicate
| expression IS (NOT)? EMPTY # IsEmptyPredicate
| expression EQUAL expression # EqualityPredicate
| expression NOT_EQUAL expression # InequalityPredicate
| expression GREATER expression # GreaterThanPredicate
| expression GREATER_EQUAL expression # GreaterThanOrEqualPredicate
| expression LESS expression # LessThanPredicate
| expression LESS_EQUAL expression # LessThanOrEqualPredicate
| expression (NOT)? IN inList # InPredicate
| expression (NOT)? BETWEEN expression AND expression # BetweenPredicate
| expression (NOT)? LIKE expression (likeEscape)? # LikePredicate
| MEMBER OF path # MemberOfPredicate
;
expression
: expression DOUBLE_PIPE expression # ConcatenationExpression
| expression PLUS expression # AdditionExpression
| expression MINUS expression # SubtractionExpression
| expression ASTERISK expression # MultiplicationExpression
| expression SLASH expression # DivisionExpression
| expression PERCENT expression # ModuloExpression
// todo (6.0) : should these unary plus/minus rules only apply to literals?
// if so, move the MINUS / PLUS recognition to the `literal` rule
// specificcally for numeric literals
| MINUS expression # UnaryMinusExpression
| PLUS expression # UnaryPlusExpression
| caseStatement # CaseExpression
| coalesce # CoalesceExpression
| nullIf # NullIfExpression
| literal # LiteralExpression
| parameter # ParameterExpression
| entityTypeReference # EntityTypeExpression
| path # PathExpression
| function # FunctionExpression
| LEFT_PAREN querySpec RIGHT_PAREN # SubQueryExpression
;
path
: syntacticDomainPath (pathContinuation)?
| generalPathFragment
;
The "only" issue is that we'll need to use the HQL parser, which is probably a bit more expensive than what we do ATM. And we can't use it because it will reject identifiers with ?
at the end, but we forked it so we can actually add a special node type for it.
It's possible we'll want to start expansing those PanacheQL fragments at build-time to cut run-time costs, though probably they are actually moot since Hibernate will parse them anyway later on.
I think the parser you describe is what @loicmathieu has done for MongoDB right? At least I see the panacheql module with a grammar. While I'm no grammar guy, I think supporting the ?
and behaving differently on ?
path should not require too much extra work. Or am I naive?
I guess complex nested and / or could lead to empty () so the query would need to be rewritten or replace it with ( 1 == 1 )
?
While discussing with @gavinking, he advised to come with a formal set of rules for how the feature should operate. I finally got the time to sit down and think about it.
CC @loicmathieu @FroMage
Construct query structures depending of the presence or absence of a parameter. Often, a complex search query is build from if statements and string concatenation depending on the presence or absence of an input parameter (absence materialized by null or some other marker).
firstname? = ?1 and lastname? = ?2
?1 == null
is equivalent to lastname = ?2
lastname? = ?1 or status? = ?2
?1 == null
is equivalent to status = ?2
(firstname? = ?1 and lastname? = ?2) or status? = ?3
?1
and ?2
to null
is equivalent to status = ?3
you can mix optional predicates and non optional ones.
firstname? = ?1 and lastname = ?2
?1 == null
is equivalent to lastname = ?2
?1 != null
is equivalent to firstname = ?1 and lastname = ?2
I explored a few options but in the end Stephane had it the most right with his AST proposal.
An optional predicate including a parameter value which is null is removed from the AST node and it's parent OR
or AND
node is replaced by the sibling of the optional predicate eliminated.
Said differently, replace AND and OR nodes with the non optional node. If 2 optional nodes are attached to the AND / OR, keep the left one.
If the AST of the where clause ends up being empty, there are two options:
where 1=0
)I haven't made up my mind, I am concerned about the fact that a query could mistakenly return more things than expected in this extreme condition but that's an arbitrary rule.
NOTE: we can't just say that if all parameters are null, we return nothing (or don't query) because this query is valid status = 'ADMIN' and lastname? = ?1
NOTE: it's the whole predicate, not just path = ?param
, it could apply to functions using a parameter and other forms. So all elements of the node are eliminated up to the AND / OR operator.
A few tree pruning examples:
OR
/ \
AND S?
/ \
L? F?
assuming the parameter for L is null, is equivalent to
OR
/ \
F? S?
assuming also that the parameter for F is null, is equivalent to
S?
Another example
OR
/ \
AND S?
/ \
L? F?
assuming the parameter for S is null, is equivalent to
AND
/ \
L? F?
I did not formally prove it but I am pretty sure that regardless of the tree walking algorithm, the pruning algorithm converges to the same results.
I did try another approach which was to replace the optional predicate with 1=1
if it is attached to AND
and 1=0
if it is attached to OR
.
The idea was to use a custom ternary logic to achieve the tree simplification.
Unfortunately it does not work for nested elements.
(lastname? = ?1 and firstname? = ?2) or status? = ?3
becomes (1=1 and 1=1) or 1=0
where one would have expected that if all predicates are optional null in the and clause, the total result would be 1=0
.
We need to retain the info that a predicate is opted out to carry that info.
I am not super happy about the syntax path?
because someone could write something like firstname? = lastname?
or firstname? = 'foo'
.
Here is a verbose but maybe more accurate syntax: optional( firstname = ??, ?1 ) and optional( lastname = ??, ?2 )
where the first operand of optional
is the predicate to be eliminated and the second one is the actual parameter used to evaluate the optionality.
Alternative syntaxes
optional( firstname = ??, ?1 ) and optional( lastname = ??, ?2 )
optional( firstname = ?1 ) and optional( lastname = ?2 )
firstname? = ?1 and lastname? = ?2
firstname = ??1 and lastname = ?:somename
firstname = ?1? and lastname = :somename?
firstname = [?1] and lastname = [:name]
firstname = [?1]? and lastname = [:name]?
[firstname = ?1]? and [lastname? = ?2]?
: show more the predicate limit being eliminated. but still verbose8 and 1 are the most accurate as they select the whole predicate.
4, 5, 6 and 7 are quite accurate as they highlight the parameter being evaluated for nullness.
I do like the elegance of 3 because it reminds me of Ceylon's ?.
There you go, we can discuss the semantic now.
I like the conciseness of path? = :param
, but it's indeed not ideal, because it sort of implies that the condition depends on the path
propery being null
, where it's really about the :param
. This is what leads you to be afraid of people writing firstname? = lastname?
.
I initially thought we could write path ?= :param
and tie it to the operator, but we'd have to support every operator that takes a parameter, which may be confusing.
So, it would feel less ambiguous to mark specially the parameter that can be null, perhaps by changing the :name
syntax to ?name
, but that leaves positional params out, unless we use ??0
which is really hard to justify.
I do like the proposal of @FromMage to put the optional caracteristic of a predicate on the operator and not the field path, this better describe what is an optional predicate: a variant of an existing predicate. With this, the implementation will be more straightforward, and the intent clearer inside the query.
The issue is on the implementation side.
Ideally we should be able to work on the AST level to prune nodes that should disapear due to the optionality. That is what @emmanuelbernard describe in his comment (by the way, thanks for it, it adds a lot of clarity for what needs to be done).
If you look at my implementation for MongoDB you can see that it's not what I did.
In fact, the current PanacheQL implementation inside MongoDB with Panache uses an Antlr visitor, the visitor is not capable or pruning an AST node (it can only replace it, not remove it), and not capable of updatong the parent node. Idealy we must act on the AST level in multiple passes (mark pass and delete pass) and I don't know who to do this with Antlr.
So, the implementation is harder that what it seems first, if we cannot simply replace a node by something else (empty document for MongoDB, 1=1 where clause for Hibernate) the implementation will be very much complex ...
What about
"name = optional(?1) or status = optional( :status )"
It is more verbose but very clear.
I'd like to ask for @gavinking's opinion about syntax. He's pretty good at throwing ideas in and out.
Hrm. I've been tagged so many times in this issue, but somehow I never got a proper notification until Stef sent me a message today. Need to check my filters.
Alright, so my first reactions are:
?
for this. I feel like ?
already means lots of things to me, and none of them are this, precisely. And the actual code snippets you've written above look all a bit too Perly for my aesthetic sensibilities.[
... ]
. Now, brackets already mean something in HQL, so we need to be careful that doesn't result in ambiguity, but actually I don't think it does.So I guess I like the look of:
[title like ?1] and [owner = ?2] and [order < ?3] order by order
I find this very readable.
However, there is one thing that's wrong with that: the logical connectives should really also go inside the brackets, yielding:
[title like ?1] [and owner = ?2] [and order < ?3] order by order
However, I'm not sure I like that better.
OTOH, consider the extension to multiple predicates within a single optional condition:
title like ?1 [and owner = ?2 and order < ?3] order by order
WDYT?
(P.S. I know I was the one advocating that we treat this as some sort of pseudo "function" in HQL, and I can see I'm sorta walking back from that now, at least with respect to the syntax.)
Potential objection of using brackets is that it hammers the natural syntax for list literals. But I highly doubt we'll ever want list literals in HQL.
I'm not very keen on putting the marker on any of the three individual bits of the predicate. Since it's the whole predicate that gets removed, it seems to me that the clearest thing is to place delimiters around the whole predicate.
Well, in my mind, this is a two-actor team:
=
) acquires a new ternary result meaning which can be true/false/NOOP. true/false is determined by the DB unless NOOP. NOOP is determined at query build time if one of the operand of the operator is a missing/null parameter.and
and or
acquire the same true/false/NOOP result meaning, where NOOP is determined at query build time if one of the operand of the operator is a NOOP.So the semantics really belong mostly on the operator, because it's that operator which changes semantics if we opt in or not:
and
and or
will get the new semantics regardless of whether its operands have the new semantics, because they're both required to handle eventual NOOP operands: it makes no sense to allow oping in or out of the new semantics at that level.@FroMage Yeah, I originally tried to go down that path when I chatted with Emmanuel, but I think we decided that no, that would be adding a new sort of ternary logic that collided with the ternary logic of SQL and was broken in novel and different ways to SQL's ternary logic.
So after a bit of a think about it I decided it would probably be a mistake.
So, with that in mind, I guess I'd favour the following variants:
[title like ?1] and [owner = ?2] and [order < ?3] order by order
optional(title like ?1) and optional(owner = ?2) and optional(order < ?3) order by order
The Perl user in me finds [owner = ?2]
confusing, while the BNF user in me finds it perfectly fitting.
So, with that in mind, I guess I'd favour the following variants ...
To be clear, I think either of those options are perfectly acceptable outcomes. They're very readable.
OTOH, the option of putting the operator inside the brackets does have the advantage that we don't have to come up with strange and half-convincing backsplanations as to what is the meaning of a dangling and
or or
operator.
(The issue being that or [name=?n]
means or false
when n
is missing, while and [name=?n]
means and true
when n
is missing.)
Rrrrm the more I stare at this example, the more I think it's telling us that the logical operator should go inside:
title like ?1 [and owner = ?2 and order < ?3] order by order
(The issue being that
or [name=?n]
meansor false
whenn
is missing, whileand [name=?n]
meansand true
whenn
is missing.)
Not even, see this loooong post https://github.com/quarkusio/quarkus/issues/2303#issuecomment-622331293
So, with that in mind, I guess I'd favour the following variants:
* `[title like ?1] and [owner = ?2] and [order < ?3] order by order` * `optional(title like ?1) and optional(owner = ?2) and optional(order < ?3) order by order`
The Perl user in me finds
[owner = ?2]
confusing, while the BNF user in me finds it perfectly fitting.
So I considered the Gavin bracket syntax and was scared it would conflict with HQL but if it does not, I think that's the winner.
I need to let the [and foo = ?1]
variant but I don't think that's quite that. It would be something like
[( [a = ?1] and [b = ?2]) [or] c = ?3]
which is just wrong. Remember the operation moves up the tree.
Any progress of this?
I would like to use Typesafe APIs such as JPA Creteria(Specification in Spring Data JPA) or QueryDSL to assemble the Query logic by pure Java codes instead of literal strings.
I came here after posting this To me, this is a key feature that is missing. All the "easy" and "simplified" panache ORM examples do not meet the requirements for most real world applications. This should be prioritized.
Any updates on this ?
@antssilva96 no update on it, you can see https://github.com/quarkusio/quarkus/issues/8136 that tries to sum up all the ideas around dynamic query support in Panache.
I open a discussion on Zulip as I think we need to move forward or decide to close all these issues: https://quarkusio.zulipchat.com/#narrow/stream/187038-dev/topic/Panache.20dynamic.20query
Any update ?
What people use the example API for is for generic "search" screens that need to accept between 1 to n predicates depending on what the user as filled in. And the example API does allow to do such matching assuming the search fields are all in one entity as properties and by adding side metadata on what property needs to be ignored, ignored if null, have a trailing % in like etc.
The following Panache model would scale better. I call it the destructured example API
The trailing
?
per property expresses that the predicate on the property should be ignored if the property is null.The advantages I see to this approach are:
Thoughts @gavinking @Sanne @gsmet @FroMage ?