Closed ayush-finix closed 1 year ago
Thank you very much for your feature request. How about this pattern?
Optional.of(TABLE_A)
.map(t -> b ? t.join(TABLE_B).on("b condition") : t)
.map(t -> c ? t.join(TABLE_C).on("c condition") : t)
.get()
Notice that when you're using inner join, there's no difference between putting a predicate in ON
or in WHERE
.
Notice, one important step here is to realise that SelectJoinStep.join()
is just syntax sugar / convenience for using Table.join()
methods (producing fewer parentheses), which are probably preferrable if you're doing dynamic SQL like this. I don't think we really need new API for your particular use case.
I'll leave this issue open as we might be able to collect more such use-cases. Perhaps there is a low hanging fruit for improving this experience.
Wrapping a fixed value just to call map on it seems a bit of an abuse =P.
Just for a little more context, the reason I brought this up was to essentially be able to see/build the entire query in one go and separate out the dynamic choice steps elsewhere, like how using noCondition allows you to do something like .where(bCond)
regardless of code path, so you don't end up reading DSL + java code mixed together [if that makes sense, sorry if I'm not explaining it well].
I'll definitely try using the table.join [didn't know this existed, thanks] directly and see if I can come up with something nice.
Wrapping a fixed value just to call map on it seems a bit of an abuse =P.
I challenge you to show why that's the case. I find that quite elegant.
Just for a little more context, the reason I brought this up was to essentially be able to see/build the entire query in one go and separate out the dynamic choice steps elsewhere
Sure, you can do this with a method that returns a table conditionally.
The fact that some logic might need to be distributed between join()
clauses and the where()
clause will make a truly dynamic approach a bit more tricky anyway. But if you can get away with putting your predicates all in the on()
clause, then it's really very simple using Table.join()
only.
Wrapping a fixed value just to call map on it seems a bit of an abuse =P.
I challenge you to show why that's the case. I find that quite elegant.
It's equivalent to just binding the anonymous functions you want to a variable and chaining them since you'll never have an empty optional.
final Function b = t -> b ? t.join(TABLE_B).on("b condition") : t; final Function c = t -> c ? t.join(TABLE_C).on("c condition") : t; final result = b.andThen(c).apply(TABLE_A)
Side note: we're using the awesome cyclops-react functional library, and their Option class (in addition to having way more functionality than the java 8 optional), doesn't actually have a get method to unpack the value (it's somewhat opinionated on not throwing exceptions), so you would have to handle the impossible case in a really awkward way.
and if you really, really want to not leak any variables to a different scope, then
joined = (t - > {
res_b = b ? t.join(TABLE_B).on("b condition") : t);
res_c = c ? res_b.join(TABLE_C).on("c condition") : res_b;
}).apply(TABLE_A);
Just for a little more context, the reason I brought this up was to essentially be able to see/build the entire query in one go and separate out the dynamic choice steps elsewhere
Sure, you can do this with a method that returns a table conditionally.
The fact that some logic might need to be distributed between
join()
clauses and thewhere()
clause will make a truly dynamic approach a bit more tricky anyway. But if you can get away with putting your predicates all in theon()
clause, then it's really very simple usingTable.join()
only.
This is the approach I'm looking into.
As an aside, I do want to thank you for
It's equivalent to just binding the anonymous functions you want to a variable and chaining them since you'll never have an empty optional.
You omitted the generics, which would make this approach much more verbose (in Java). Also, the order of tables is confusing now, IMO. But indeed, technically, they're equivalent, I'd say.
so you would have to handle the impossible case in a really awkward way.
Well, you could make an exception and use Optional
here (or is that verboten?). Or you use getOrElse(DSL.dual())
. Or you use a Stream
and reduce the joins.
Anyway, it's good feedback. I do wonder if there's something we could improve in the jOOQ API to make dynamic joins more "FP idiomatic". Especially when left joins are used as anti joins (A LEFT JOIN B ON .. WHERE B.ID IS NULL
), my suggestions won't work, as two conditional things need to be done separately: Adding the join and adding the WHERE
clause. A workaround would be to use the synthetic leftAntiJoin()
operation, but that produces NOT EXISTS
, when perhaps an actual LEFT JOIN
was desireable for RDBMS specific performance reasons.
We'll think about this.
As an aside, I do want to thank you for
Thanks for your nice words! Greatly appreciated :)
I still don't see any compelling API in this area that is better than the status quo, perhaps with the help of Optional
or Stream
... Keeping the issue open though, as a solution is definitely desireable.
Hey there, Lukas.
As a user of the jooq api and not looking into any of the implementation, I would like to consume it like this:
selectFrom(WALLPAPER)
.join(colorName == null? noJoin() : COLOR).on(WALLPAPER.COLOR_ID.eq(COLOR.ID))
.where(colorName == null? noCondition() : COLOR.NAME.eq(colorName))
This means noJoin() effectively also ignores the .on() filter. I guess this doesn't follow existing implementation, in which case dump this idea :)
Also, thanks for the library! Managed to start using it finally after creating a maven plugin for postgres support (jooq-codegen-postgres-maven-plugin, but that's offtopic).
Thanks for the suggestion, @liias. Interesting idea, so in this case, we'd maybe call it noTable()
, and indeed, the ensuing ON
or USING
clauses would then just be ignored (or users could pass an additional noCondition()
in a conditional expression).
It's ugly, but it could work. It would also work in a table list:
select(...)
.from(A, noTable(), B, noTable(), C);
The semantics of a noTable()
needs to adapt to an identity value of each respective join operation. The dee
table (from the dum dee table example) in addition to ignoring user join predicates seems to fulfil this requirement at first sight. It is a table with no columns and one row. If we join that table ON TRUE
, then the join operation has no effect, (which is the purpose of the dum/dee thought experiment) e.g.
t CROSS JOIN dee -- t
t INNER JOIN dee ON TRUE -- t
t LEFT JOIN dee ON TRUE -- t
t RIGHT JOIN dee ON TRUE -- t
t FULL JOIN dee ON TRUE -- t
Curiously, PostgreSQL also makes NATURAL JOIN
work, since the two tables have no columns in common, the implied ON
predicate is TRUE
as well, effectively turning this into a CROSS JOIN
.
t NATURAL INNER JOIN dee ON TRUE -- t
t NATURAL LEFT JOIN dee ON TRUE -- t
t NATURAL RIGHT JOIN dee ON TRUE -- t
t NATURAL FULL JOIN dee ON TRUE -- t
An empty USING ()
clause is not allowed in SQL, not in PostgreSQL either, although we should probably allow it: #10070. However, in the presence of a noTable()
, we'd simply ignore the columns passed to USING ()
, so we can make this work as well.
What about the other JOIN
types, we also have:
t CROSS APPLY dee -- t
t OUTER APPLY dee -- t
t CROSS JOIN LATERAL (SELECT * FROM dee) u -- t
t LEFT JOIN LATERAL (SELECT * FROM dee) u -- t
t LEFT SEMI JOIN dee -- t
t LEFT ANTI JOIN dum -- t (notice, here we need to implement dum semantics, not dee semantics)
When users do manual semi or anti joins using [ NOT ] IN
or [ NOT ] EXISTS
, they would probably already use noCondition()
in a conditional expression, e.g.
select(...)
.from(...)
.where(
something
? exists(...)
: noCondition()
);
Conclusion:
Yep. We can make a noTable()
happen. I'd also like to add some sort of explicit trueTable()
(dee, corresponding to trueCondition()
) and falseTable()
(dum, corresponding to falseCondition()
), but I don't think many database dialects support column-less tables. ~PostgreSQL supports them in CREATE TABLE
but almost nowhere else.~ (that's wrong. SELECT
is a valid statement in PostgreSQL)
There will be edge cases, but it can work.
Thorough response, some new terminology and join types to learn for me. I do think, however, it probably makes sense to wait for more feedback, as currently it's just us 2 users in this ticket.
Tell me if I got you correctly. Using join(noTable()) would still generate SQL then? Will they be real tables (i.e the dee/dum you mention) or something like CTE/temp tables?
I do think, however, it probably makes sense to wait for more feedback, as currently it's just us 2 users in this ticket.
Not sure, there's a pretty clear vision now, so I think this might pass if it's doable after looking into an implementation draft. This is not the only issue on this topic, there have been others, and also user group and stack overflow questions, as well as private emails. This is a recurring topic.
Tell me if I got you correctly. Using join(noTable()) would still generate SQL then? Will they be real tables (i.e the dee/dum you mention) or something like CTE/temp tables?
No, if there will be trueTable()
and falseTable()
, those would generate SQL. A noTable()
(just like the noCondition()
) will try to avoid generating any SQL, if it is syntactically valid to omit it.
Sounds good!
I just needed something like this myself, to add a "conditional union", i.e. union(condition ? select(A, B).from(T) : noSelect())
...
A noSelect()
method would be quite interesting, as it would need to be able to infer the <R extends Record>
type from where it is used (and Java often isn't too good with this). But I just tried this, and it worked as well!
System.out.println(
ctx.selectOne().union(null).union(null).union(selectOne())
);
Although, it doesn't work very well, e.g.
System.out.println(
ctx.selectOne().unionAll(null).union(null).union(selectOne())
);
...
Produces:
(
(
select 1 one
)
)
union (
select 1 one
)
An undocumented feature! Which means some people are probably already relying on it. It's not unreasonable to expect the jOOQ API to accept null
values for QueryPart
instances, and have this noTable()
, noCondition()
, etc. semantics in cases where null
cannot mean anything else. In case of Condition
or Field
, the meaning is a bit ambiguous, because a user could really mean to pass a SQL NULL
bind value, especially for Field
, a bit less obviously for Condition
.
This doesn't work yet for tables:
System.out.println(
ctx.select().from(table("t").crossJoin((Table<?>) null))
);
Yields:
Exception in thread "main" java.lang.NullPointerException
at org.jooq.impl.JoinTable.<init>(JoinTable.java:188)
at org.jooq.impl.JoinTable.<init>(JoinTable.java:180)
at org.jooq.impl.AbstractTable.join(AbstractTable.java:1183)
at org.jooq.impl.AbstractTable.crossJoin(AbstractTable.java:1444)
Of course, the implicit join feature could be a viable solution in some cases, when the join expression follows a to-one relationship:
ctx.select(TABLE_A.asterisk(), b ? TABLE_A.tableB().asterisk() : inline(""))
.from(TABLE_A)
.where(b ? bCondition : noCondition())
.fetch();
The mere presence of TABLE_A.tableB()
in the query will produce the optional join, so it can be safely ignored.
See: https://www.jooq.org/doc/latest/manual/sql-building/sql-statements/select-statement/implicit-join/
👍 for noXxx() - Functionality. I've been breaking my head over an elegant solution to have an optional dynamic join. Good to see I wasn't the only one.
Just in case: if this is about to-one joins, optional implicit joins can be an elegant solution https://www.jooq.org/doc/latest/manual/sql-building/sql-statements/select-statement/implicit-join/
I believe the implicit join isn't applicable to our case (we are joining four tables but selecting from only one of them), but that is actually an awesome feature! TIL
I believe the implicit join isn't applicable to our case (we are joining four tables but selecting from only one of them)
Can you show an example? In many cases, it's not that obvious to know whether it's applicable. Some queries can be rewritten for implicit joins to work
Sure! I had to modify the original code a bit (not sure I'm allowed to post it) but this is basically it. We have this method that finds all USERs who are COURSEPROVIDERs and match some additional condition that can be passed into the method, e.g. PROVIDER.PREMIUM.isTrue()
:
return dsl.select(USER.asterisk())
.from(PROVIDER)
.join(COURSEPROVIDER)
.on(COURSEPROVIDER.PROVIDER_ID.eq(PROVIDER.ID))
.join(USER)
.on(USER.ID.eq(COURSEPROVIDER.USER_ID))
.where(PROVIDER.DEACTIVATED.isFalse().and(condition))
.fetchInto(User.class);
And then there is another method with almost the same query, extended by a join on and a group by clause:
return dsl.select(USER.asterisk())
.from(PROVIDER)
.join(COURSEPROVIDER)
.on(COURSEPROVIDER.PROVIDER_ID.eq(PROVIDER.ID))
.join(USER)
.on(USER.ID.eq(COURSEPROVIDER.USER_ID))
.join(CONCEPT)
.on(CONCEPT.PROVIDER_ID.eq(PROVIDER.ID))
.where(PROVIDER.DEACTIVATED.isFalse().and(condition))
.groupBy(USER.ID)
.fetchInto(User.class);
The intention behind the second statement is "find all USERs who are COURSEPROVIDERs and have at least one CONCEPT".
I wanted to put this into one single method that's clear for my fellow colleagues to read and reusable. I was trying to give the client a way to pass the last, optional join, and the group by into the method - or another one, if they wish to do so, or none at all.
Hmm, I don't think these queries are correct / optimal. Both seem to produce cartesian products. The second one removes them again with GROUP BY
, but that might be far from optimal.
How about this instead?
return dsl.select(USER.asterisk()
.from(USER)
.where(USER.ID.in( // Use a semi-join to fix the cartesian products
select(COURSEPROVIDER.USER_ID)
.from(COURSEPROVIDER)
.join(PROVIDER).on(COURSEPROVIDER.PROVIDER_ID.eq(PROVIDER.ID))
.where(PROVIDER.DEACTIVATED.isFalse())
.and(condition) // Put the USER part in this condition in the outer query
.and(dynamicConceptCheck // Dynamic semi-join here
? PROVIDER.ID.in(select(CONCEPT.PROVIDER_ID).from(CONCEPT))
: noCondition())
))
.fetchInto(User.class);
Just in case, implicit joins can help, but are optional (as always):
return dsl.select(USER.asterisk()
.from(USER)
.where(USER.ID.in(
select(COURSEPROVIDER.USER_ID)
.from(COURSEPROVIDER)
.where(COURSEPROVIDER.provider().DEACTIVATED.isFalse()) // Implicit join here
.and(condition)
.and(dynamicConceptCheck
? COURSEPROVIDER.PROVIDER_ID.in(select(CONCEPT.PROVIDER_ID).from(CONCEPT))
: noCondition())
))
.fetchInto(User.class);
Awesome, thank you for having a look at this! So I take it I could also use three explicit semi-joins like so:
return dsl.select(USER.asterisk())
.from(USER)
.where(USER.ID.in(
select(COURSEPROVIDER.USER_ID)
.from(COURSEPROVIDER)
.where(COURSEPROVIDER.PROVIDER_ID.in(
select(PROVIDER.ID)
.from(PROVIDER)
.where(providerCondition)
.and(dynamicConceptCheck
? PROVIDER.ID.in(select(KONZEPT.PROVIDER_ID).from(KONZEPT))
: DSL.noCondition()))
)))
.and(userCondition)
.fetchInto(Benutzerprofil.class);
?
You could. Keep an eye on execution plans, just in case your RDBMS doesn't think the queries are equivalent (they should be, but not all optimisers are equal)
Looking into this now, in the context of a variety of such constructors that may be desirable for better dynamic SQL (#12969).
This one here is tricky. The NoTable
type will have to be able to join itself or be joined, and the resulting expression should no-op. E.g.:
assertEquals(T, noTable().join(T).on(condition));
assertEquals(T, T.join(noTable()).on(condition));
The tricky part is that the join expression may have optional on()
(or using()
, etc.) methods, so this must be possible, too:
assertEquals(T, noTable().crossJoin(T);
assertEquals(T, T.crossJoin(noTable());
And there are convenience and()
and or()
methods after on()
, such as:
assertEquals(T, noTable().join(T).on(condition)).and(otherCondition);
assertEquals(T, T.join(noTable()).on(condition)).and(otherCondition);
Hence, we can't get rid of the JoinTable
implementation in the expression tree, as users may append more clauses to it. This is different for NoCondition
, which we never keep around in the expression tree, so we don't have to constantly check if a condition is instanceof NoCondition
.
It's not impossible to implement this correctly, but again, it's much more tricky than NoCondition
I've had a look at this now for 3.19. It is indeed much more complicated than other noXYZ()
API elements, but not impossible.
A refactor of the JoinTable
hierarchy was necessary, and an auxiliary NoTableJoin
type was needed: i.e. when joining noTable().join(T)
, we can't just return T
. The resulting type must implement the JOIN
DSL so users can call on()
, using()
, partitionBy()
, etc. on the type, but once rendered, it must behave like the table itself. Also, if more joins are added to the join tree, NoTableJoin
must remove itself again from the query.
Use case:
We have a query with three variants. You can imagine the pseudocode/query to be like
where b and c are some parameters passed into the method.
Currently, we have to split the query building into multiple lines of code because there's no way to bind the statement
join(TABLE).on(...)
to something and there's no equivalent of DSL.noCondition for the join part.Possible solution you'd like to see:
I'd like to be able to write something like
Possible workarounds:
Right now we have something like
Please describe the possible workarounds you've implemented to work around the lacking functionality.
Versions:
None of this should really matter for this feature I think.