Closed tomaswindsor closed 11 years ago
Is there any workaround?
Well, it's mysql thing. If you dont want to restrict the seleciton, don't call where condition.
$selection = $db->table('book');
if ($bookIds)
$selection->where('NOT IN', $bookIds);
By saying it's mysql thing, you mean in other databases this works well? Either way the behavior of this for mysql has big WTF factor, especially when using subqueries (which are optimized internally by NDB for MySql - and therefore won't work, while they would have worked if left unoptimized). Is the solution using special constant really so unacceptable? I think it would solve most problems for now, and perhaps later mysql comes with a possibility how to resolve this in cleaner way...
Tohle je celkem známý problém, viz http://php.vrana.cz/operator-in-s-prazdnou-mnozinou.php#d-8096. Teoreticky workaround by existovat mohl.
Srr, you are right, it's not only mysql problem, it's also in postgre. Only workaround which I know is to pass an magic value, which can't be present. Such ~~NDB~NO~VALUES~~
, etc...
Alternatively, it is also possible to use subquery such as SELECT 0 FROM $table WHERE 0
, where $table
is a string containing name of any existing table (e.g. the currently accessed one). Unfortunately it is not possible to use the dummy table dual
here, as it seems to ignore the WHERE 0
part when used within the NOT IN
construct (at least in MySql).
The trick is that NULL
in comparison to anything else is always NULL
, not BOOL
. So, with NOT IN
you can use:
SELECT * FROM table WHERE COALESCE(id NOT IN (NULL), TRUE);
I'm sure on PostgreSQL only...
Wow! Works in MySQL too. I'm not sure if nette should add COALESCE automatically.
This is a better solution than subquery, because you do not have to care about subquery return type.
But when I take a look at SqlBuilder::addWhere()
, you must detect condition negation NOT
. After that, you must detect emptiness of parameters (already done). And when you know these two things, you can simply remove whole condition or replace it by TRUE/FALSE. So COALESCE
or IN (NULL)
is not longer needed.
Maybe I'm missing something, I'm a Nette\Database amateur.
@hrach What do you think https://github.com/milo/nette/commit/65f05ee514b48a76c5e88c5fe8e7a681dfcfeca4?
The COALESCE is good idea, but in presented form it can be used only when the left side of the IN construct is not NULL. You can e.g. have WHERE NULL NOT IN (NULL)
(or you can replace the first NULL with something that can be NULL) and then you expect it to filter out those rows where it is NULL, but COALESCE would change this behavior.
So maybe something like this would work better:
SELECT * FROM table WHERE (NOT ISNULL(expression) AND COALESCE (expression NOT IN (NULL), TRUE))
... where expression
contains whatever is on the left side of the IN, like in:
$db->table('table')->where("$expression NOT IN", $array_or_subquery);
or
$db->table('table')->where("NOT $expression IN", $array_or_subquery);
However all this relates to ANSI NULLS being enabled, but some RDBMS allow to disable it...
Well, I would avoid stripping condition and parameter analysis. There is no need to do it.
SELECT * FROM users WHERE COALESCE(userId IN (1), TRUE);
works as well as SELECT * FROM users WHERE COALESCE(NOT userId IN (NULL), TRUE);
It leads (at least) to better variable caching key.
$db->table('table')->where("$expression NOT IN", $array_or_subquery);
is definetelly NOT valid use of Nette\Database, not escaped $expression, IN keyword is duplicit, NOT only at the begining...
@hrach You didn't address the issue I presented at all. In your example, you work with primary key (userId) which you expect to be always non-NULL. If you consider cases with column that can contain NULL (or more generally an expression that evaluates to NULL) , you will realize COALESCE changes the behavior (if ANSI NULLS are enabled).
My PHP example was merely meant to illustrate what i mean by expression
. It was by no means meant to show how Nette\Database should be used. But your argument that it was not escaped is speculative, as I didn't specify whether the $expression was or wasn't pre-escaped (escaping was irrelevant for what I wanted to show). Also I fail to see any duplicate IN in my illustration. Both notations are valid btw: expression NOT IN (...)
and NOT expression IN (...)
.
Duplicit IN: see https://github.com/nette/nette/blob/master/Nette/Database/Table/SqlBuilder.php#L181
Oh, I get it now, maybe you should try to express you thoughts better :D Just write where('nullableColumn IN (NULL)')
. So @milo solution is right. Let's do it.
Ok, you were right about the duplicate IN, my bad.
As for the actual problem we are discussing, I am glad you got it now, but I am afraid solution of @milo is also not respecting ANSI NULLS behavior for operand left of the IN operator. I am afraid replacing the condition with boolean is not the way because of this, as only the database knows what the left operand will evaluate to (whether it will be NULL or something else) and therefore PHP can't resolve it this way.
What we need is to find a condition that could be used for empty list of values in combination with NOT IN and would work preferably for both ANSI NULLS enabled or disabled. Solution with COALESCE (when extended by check for NULL) might work for ANSI NULLS enabled, but what if they are disabled? So far I am inclined to solution with the special constant, but if there was another solution without described negative side effects, I would prefer that one.
From my point of view I would not deal with empty array as "NULL" in "IN". If you want to filter by NULL, you would use "array(NULL)".
@hrach I am not sure if you wanted to react on me in your last comment. In case you did, you probably missed the point again, in which case I suggest we discuss the matter in czech. I agree that adding more NULLs only complicates things here. Hence my inclination to solution using special constant or subquery that returns empty result set (whichever is more efficient).
Maybe I found the solution. For empty arrays, following replacement would occur:
WHERE expression NOT IN ()
would be replaced with: WHERE expression IS NOT NULL
and
WHERE expression IN ()
would be replaced with: WHERE FALSE
Or did I miss something?
Souhlas s češtinou :-)
Vše je o situacích:
1. expr IN ()
2. expr NOT IN ()
3. expr IN (NULL) // db's decision
4. expr NOT IN (NULL) // db's decision
5. expr IN (1, 2, 3) // db's decision
6. expr NOT IN (1, 2, 3) // db's decision
A celá naše diskuze je o tom, že se snažíme opravit SQL chování (syntax error) pro "prázdnou množinu", tedy případ 1 a 2.
Myslím, že 1. by mělo být vždy FALSE
, protože množina je prázdná a není v ní ani nějaká neznámá hodnota. A 2. situace vždy TRUE
ze stejného důvodu. Když si to uživatel bude chtít změnit, může si přidat další podmínku na přidání/odebrání NULL sloupců.
@tomaswindsor: @milo solution is right, because it no longer solve 1st / 2nd as 3rd / 4th expression - which is the cutting edge.
Máš pravdu. Z nějakého důvodu jsem měl za to, že když je expr NULL
, tak to v případech 1 a 2 vrací NULL
, ale teď jsem si ověřil, že to tak není. Jediné, co mě ještě napadá, že by mohl být problém, je situace, kdy expr by obsahovalo něco s vedlejším efektem, např. (@i:=@i+1)
. To by šlo vyřešit tak, že v případě 1 by se použilo (expr) AND FALSE
a v případě 2 (expr) OR TRUE
.
I made a pull request https://github.com/nette/nette/pull/869. If you agree, we can continue discuss there.
Currently, when using
$db->table('book')->where('NOT id', array())
or$db->table('book')->where('id NOT', array())
it will result in querySELECT * FROM book WHERE NOT ID IN (NULL)
orSELECT * FROM book WHERE ID NOT IN (NULL)
which will both return no rows because of special behavior ofNULL
. Same problem occurs for MySQL subqueries in$db->table('book')->where('NOT id', $subQuery)
, because Nette\Database performs optimization and replaces the actual subquery with its result, and if this result is empty, it will use(NULL)
.Solution could be replacing the
(NULL)
with some kind of query that returns an empty result (no idea about that one). Alternatively instead of empty list, we could use list containing special unique value, e.g.WHERE NOT ID IN ('~~NDB~NO~VALUES~~')
. Solution with simply removing whole condition in case ofNOT
is insufficient becauseNULL
can appear also on the left side of theIN
operator, and removing whole condition would then cause different behavior - imagine$db->table('book')->where('NOT NULL', array())
- I know this example doesn't look to be of much use, but keep in mind thatNULL
can be result of some sql function used in the condition as well...