Open Ten0 opened 6 days ago
= ANY (VALUES (ARRAY['foo']), (ARRAY['bar']))
works, so it looks like as a VALUES
subquery in the Many
specialization would also work (and maybe generate query plans that match better the = ANY (ARRAY)
case but not sure)...
The main question is: how to detect that you actually have an Array of Array.
I thought about:
ToSql
AST pass (notably for debug reasons)SqlType
trait.
a. We could have that as a type like is done for IsNullable
but that may not be worth the additional complexity (and/or breaking changes), considering that "associated type defaults are unstable
(https://github.com/rust-lang/rust/issues/29661)"... but maybe that's fine because that is mostly (always?) derived anyway?
b. It doesn't necessarily need to be a type though: even if it's just an associated const we could use it in the QueryFragment
implementation. If it's just an associated const we can't use that to restrict compilation of additional array nesting though, because "associated const equality is incomplete
(https://github.com/rust-lang/rust/issues/92827)", so if it's meant to actually be a type then we probably shouldn't do this...A workaround that uses the VALUES (...), (...)
writing:
use {
diesel::{
backend::Backend,
expression::{
array_comparison::{AsInExpression, MaybeEmpty},
AsExpression, Expression, TypedExpressionType, ValidGrouping,
},
prelude::*,
query_builder::{AstPass, QueryFragment, QueryId},
serialize::ToSql,
sql_types::{HasSqlType, SingleValue, SqlType},
},
std::marker::PhantomData,
};
/// Fixes `eq_any` usage with arrays of arrays:
/// https://github.com/diesel-rs/diesel/issues/4349
///
/// Usage: `column.eq_any(AsSingleColumnValueSubselect(iterator_of_arrays))`
pub struct AsSingleColumnValueSubselect<I>(pub I);
impl<I, T, ST> AsInExpression<ST> for AsSingleColumnValueSubselect<I>
where
I: IntoIterator<Item = T>,
T: AsExpression<ST>,
ST: SqlType + TypedExpressionType,
{
type InExpression = SingleColumnValuesSubselect<ST, T>;
fn as_in_expression(self) -> Self::InExpression {
SingleColumnValuesSubselect {
values: self.0.into_iter().collect(),
p: PhantomData,
}
}
}
#[derive(Debug, Clone)]
pub struct SingleColumnValuesSubselect<ST, I> {
/// The values contained in the `= ANY ()` clause
pub values: Vec<I>,
p: PhantomData<ST>,
}
impl<ST, I, GB> ValidGrouping<GB> for SingleColumnValuesSubselect<ST, I>
where
ST: SingleValue,
I: AsExpression<ST>,
I::Expression: ValidGrouping<GB>,
{
type IsAggregate = <I::Expression as ValidGrouping<GB>>::IsAggregate;
}
// This implementation is fake, it's not an expression, but it's used to check consistency between the SQL types
// of both sides of the `= ANY ()` clause.
impl<ST, I> Expression for SingleColumnValuesSubselect<ST, I>
where
ST: TypedExpressionType,
{
type SqlType = ST;
}
impl<ST, I> MaybeEmpty for SingleColumnValuesSubselect<ST, I> {
fn is_empty(&self) -> bool {
self.values.is_empty()
}
}
impl<ST, I, QS> SelectableExpression<QS> for SingleColumnValuesSubselect<ST, I>
where
SingleColumnValuesSubselect<ST, I>: AppearsOnTable<QS>,
ST: SingleValue,
I: AsExpression<ST>,
<I as AsExpression<ST>>::Expression: SelectableExpression<QS>,
{
}
impl<ST, I, QS> AppearsOnTable<QS> for SingleColumnValuesSubselect<ST, I>
where
SingleColumnValuesSubselect<ST, I>: Expression,
I: AsExpression<ST>,
ST: SingleValue,
<I as AsExpression<ST>>::Expression: SelectableExpression<QS>,
{
}
impl<ST, I, DB> QueryFragment<DB> for SingleColumnValuesSubselect<ST, I>
where
DB: Backend + HasSqlType<ST>,
ST: SingleValue,
I: ToSql<ST, DB>,
{
fn walk_ast<'b>(&'b self, mut out: AstPass<'_, 'b, DB>) -> QueryResult<()> {
if self.values.is_empty() {
return Err(diesel::result::Error::QueryBuilderError(
"Cannot generate an empty `= ANY (VALUES (...), (...))` clause - avoid making a query in that case"
.into(),
));
}
out.unsafe_to_cache_prepared();
out.push_sql("VALUES ");
let mut first = true;
for value in &self.values {
if first {
first = false;
} else {
out.push_sql(", ");
}
out.push_sql("(");
out.push_bind_param(value)?;
out.push_sql(")");
}
Ok(())
}
}
impl<ST, I> QueryId for SingleColumnValuesSubselect<ST, I> {
type QueryId = ();
const HAS_STATIC_QUERY_ID: bool = false;
}
I've pushed #4350 with another approach to fix that problem, but I'm open to different solutions.
As a more general note: I wonder if it would be meaningful (and possible) to have some sort of fuzzer that takes the documentation and keeps generating queries via the DSL and then checks if they compile and if they do not return an syntax error while executing them.
As a more general note: I wonder if it would be meaningful (and possible) to have some sort of fuzzer that takes the documentation and keeps generating queries via the DSL and then checks if they compile and if they do not return an syntax error while executing them.
This looks interesting. However I feel like it's probably fine if Diesel doesn't cover for mistakes people never make, and there ought to be some of those that the type system allows but the complexity of preventing isn't worth the probability of occurrence... If this is true, then I'm not sure what would be a strategy to create a general test system that would test only "relevant" cases but still be able to infer them automatically.
Setup
Versions
Feature Flags
Problem Description
array.eq_any(array_of_array)
doesn't work on Postgres, because Postgres doesn't have such a thing as "array of array". It has 2D array but "2-D array is a different concept". This causes an error, so we should probably use the non-specialized ANSIIN (...)
when dealing with arrays.What are you trying to accomplish?
What is the actual output?
What is the expected output?
Either of:
IN ($1, $2, $3)
with$1 = ["foo"]
....eq_any(
abc).ansi()
) to tell it to use ANSI serializationATM AFAICT there's no way to even tell it to use ANSI serialization because of bounds here: https://github.com/diesel-rs/diesel/blob/ce6fa0bd93877391665254d9bdb23a9c895902ef/diesel/src/expression/array_comparison.rs#L100-L103 which forbids the use of the ANSI implementation on
Pg
Backend
(which seems weird since ANSI should always work, no?)(NB: All of this is useful only to the extent to which indexing will work and this behaves reasonably even if there are 500 different values in there. I recall that we did tend to prefer
= ANY
maybe because of this but I may be mistaken here, was it only for statement caching ?)Steps to reproduce
https://github.com/Ten0/diesel/commit/d6c93fd2a8af99df9bc8dc362144e6f599c6030c
or dedicated working branch to fixing this: