Open Jhanbanan opened 8 years ago
Hey guys - I'm seeing this issue also. I'm trying to join on 2 partitioned columns which are varchar(50), but it fails with the following error:
ERROR: cannot perform local joins that involve expressions
DETAIL: local joins can be performed between columns only
Hey @Pcummings,
Thank you for your interest in Citus. Until we fix this issue, you can use text columns instead of varchar columns as a workaround.
Could you post your table schema and an example query -- so that when we fix this issue, we can make sure that we addressed your issue? (edited the question)
Here's the list of places I found questionable in my code search:
- `modify_planner`
  - `QueryRestrictList`: I questioned whether the `Const` or `Var` might ever be a `RelabelType`, but couldn't articulate how
  - `ErrorIfQueryNotSupported`: I thought trying to `SET` a `VARCHAR` column in an `UPDATE` expression might trigger the `!IsA(targetEntry->expr, Const)` check and erroneously reject the command
- `multi_logical_optimizer`
  - `TransformSubqueryNode` performs an `IsA` check on `groupByExpression`, which could have problems with relabel types. Might be triggered by a `GROUP BY` on a `VARCHAR` column, though maybe only in a subquery?
  - `WorkerAggregateWalker` also performs an `IsA` check on a node. Might be triggered by `ORDER BY` and aggregates of a `VARCHAR` type column
  - `AggregateDistinctColumn` will probably fail with relabel types (`VARCHAR`) in an aggregate with a `DISTINCT` clause
  - `GroupedByColumn` looks like we might fail to push down `VARCHAR` groupings
  - `IsPartitionColumnRecursive` seemed to have a problem. I think #426 fixes it
  - `PartitionColumnOpExpressionList` seemed to have a problem. I think #426 fixes it
- `multi_physical_planner`
  - `CheckJoinBetweenColumns` has a problem; #426 fixes it.

I believe #426 fixes something that wasn't even in the codebase when I initially researched #76, so more instances could have crept in during that interim period. Basically I searched for anywhere we did `IsA(node, Var)` but didn't call `strip_implicit_coercions` on `node` first. In any of those places, it's possible a `RelabelType` wrapper is wrapping the underlying `Var`, so the `IsA` check will fail.
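To make the pattern concrete, here is a minimal, self-contained C sketch of why a tag check on the raw node misses a relabeled column. Everything in it (`ToyNode`, `TOY_IS_A`, `toy_strip_relabels`, and the two `toy_is_column_*` helpers) is a hypothetical stand-in, not PostgreSQL or Citus code. It only mimics the shape of an `IsA` tag comparison and the relabel-peeling part of what `strip_implicit_coercions` does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for node tags; NOT the real PostgreSQL definitions. */
typedef enum { TOY_VAR, TOY_CONST, TOY_RELABEL } ToyTag;

typedef struct ToyNode {
    ToyTag tag;
    struct ToyNode *arg; /* wrapped expression, used only by TOY_RELABEL */
    int attno;           /* column number, used only by TOY_VAR */
} ToyNode;

/* Mimics the shape of IsA(node, Var): a plain tag comparison. */
#define TOY_IS_A(node, t) ((node)->tag == (t))

/* Assumed simplification: peel relabel wrappers until something else
 * is reached. The real function handles more coercion node types. */
ToyNode *toy_strip_relabels(ToyNode *node)
{
    assert(node != NULL);
    while (node->tag == TOY_RELABEL && node->arg != NULL)
        node = node->arg;
    return node;
}

/* The questionable pattern: tag check on the raw node. */
bool toy_is_column_strict(ToyNode *node)
{
    return TOY_IS_A(node, TOY_VAR);
}

/* The suggested pattern: strip first, then check the tag. */
bool toy_is_column_stripped(ToyNode *node)
{
    return TOY_IS_A(toy_strip_relabels(node), TOY_VAR);
}
```

In this model, a column node wrapped in a relabel node is rejected by `toy_is_column_strict` but accepted by `toy_is_column_stripped`, while a constant is rejected by both.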
@aamederen, can you look over the places I mentioned above and possibly see whether you can trigger any of the codepaths? If so, can you try adding `strip_implicit_coercions` calls to see whether the bugs can be easily fixed? These changes don't necessarily need to be pushed to #426, since that looks to fix something useful in an isolated way, but I don't want to lose track of the other places I discovered.
See this commit for the places I dropped TODOs back when I looked into this issue: 92b837281b9795622b451a9a1cdb8d1b3a350a95
@jasonmp85 @aamederen Could we separate these two into separate issues? It seems like we have some reproducible and known bugs, for which @aamederen has fixes in his PR. We also have other potential issues, which we first need to test if there are bugs at all.
While all related, I'd prefer to get some of these known bugs fixed first, and maybe keep this issue open as a high-level tracking one for the other potential problems.
Sounds OK?
> While all related, I'd prefer to get some of these known bugs fixed first, and maybe keep this issue open as a high-level tracking one for the other potential problems.

Yup (I said as much above, though it was buried in the penultimate paragraph).

> Sounds OK?

Yup.
Reopening to track other questionable places in the code. @aamederen will update here with his findings.
When we try to do large table joins on varchar columns, we get an error of the form:
ERROR: cannot perform local joins that involve expressions
DETAIL: local joins can be performed between columns only
This is because we have a check in CheckJoinBetweenColumns() which requires both sides of the join clause to be plain 'Var' nodes (i.e. columns). Postgres wraps each varchar column in a RelabelType node to cast it to text, so the node's tag is T_RelabelType rather than T_Var and the check rejects the join.
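As a rough illustration of that failure mode, the sketch below models a join clause with two operands and compares a strict "both sides must be bare columns" check against one that peels relabel wrappers first. All names here (`SketchNode`, `SketchJoinClause`, `sketch_peel_relabels`, `sketch_join_ok_strict`, `sketch_join_ok_stripped`) are hypothetical and the node model is heavily simplified. This is not the actual CheckJoinBetweenColumns() code, only a sketch under those assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified node kinds; NOT the real PostgreSQL tags. */
typedef enum { SKETCH_VAR, SKETCH_RELABEL, SKETCH_FUNC } SketchTag;

typedef struct SketchNode {
    SketchTag tag;
    struct SketchNode *arg; /* input of a SKETCH_RELABEL or SKETCH_FUNC */
} SketchNode;

/* Models a join clause of the form "left = right". */
typedef struct SketchJoinClause {
    SketchNode *left;
    SketchNode *right;
} SketchJoinClause;

/* Peel relabel wrappers only. A function call is a genuine
 * expression, so it is deliberately left in place. */
SketchNode *sketch_peel_relabels(SketchNode *node)
{
    assert(node != NULL);
    while (node->tag == SKETCH_RELABEL && node->arg != NULL)
        node = node->arg;
    return node;
}

/* Strict check: rejects relabeled (e.g. varchar) columns. */
bool sketch_join_ok_strict(const SketchJoinClause *clause)
{
    return clause->left->tag == SKETCH_VAR &&
           clause->right->tag == SKETCH_VAR;
}

/* Relaxed check: accepts relabeled columns, still rejects
 * clauses where either side is a real expression. */
bool sketch_join_ok_stripped(const SketchJoinClause *clause)
{
    return sketch_peel_relabels(clause->left)->tag == SKETCH_VAR &&
           sketch_peel_relabels(clause->right)->tag == SKETCH_VAR;
}
```

The point of keeping `SKETCH_FUNC` in the model is that peeling relabels should not loosen the rule itself: a join on a function of a column is still treated as an expression and rejected by both checks.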