Open ricardo-barreira opened 11 months ago
Hi all, just wanted to follow up on this issue. Any updates would be greatly appreciated. Thanks!
Hi, I am facing the same error with another expectation:
expect_column_pair_values_to_be_equal expect_compound_columns_to_be_unique expect_select_column_values_to_be_unique_within_record
It will be really helpful to fix it
Same here, and it's a blocker, holding us back from integrating Great Expectations.
I've looked a bit into the code and noticed a few things that might help investigating.
First, as described above, the generated query contains the clause ... THEN CAST(%(param_1)s AS Decimal(None, None)) ELSE CAST(%(param_2)s AS Decimal(None, None)) ...
. ClickHouse obviously doesn't understand the None
s here, which are passed for precision
and scale
. As you can see in the clickhouse-sqlalchemy
driver code, there are no None
checks in the type compiler:
def visit_numeric(self, type_, **kw):
return 'Decimal(%s, %s)' % (type_.precision, type_.scale)
So one thing to discuss is whether or not these checks should be done in the driver or if sqlalchemy or Great Expectations should properly populate these fields.
Next thing I noticed: these casts are really just being done for a weird "true"/"false" conversion for Redshift: https://github.com/great-expectations/great_expectations/blob/f1c565980a6bc46e465134bba020d64fb8e9c81f/great_expectations/expectations/metrics/map_metric_provider/map_condition_auxilliary_methods.py#L305-L312 So the bug might not even manifest itself if it wasn't for this workaround.
We would highly appreciate if someone with deeper knowledge of GE and sqlalchemy could take a look at this 🙏
I managed to work around these issues by monkey-patching the ClickHouseTypeCompiler
for the Numeric type:
def _patched_visit_numeric(self, type_, **kw):
if type_.precision is None:
return "Decimal"
if type_.scale is None:
return "Decimal(%s)" % type_.precision
return "Decimal(%s, %s)" % (type_.precision, type_.scale)
ClickHouseTypeCompiler.visit_numeric = _patched_visit_numeric
This leads me to believe that it's indeed a shortcoming of the clickhouse-sqlalchemy
package. I'll raise the issue there and contribute this fix.
While we're at it, could we perhaps also change the clickhouse
extra to be sqlalchemy2 compatible here?https://github.com/great-expectations/great_expectations/blob/f5d4500fe434d3b25ee24a65321778b241039271/setup.py#L18-L35
It claims to support sqlalchemy 2 since release 0.3.0
(see Changelog). Otherwise, this will hold back any fixes we contribute to clickhouse-sqlalchemy
.
@Gerrit-K I wanted to ask, you still looking for contrib in clickhouse-sqlalchemy? If you are busy or not interested I can do it by myself because its blocking me too. About sqlalchemy 2 support for clickhouse, I think we need implement lots of tests before switching from 1.x, otherwise we can face same tons of bugs, clickhouse support still weak :(
Hey @matveykortsev, sorry for the delay. I initially planned to do this but then got caught up with other things. It'll take a week or two until I will be able to take a shot at it, so if you want to take over, you're more than welcome!
Regarding sqlalchemy 2, that's a fair point. I already assumed it might not be as easy as just moving the version definition :/
Description When using the expect_column_values_to_be_unique expectation with ClickHouse as the backend, I encountered an error related to the Decimal data type. The error message suggests there's an issue with how Great Expectations is generating the SQL query for this specific expectation. The issue seems to be related to the _sqlalchemy_window method in the column_values_unique.py Metric.
To Reproduce Run the following code:
Generated datasource in the great_expectations.yml:
Error stack trace:
Expected behavior The expectation should run without any errors.
Environment: