apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0

Coercion from `Dictionary(UInt32, Boolean)` to `Boolean` #12511

Open samuelcolvin opened 2 months ago

samuelcolvin commented 2 months ago

I'm trying to run a query like

select count(*) from records where json_data_colum ? 'foo' and true

with DataFusion, and I'm getting an error:

type_coercion caused by Error during planning: Cannot infer common argument type for logical boolean operation Dictionary(UInt32, Boolean) AND Boolean

(Note: in our case `json_data_colum` is a dictionary column containing JSON strings, so `json_data_colum ? 'foo'` returns a `Dictionary(UInt32, Boolean)`.)

This is related to https://github.com/apache/datafusion/pull/12382 where @adriangb fixed the case of filtering on a dictionary column.

I'll fix this specific case in https://github.com/datafusion-contrib/datafusion-functions-json, but I guess the query should work without that fix, i.e. the coercion to `Boolean` should be handled during planning.

alamb commented 2 months ago

I think type coercion is handled in DataFusion -- do you mind if I move this ticket to the datafusion repo?

adriangb commented 2 months ago

Please go ahead 😄

FWIW I think this is another case where the argument could be made that the producer of said array should do the coercion because a dict of bools never makes sense to create in the first place.

alamb commented 2 months ago

I moved the ticket -- and I agree: I would suggest spending time updating the producer to avoid Dict(bool) rather than having DataFusion do the coercion.
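
For anyone following along, here is a rough sketch of what "the producer avoids Dict(bool)" amounts to: before returning the result, the producer resolves each dictionary key to its value and hands back a plain nullable boolean column. This is a std-only model, not the arrow-rs API; `DictBoolArray` and `unpack` are hypothetical names made up for illustration, and a real implementation would presumably go through Arrow's own array types and cast support instead.

```rust
/// Hypothetical, simplified stand-in for a `Dictionary(UInt32, Boolean)` array.
/// `keys[i]` is an index into `values`; `None` marks a null row.
struct DictBoolArray {
    keys: Vec<Option<u32>>,
    values: Vec<bool>,
}

impl DictBoolArray {
    /// Materialize the dictionary into a plain nullable boolean column.
    /// Returns `None` if any key points outside `values`.
    fn unpack(&self) -> Option<Vec<Option<bool>>> {
        self.keys
            .iter()
            .map(|key| match key {
                // A null key stays a null row in the output.
                None => Some(None),
                // A non-null key is resolved through the dictionary values.
                Some(k) => self.values.get(*k as usize).map(|v| Some(*v)),
            })
            .collect()
    }
}

fn main() {
    // Models the result of `json_data_colum ? 'foo'` as a dictionary of bools.
    let dict = DictBoolArray {
        keys: vec![Some(0), Some(1), None, Some(0)],
        values: vec![true, false],
    };

    // Once unpacked, `<column> AND true` is an ordinary Boolean AND Boolean.
    let plain = dict.unpack().expect("all keys are in range");
    let literal = true;
    let result: Vec<Option<bool>> = plain
        .iter()
        .map(|row| row.map(|b| b && literal)) // null rows stay null
        .collect();

    println!("{:?}", result);
}
```

With only two possible values, the dictionary encoding saves nothing over a plain boolean column, which is the point made above about a dict of bools not being worth creating in the first place.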