hannes / MonetDBLite

MonetDB reconfigured as a library
108 stars 11 forks source link

Coercion of logical to double #204

Open tslumley opened 6 years ago

tslumley commented 6 years ago

In the MonetDBLite backend to dbplyr, both automatic and explicit coercion of logical to double fails. For example, with

library(tidyverse)
library(MonetDBLite)
ms<-src_monetdblite("/tmp")
iris<-copy_to(ms, iris)

The following both fail with an error message Server says 'ParseException:SQLparser:types boolean(1,0) and decimal(18,3) are not equal'

iris %>% mutate(is_setosa=Species=="setosa") %>% summarise(mean(is_setosa)) %>% collect()
iris %>% mutate(is_setosa=as.numeric(Species=="setosa")) %>% summarise(mean(is_setosa)) %>% collect()

Based on https://www.monetdb.org/pipermail/users-list/2013-May/006344.html I tried

iris %>% mutate(is_setosa=as.numeric(as.integer(Species=="setosa"))) %>% summarise(mean(is_setosa)) %>% collect()

which did work. Is it possible to get this work-around into the SQL translator? It doesn't look trivial because you don't want to use as.numeric(as.integer()) on something that's already double precision.

hannes commented 6 years ago

This looks to me like the implicit cast from BOOLEAN to NUMERIC is not supported on the SQL side. It works with an explicit cast though: iris %>% mutate(is_setosa=Species=="setosa") %>% summarise(mean(as.integer(is_setosa))) %>% collect() Interestingly, it did not work with as.numeric(). If this is to be fixed, that needs to happen deep down on the plan generation level.

The query that is generated is something like SELECT AVG("is_setosa") AS "mean(is_setosa)" FROM (SELECT "Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width", "Species", "Species" = 'setosa' AS "is_setosa" FROM "iris") "diwurlskfe"