hannes / MonetDBLite

MonetDB reconfigured as a library
108 stars 11 forks source link

Random characters in queries #183

Closed RCura closed 6 years ago

RCura commented 7 years ago

When adding multiple filters from dplyr/R MonetDBLite adds random characters in the queries :

library(MonetDBLite)
library(dplyr)

dbdir <- file.path(tempdir(), "dplyrdir")
my_db_monetdb <- MonetDBLite::src_monetdblite(dbdir)
flights <- nycflights13::flights
flights$time_hour <- as.numeric( flights$time_hour )
flights_monetdb <- copy_to(my_db_monetdb, flights,overwrite = TRUE, temporary = FALSE)

flights_monetdb %>%
  filter(month == 1) %>%
  filter(day == 1) %>%
  show_query()

<SQL>
SELECT *
FROM (SELECT *
FROM "flights"
WHERE ("month" = 1.0)) "ptxhshcbir" # this string changes every time,  randomly
WHERE ("day" = 1.0)

MonetDBLite::monetdblite_shutdown()

Here, it's not a real problem as the query can still work. But when adding more filters, it becomes really messy :

flights_monetdb %>%
  filter(month == 1) %>%
  filter(day == 1) %>%
  filter(origin == "EWR") %>%
  filter(air_time <= 100) %>%
  show_query()

<SQL>
SELECT *
FROM (SELECT *
FROM (SELECT *
FROM (SELECT *
FROM "flights"
WHERE ("month" = 1.0)) "quakgfkwnr"
WHERE ("day" = 1.0)) "zvksmlgizc"
WHERE ("origin" = 'EWR')) "ptlzjexaos"
WHERE ("air_time" <= 100.0)

and in queries that I work on, this doesn't work at all anymore : the query isn't correct anymore and raises errors.

When filtering in multiple attributes in one filter, it works as it should :

flights_monetdb %>%
  filter(month == 1, day == 1, origin == "EWR", air_time <= 100) %>%
  show_query()

<SQL>
SELECT *
FROM "flights"
WHERE (("month" = 1.0) AND ("day" = 1.0) AND ("origin" = 'EWR') AND ("air_time" <= 100.0))

In my application, I have to rely on multiple filters put on multiple objects, and can't write the whole filter at once. So, this is a really big problem for my app :/

hannes commented 7 years ago

This is actually expected behaviour by dplyr, these random characters are names for the subquery result. Could you try to make a reproducible example of queries that don't work any more because that should not happen.

RCura commented 7 years ago

Can't reproduce the bug: :/ I changed a lot of things considering the way I call my queries, so, must have been a bad combination of DBI/dbplyr/MonetDBLite.

Especially, you might want to investigate the differences of the way R is connecting to the database. MonetDBLite::src_monetdblite("path") and DBI::dbConnect(MonetDBLite::MonetDBLite(), "path") produce different objects, without the same capacities.

For example, you can't disconnect a connection made through src_monetdblite(), and thus you need to shutdown the server using MonetDBLite::monetdblite_shutdown() . Also, some connectors allow uppercase tables and fields while some others don't and fail/produce a corrupted DB.

The results are also different considering the way to copy data in the db : copy_to and DBI::dbWriteTable().

I don't have the time to investigate later on this right now, but having one good and supported way to use MonetDBLite would for sure be an improvement.

Anyway, I finally succeeded using MonetDBLite as a backend for a shiny app dealing with lots of data, and that's a huge improvement in performance and possibilities over the previously used SQLite. Kudos and cheers !