OHDSI / Andromeda

AsynchroNous Disk-based Representation of MassivE DAta: An R package aimed at replacing ff for storing large data objects.
https://ohdsi.github.io/Andromeda/

Need to re-arrange ORDER BY? #25

Closed msuchard closed 3 years ago

msuchard commented 3 years ago

Below is a warning I am seeing in the Cyclops unit tests. Is this an error in Andromeda? @schuemie

Warning (test-dataConversionStratified.R:60:3): Test stratified cox
ORDER BY is ignored in subqueries without LIMIT
ℹ Do you need to move arrange() later in the pipeline or use window_order() instead?
Backtrace:
  1. Cyclops::convertToCyclopsData(...) test-dataConversionStratified.R:60:2
  2. Cyclops::convertToCyclopsData.tbl_dbi(...) /Users/msuchard/Dropbox/Projects/cyclops/R/NewDataConversion.R:90:4
  3. Andromeda::batchApply(covariates, loadCovariates, batchSize = 1e+05) /Users/msuchard/Dropbox/Projects/cyclops/R/NewDataConversion.R:350:4
  5. dbplyr:::sql_render.tbl_lazy(tbl, connection)
  7. dbplyr:::sql_render.op(query$ops, con = con, ..., subquery = subquery)
  9. dbplyr:::sql_render.select_query(qry, con = con, ..., subquery = subquery)
 18. dbplyr:::sql_render.join_query(query$from, con, ..., subquery = TRUE)
 27. dbplyr:::sql_render.tbl_lazy(query$y, con, ..., subquery = TRUE)
 29. dbplyr:::sql_render.op(query$ops, con = con, ..., subquery = subquery)
 31. dbplyr:::sql_render.select_query(qry, con = con, ..., subquery = subquery)
 32. dbplyr:::dbplyr_query_select(...)
 33. dbplyr:::dbplyr_fallback(con, "sql_select", ...)
 35. dbplyr:::sql_select.DBIConnection(con, ...)
 37. dbplyr:::sql_query_select.DBIConnection(...)
 39. dbplyr:::sql_clause_order_by(con, order_by, subquery, limit)
 40. dbplyr:::warn_drop_order_by()

schuemie commented 3 years ago

This is not really an Andromeda issue. Somewhere arrange() is being called (as a lazy operation) on an Andromeda table, and other operations are applied to it afterwards. Those later operations can change the row order, so the arrange() call may have no effect.
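
For illustration only, here is a minimal sketch of how a lazy arrange() can end up inside a subquery. It assumes a plain in-memory SQLite database as a stand-in for an Andromeda object, and the table and column names (covariates, outcomes, rowId, covariateValue, y) are made up for the example. Whether the warning actually fires may depend on the dbplyr version, so the code only records any warnings rather than relying on them.

```r
library(dplyr)

# Hypothetical stand-in for an Andromeda object: an in-memory SQLite database.
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(con, "covariates",
                  data.frame(rowId = c(3L, 1L, 2L), covariateValue = c(0.5, 1.0, 2.0)))
DBI::dbWriteTable(con, "outcomes",
                  data.frame(rowId = c(1L, 2L, 3L), y = c(0L, 1L, 0L)))

outcomes <- tbl(con, "outcomes")

# arrange() is lazy: nothing is sorted yet, the ordering is only recorded.
covariates <- tbl(con, "covariates") %>% arrange(rowId)

# Joining afterwards pushes the arranged table into a subquery, where
# dbplyr may drop the ORDER BY.
earlyArrange <- outcomes %>% inner_join(covariates, by = "rowId")

# Record (rather than assume) any warnings raised while rendering the SQL.
warningsSeen <- character()
withCallingHandlers(
  dbplyr::sql_render(earlyArrange),
  warning = function(w) {
    warningsSeen <<- c(warningsSeen, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
print(warningsSeen)

# Applying arrange() as the last step keeps the ORDER BY in the outer query.
lateArrange <- outcomes %>%
  inner_join(tbl(con, "covariates"), by = "rowId") %>%
  arrange(rowId)
result <- collect(lateArrange)

DBI::dbDisconnect(con)
```

Here lateArrange is the variant the warning message hints at: the arrange() moves to the end of the pipeline, so nothing downstream can discard it.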

In what specific context does this warning occur?

msuchard commented 3 years ago

The warnings are occurring here: https://github.com/OHDSI/Cyclops/blob/75ab6624719e7bf193e744ad6061962c7a008d5c/R/NewDataConversion.R#L350

There are a number of (important) covariates <- covariates %>% arrange() calls before the batchApply(), but as far as I can see there are no arrange() calls after this point (i.e. in loadCovariates and further downstream).

Might I need to force execution of the earlier covariates <- covariates %>% arrange() statements? And if so, how do I do this?
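
As a rough sketch of what "forcing execution" can look like in general dplyr/dbplyr terms (not necessarily the fix that was adopted for Cyclops), collect() runs the lazy query and returns the rows in memory, while compute() runs it and stores the result in a temporary table on the database side. The example assumes an in-memory SQLite database as a stand-in for an Andromeda object, with made-up table and column names. Note that SQL does not guarantee that a stored table is read back in insertion order, so an ordering that matters is safest applied in the final query.

```r
library(dplyr)

# Hypothetical stand-in for an Andromeda object: an in-memory SQLite database.
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(con, "covariates",
                  data.frame(rowId = c(3L, 1L, 2L), covariateValue = c(0.5, 1.0, 2.0)))

# Lazy: this only builds a query, no sorting has happened yet.
lazySorted <- tbl(con, "covariates") %>% arrange(rowId)

# Option 1: execute now and pull the sorted rows into an in-memory tibble.
inMemory <- lazySorted %>% collect()

# Option 2: execute now and keep the result in a temporary database table.
materialized <- lazySorted %>% compute()
rowCount <- materialized %>% count() %>% pull(n)

DBI::dbDisconnect(con)
```

collect() is only practical when the result fits in memory, which may not hold for the large tables Andromeda is meant for; compute() keeps the data on disk but leaves the caveat about row order in place.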

msuchard commented 3 years ago

Solved by @schuemie here: https://github.com/OHDSI/Cyclops/issues/52