koenniem closed this issue 3 years ago
Hi. Yes, DBIConnection is among the class of objects that cannot be exported to another R process. You can read more about it, and see other examples in https://cran.r-project.org/web/packages/future/vignettes/future-4-non-exportable-objects.html.
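One way to catch this early, as a hedged sketch: the future package has an option, future.globals.onReference, that makes it scan globals for references such as external pointers before they are sent to a worker. The in-memory SQLite database and the variable names below are assumptions for illustration; they are not taken from the original report.

```r
library(future)
library(DBI)

plan(multisession, workers = 2)

# Ask future to fail early when a global contains a reference
# (e.g. an external pointer) instead of shipping a broken object.
options(future.globals.onReference = "error")

# Assumption: a throwaway in-memory SQLite database stands in for the real one.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# The connection lives in the main R session only, so using it in a
# future is expected to raise an error; capture its message.
res <- tryCatch(
  value(future(dbListTables(con))),
  error = function(e) conditionMessage(e)
)
print(res)

dbDisconnect(con)
plan(sequential)
```

With the option left at its default, the problem only shows up later, when the worker tries to use the connection.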
Is this a limitation in doFuture or foreach? Is there a different kind of parallel looping function I can use with such a database pointer, e.g. future_apply?
No, it applies to all types of parallel processing in R, not just the ones in the future framework. Unfortunately, there's no solution to this.
FWIW, note that foreach w/ doFuture, future.apply, and furrr are all map-reduce APIs that build on top of the future framework. So, using foreach() %dopar% { ... }, future_lapply(), or future_map() is just a matter of taste - from a parallelization point of view they're all the same. For example, see my https://www.jottr.org/2020/12/19/future-eurobioc2020-slides/ talk.
Hope this clarifies it.
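To make the "matter of taste" point concrete, here is a minimal sketch that runs the same toy computation through all three APIs on the same future plan. The helper slow_sqrt() and the input xs are hypothetical names made up for this example.

```r
library(future)
library(foreach)
library(doFuture)
library(future.apply)
library(furrr)

# One plan serves all three APIs, since each builds on futures.
plan(multisession, workers = 2)
registerDoFuture()  # lets foreach's %dopar% run on the future plan

xs <- 1:4
# Hypothetical stand-in for a slow per-element task.
slow_sqrt <- function(x) {
  Sys.sleep(0.1)
  sqrt(x)
}

y1 <- foreach(x = xs) %dopar% slow_sqrt(x)  # foreach + doFuture
y2 <- future_lapply(xs, slow_sqrt)          # future.apply
y3 <- future_map(xs, slow_sqrt)             # furrr

# All three should return the same list of results.
stopifnot(isTRUE(all.equal(y1, y2)), isTRUE(all.equal(y2, y3)))

plan(sequential)
```

Because the workers are separate R sessions in every case, a database connection created in the main session would be equally unusable in all three.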
That's a great explanation, thanks! I hadn't found that list yet (only the short explanation on non-exportable objects), but that answers my question.
In case anyone else comes to this question from Google or the like, I worked around the problem by opening a new database connection inside the foreach loop and then closing it at the end. It's a tad slower, but I guess it's the only thing that works.
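A rough sketch of that workaround, assuming a file-based SQLite database that every worker can open by path. The file db_path, the table cars (from R's built-in dataset), and the query are all made up for illustration and are not the setup from the original report.

```r
library(future)
library(foreach)
library(doFuture)
library(DBI)

plan(multisession, workers = 2)
registerDoFuture()

# Assumption: a temporary SQLite file stands in for the real database.
db_path <- tempfile(fileext = ".sqlite")
con <- dbConnect(RSQLite::SQLite(), db_path)
dbWriteTable(con, "cars", cars)
dbDisconnect(con)

# Only the path (a plain string) is exported to the workers; each
# iteration opens its own connection and closes it before returning.
counts <- foreach(min_speed = c(10, 20), .combine = c) %dopar% {
  con <- DBI::dbConnect(RSQLite::SQLite(), db_path)
  n <- DBI::dbGetQuery(
    con,
    "SELECT COUNT(*) AS n FROM cars WHERE speed >= ?",
    params = list(min_speed)
  )$n
  DBI::dbDisconnect(con)
  n
}
print(counts)

plan(sequential)
```

Opening a connection per iteration is where the extra cost comes from; splitting the work into fewer, larger chunks would reduce the number of connections opened.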
When using a database pointer inside a foreach loop (using the doFuture backend), the database pointer becomes invalid.
Is this a limitation in doFuture or foreach? Is there a different kind of parallel looping function I can use with such a database pointer, e.g. future_apply? I suspect this is because plan(multisession) creates workers in separate R sessions, rendering the pointer invalid, but I'm not sure what I can do to resolve this.