tidyverse / multidplyr

A dplyr backend that partitions a data frame over multiple processes
https://multidplyr.tidyverse.org

Problems with multidplyr and broom #9

Closed FabianRoger closed 5 years ago

FabianRoger commented 8 years ago

I can't reproduce the behaviour with a smaller dataset so I attach my original data. Sorry for that.

First, the original code, which worked fine:

myDF <- read.table("myDF.txt", sep = "\t")

myDF <- group_by(myDF, Sampling, BOT, Wells, rep) %>%
 summarize(maxOD = max(OD700)) %>%
 filter(maxOD >= 0.2) %>%
 inner_join(myDF, .)

fitDF <- myDF %>%
 group_by(Sampling, BOT, Wells, rep) %>%
 do(gompertz_fit = try(nls( OD700 ~ K * exp( -exp((( r * exp( 1)) / K) * (l - dhour) + 1)),
                    data = .,
                    start = list(K = 2, l = 30, r = 0.1)),
                    silent = T))

 filter(fitDF, class(gompertz_fit) == "nls") %>% tidy(., gompertz_fit)

Now the same code, but with multidplyr:

myDF_par <- partition(myDF, Wells)

fitDF_t <- myDF_par %>%
  group_by(Sampling, BOT, Wells, rep) %>%
  do(gompertz_fit = try(nls( OD700 ~ K * exp( -exp((( r * exp( 1)) / K) * (l - dhour) + 1)),
                    data = .,
                    start = list(K = 2, l = 30, r = 0.1)),
                    silent = T))

fitDF_par <- collect(fitDF_t)

The first thing that changed is that the following no longer works:

filter(fitDF_par, class(gompertz_fit) == "nls") 

although this does work:

filter(fitDF_t, class(gompertz_fit) == "nls") 

For the former, I found a workaround that does work:

filter(fitDF_par, length( unlist (gompertz_fit)) > 1)

but tidy() doesn't work with it:

filter(fitDF_par, length( unlist (gompertz_fit)) > 1) %>% tidy(., gompertz_fit)

and this doesn't work either, but probably for different reasons:

filter(fitDF_t, length( unlist (gompertz_fit)) > 1) %>% tidy(., gompertz_fit)

I think the last one has to do with the fact that the data are still distributed across the cores, so it may not be a bug (?). The other two might be?
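
One possible explanation for the first two failures, sketched in base R. It assumes (not confirmed anywhere in this thread) that collect() returns an ordinary, non-rowwise data frame, so that class(gompertz_fit) is evaluated on the whole list column rather than once per row. The data frame `df` and the toy model below are hypothetical stand-ins, not the attached data.

```r
# Hypothetical stand-in for the collected result: a plain data frame with a
# list column holding one successful fit and one failed try().
toy <- data.frame(x = 1:5, y = c(2.1, 3.9, 6.2, 7.8, 10.1))
fits <- list(
  try(nls(y ~ a * x, data = toy, start = list(a = 1)), silent = TRUE),
  try(stop("fit failed"), silent = TRUE)
)
df <- data.frame(id = 1:2)
df$gompertz_fit <- fits

# Evaluated column-wise, class() describes the list column itself,
# so a comparison with "nls" can never be TRUE.
col_class <- class(df$gompertz_fit)

# Likewise, this is a single logical for the whole column, not one per row.
whole_col <- length(unlist(df$gompertz_fit)) > 1

# Testing each element separately keeps only the rows with a real nls fit.
is_nls <- vapply(df$gompertz_fit, inherits, logical(1), what = "nls")
ok <- df[is_nls, ]
```

If that assumption holds, the length(unlist(...)) workaround would keep every row, including the failed fits, which might be why tidy() then fails. Regrouping the collected result row by row (for example with dplyr::rowwise()) before filtering would be another option to try.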

FabianRoger commented 8 years ago

Sorry, here is the file:

myDF.txt

hadley commented 5 years ago

I'm closing this because there's no reprex, it's four years old, and multidplyr is about to get overhauled.