traversc / qs

Quick serialization of R objects

qs apparently slower than rds when saving nested lists #90

Closed mabuimo closed 5 months ago

mabuimo commented 7 months ago

I observed that when you write a double-nested list object as a pin and save it as qs (hoping for efficiency gains) instead of rds, it takes substantially longer both to write the pin and to read it back when you pull it later.

However, if the list has only one level of nested objects, qs is faster than rds.

Unfortunately, I can't provide the reprex with the actual data because it is sensitive.

two_levels_nested_list <- tibble::lst(list_a, list_b, list_c)

tictoc::tic()
two_levels_nested_list |>
  pins::pin_write(
    board = my_board,
    name = "user/object",
    type = "rds", # or "qs"
    title = "Something",
    description = "Something"
  )
tictoc::toc()

# 60.741 sec elapsed - qs
# 3.638 sec elapsed - rds

Then I tested what happened if I just wrote a .rds or .qs file, without using pins:

> tic()
> lst(list_a, list_b, list_c) |> write_rds("a.rds")
> toc()
1.32 sec elapsed
> 
> tic()
> lst(list_a, list_b, list_c) |> qsave("b.qs")
> toc()
209.839 sec elapsed

I have also noticed that a.rds is 97.7 MB on disk, while b.qs is 667.1 MB.

Thanks

traversc commented 7 months ago

It's hard to say without a reproducible example. One possibility is that your lists contain a lot of ALTREP data (which CRAN does not allow qs to support). If you can mock up some data, I'd be happy to take a look.
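A mock-up along these lines could serve as a starting point. It is only a sketch: the names list_a, list_b and list_c mirror the report, but the helper make_inner, the sizes and the contents are invented for illustration and are not the reporter's data. The ids element uses seq_len(), which R (3.5 and later) stores as a compact ALTREP sequence, so it exercises the case mentioned above. The qs step is skipped if the package is not installed.

```r
# Hypothetical mock data: shapes, sizes and names are illustrative assumptions.
set.seed(1)

make_inner <- function(n = 1e4) {
  list(
    ids    = seq_len(n),                        # compact integer sequence (ALTREP)
    values = rnorm(n),                          # ordinary double vector
    labels = sample(letters, n, replace = TRUE) # ordinary character vector
  )
}

# Each list_* is itself a list of lists, giving two levels of nesting
# once they are combined below.
list_a <- replicate(5, make_inner(), simplify = FALSE)
list_b <- replicate(5, make_inner(), simplify = FALSE)
list_c <- replicate(5, make_inner(), simplify = FALSE)

two_levels_nested_list <- list(
  list_a = list_a,
  list_b = list_b,
  list_c = list_c
)

# Baseline: base R serialization.
rds_path <- tempfile(fileext = ".rds")
rds_time <- system.time(saveRDS(two_levels_nested_list, rds_path))[["elapsed"]]
rds_size <- file.size(rds_path)
cat("rds:", rds_time, "sec,", rds_size, "bytes\n")

# Same object through qs, only if the package is available.
if (requireNamespace("qs", quietly = TRUE)) {
  qs_path <- tempfile(fileext = ".qs")
  qs_time <- system.time(qs::qsave(two_levels_nested_list, qs_path))[["elapsed"]]
  qs_size <- file.size(qs_path)
  cat("qs: ", qs_time, "sec,", qs_size, "bytes\n")
}
```

If the timings and file sizes from this mock-up do not show the gap, adjusting make_inner to resemble the real element types (without the sensitive values) may help narrow down which kind of object triggers it.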

traversc commented 5 months ago

Closing for now. Please feel free to re-open at any time.