When publishing a data frame containing a column of nested data frames to Connect, the index.html seems to contain all of the data from the nested columns, resulting in problematically large files. On one actively used Connect instance, we found an index.html file that's over 1 GB.
The sample code below produces two pins: one with a very tall data frame, and one with a much smaller nested data frame. I published both to Connect and inspected the bundles to compare the size of the data in the .rds file with the size of the preview index.html file.
beavers_tall: a pinned 5,700,000 x 4 data frame
  beavers_tall.rds: 1.2 MB
  index.html: 7 KB
beavers_nested: a pinned 250 x 2 data frame
  beavers_nested.rds: 138 KB
  index.html: 2.9 MB
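It seems like any non-atomic column (such as a list-column of data frames) should be replaced with short preview strings when the data is serialized into the HTML preview, rather than being embedded in full.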
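To illustrate the kind of behavior suggested above, here is a minimal sketch of turning list-columns into short preview strings before rendering. The helper name `summarise_for_preview()` and the `<data.frame [rows x cols]>` string format are hypothetical and are not part of pins; this is one possible approach under those assumptions, not how pins builds its preview.

```r
# Hypothetical helper (not part of pins): replace each list-column with a
# short text summary per element, so a preview never embeds the nested data.
summarise_for_preview <- function(df) {
  # List-columns only; atomic columns (numeric, character, factor) are kept.
  is_list_col <- vapply(
    df,
    function(col) is.list(col) && !is.data.frame(col),
    logical(1)
  )

  # Assumed summary format, chosen for illustration.
  describe <- function(x) {
    if (is.data.frame(x)) {
      sprintf("<data.frame [%d x %d]>", nrow(x), ncol(x))
    } else {
      sprintf("<%s [%d]>", class(x)[1], length(x))
    }
  }

  for (name in names(df)[is_list_col]) {
    df[[name]] <- vapply(df[[name]], describe, character(1), USE.NAMES = FALSE)
  }
  df
}

# Small stand-in for the nested pin: 3 rows, each holding a copy of beaver1.
nested <- data.frame(n = 1:3)
nested$beavers <- rep(list(datasets::beaver1), 3)

preview <- summarise_for_preview(nested)
print(preview)
```

With something along these lines, the size of the preview would depend on the number of rows shown rather than on the size of the nested data.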
Sample code
library(datasets)
library(dplyr)
library(magrittr)
board <- pins::board_connect(auth = "envvar")
# Big data frame: 50,000 copies of beaver1 (114 rows each) stacked row-wise
beaver_list <- beaver1 %>%
  list() %>%
  rep(50000)
beavers <- dplyr::bind_rows(beaver_list)
pins::pin_write(board, beavers, name = "beavers_tall", description = "Beavers Tall")
# Nested data frame
# Make each nested data frame a little larger: 10 copies of beaver1 side by side
wide_beav <- dplyr::bind_cols(beaver_list[1:10])
wide_beaver_list <- wide_beav %>%
  list() %>%
  rep(250)
# 250 rows, with a list-column holding one wide data frame per row
beavers_within_beavers <- data.frame(n = 1:250)
beavers_within_beavers$beavers <- wide_beaver_list
pins::pin_write(board, beavers_within_beavers, name = "beavers_nested", description = "Beavers Nested")