Closed felipeangelimvieira closed 3 years ago
Hi @felipeangelimvieira. We have approximately 3399 files available in geobr, so it would not be possible to include all of them in the package due to the size constraints imposed by CRAN policy. I'm now investigating which data sets are the most popular, and then I'll check whether we can include a few of them in the package.
I assume the most frequently used data sets are municipalities and states in 2010, but we'll see.
@JoaoCarabetta, do you know how this could be done for the Python version?
We can store some data with the package. Then, I just need to tweak the download function to choose the cached data instead.
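A minimal sketch of that idea, assuming a hypothetical `load_dataset` helper, a hypothetical `bundled_dir` folder shipped with the package, and a `download` callable standing in for the existing download function (none of these names come from the geobr code base):

```python
from pathlib import Path
from typing import Callable


def load_dataset(name: str, bundled_dir: Path, download: Callable[[str], bytes]) -> bytes:
    """Return the packaged copy of `name` if one exists, else download it.

    `bundled_dir` and the ".bin" suffix are illustrative assumptions, not
    the layout used by the real package.
    """
    bundled = bundled_dir / f"{name}.bin"
    if bundled.is_file():
        # A copy shipped with the package: no request to the server is made.
        return bundled.read_bytes()
    # Not shipped with the package: fall back to the regular download path.
    return download(name)
```

With this shape, only the data sets chosen for bundling skip the server; everything else keeps using the current download path unchanged.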
Good to know. Once I decide on the data sets we can include in the package, I'll post an update here.
Just for a test, I've saved the municipality data with simplified borders as a compressed .rda file, and the file is over 9 MB. This is too large. CRAN policies require that the package is up to 5 MB max.
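library(geobr)
library(tools)
# Download all municipalities with simplified borders
df <- read_municipality(code_muni = 'all', simplified = TRUE, showProgress = TRUE)
# Save with maximum xz compression, then inspect the resulting file
save(df, file = 'munis_2010.rda', compress = 'xz', compression_level = 9)
checkRdaFiles('.')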
> size ASCII compress version
> ./munis_2010.rda 9774744 FALSE xz 3
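As a rough check of the numbers above, the reported size in bytes can be compared with the 5 MB limit quoted in this comment. The snippet below is only an arithmetic illustration; treating 1 MB as 10^6 bytes is an assumption, and the limit value is taken from the comment rather than from the CRAN policy text.

```python
# Size reported by checkRdaFiles() for munis_2010.rda, in bytes.
size_bytes = 9_774_744

# Limit quoted in the comment, assuming 1 MB = 10**6 bytes.
limit_bytes = 5 * 10**6

size_mb = size_bytes / 10**6
ratio = size_bytes / limit_bytes
over_limit = size_bytes > limit_bytes

print(f"{size_mb:.2f} MB, {ratio:.2f}x the limit, over limit: {over_limit}")
```

Under these assumptions the file is about 9.77 MB, close to twice the limit, so this single data set would not fit even at maximum xz compression.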
I see... so relying on the server may be the only option. I don't know the details of IPEA's IT infrastructure, but using a blob storage service such as Azure Blob Storage or Amazon S3 may avoid instability problems.
We have considered those options, but using Ipea's IT infrastructure is cheaper and gives us more speed and flexibility to make data updates and fixes. Our IT staff have made a few updates recently, so I hope we won't be facing instabilities any time soon.
Closing this issue considering the results below.
Just for a test, I've saved the municipality data with simplified borders as a compressed .rda file, and the file is over 9 MB. This is too large. CRAN policies require that the package is up to 5 MB max.
library(geobr)
library(tools)
df <- read_municipality(code_muni = 'all', simplified = TRUE, showProgress = TRUE)
save(df, file = 'munis_2010.rda', compress = 'xz', compression_level = 9)
checkRdaFiles('.')
> size ASCII compress version
> ./munis_2010.rda 9774744 FALSE xz 3
Hello, thank you for the package. It has been very useful to me.
For the past few weeks, the servers have been unstable and some functions such as read_state aren't working as expected. Would it be possible to keep those data frames as binary data in the R package?
Reference: https://r-pkgs.org/data.html?q=data#data-sysdata section 14.2
Thank you