deleted data not used in the package

kbenoit / sophistication

R package associated with Benoit, Munger and Spirling (2017) paper(s)


deleted data not used in the package #4

Closed kmunger closed 3 years ago

kbenoit commented 6 years ago

This would make replicating the chapter impossible... Since this is only a CRAN issue, let this sit while I think about the best way to offload the data. I also want to check that the Google corpus still won't break the 5 MB limit.

kmunger commented 6 years ago

Ok sounds good. I can make a separate repository for replicating the chapter?

And the Google corpus looks like it's only 3.7 MB; if we're parsimonious with the rest, that should be fine.

kbenoit commented 6 years ago

Best would be to make a replication Rmd file for the chapter, and see if that works as a vignette. We can then move that and the data objects on which it depends to a companion package that depends on sophistication, and that will not be on CRAN.

kmunger commented 6 years ago

The chapter file reads in the SCOTUS and CR data, which are about half a gig combined. I could considerably downsample these and put them in the main sophistication package, or we could put the larger files in the separate repository.

kbenoit commented 6 years ago

OK, that's too big for a package data object, even a non-CRAN one. We can either park them on a server and access them using

load(url("http://wherethedatais.server.com/bigassdataobkect.Rdata"))

or use the new download function in https://github.com/quanteda/quanteda.corpora.
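For the first option, here is a rough sketch of what a download-once-then-load helper could look like. The function name `load_cached_rdata` and its arguments are made up for illustration and are not part of sophistication or quanteda.corpora; it uses only base R and `utils::download.file()`, and assumes the file is a standard `.Rdata` saved with `save()`.

```r
# Hypothetical helper (not part of any package): download an .Rdata file
# once, keep a local copy, and return the objects it contains as a list.
load_cached_rdata <- function(data_url, cache_dir = tempdir()) {
  dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)
  dest <- file.path(cache_dir, basename(data_url))

  # Only hit the server if we do not already have a cached copy.
  if (!file.exists(dest)) {
    # mode = "wb" keeps the binary file intact on Windows.
    utils::download.file(data_url, destfile = dest, mode = "wb", quiet = TRUE)
  }

  # Load into a fresh environment so nothing in the caller's
  # workspace is silently overwritten.
  env <- new.env()
  load(dest, envir = env)
  as.list(env)
}
```

A replication Rmd could then call something like `objs <- load_cached_rdata("http://wherethedatais.server.com/bigassdataobkect.Rdata")` and pull the objects it needs out of `objs`, so the half-gig files never have to live inside a package.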