HumanitiesDataAnalysis / hathidy

Download and manipulate HathiTrust wordcount data in the tidyverse
MIT License

Cache with parquet #2

Closed: bmschmidt closed this issue 3 years ago

bmschmidt commented 4 years ago

Parquet caches can be faster and more type-efficient than csv.gz ones. See this issue on how to sort them most efficiently.

bmschmidt commented 3 years ago

Now caching with feather instead, because unlike parquet it offers a simple native way to store the JSON metadata.