dplyr and the tidyverse focus on small, in-memory datasets. This is the right place to start because you can’t tackle big data unless you have experience with small data. The tools you learn in this book will easily handle hundreds of megabytes of data, and with a little care you can typically use them to work with 1-2 GB of data. If you’re routinely working with larger data (10-100 GB, say), you should learn more about data.table. This book doesn’t teach data.table because its very concise interface offers fewer linguistic cues, which makes it harder to learn. But if you’re working with large data, the performance payoff is worth the extra effort required to learn it.
@lucashertzog
I strongly endorse this recommendation.
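To make the "concise interface, fewer linguistic cues" point concrete, here is a minimal sketch of the same query written both ways. The data frame and its column names (`group`, `value`) are invented for illustration and come from neither reading; it assumes both `dplyr` and `data.table` are installed.

```r
library(dplyr)
library(data.table)

# Toy data, made up for this example
df <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  value = c(1, 2, 3, 4, 5)
)

# dplyr: every step is a named verb, so the pipeline reads like a sentence
dplyr_result <- df %>%
  filter(value > 1) %>%
  group_by(group) %>%
  summarise(mean_value = mean(value))

# data.table: the same query packed into a single DT[i, j, by] call
# i = row filter, j = what to compute, by = grouping column
dt <- as.data.table(df)
dt_result <- dt[value > 1, .(mean_value = mean(value)), by = group]

print(dplyr_result)
print(dt_result)
```

Both versions return one row per group with the mean of the filtered values. The data.table form is shorter, but you have to know what each position inside the brackets means, which is the learning cost the quoted passage describes. This toy example says nothing about speed; any performance difference only shows up on much larger data.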
Readings: https://r4ds.had.co.nz/introduction.html?q=data.table#big-data
https://github.com/matloff/TidyverseSkeptic
Everything in the Skeptic is spot on. Agree agree agree