Open jangorecki opened 4 years ago
I collected some feedback about this task from our internal discussion.
Initially I will focus only on reading csv, not binary formats.
For real-world data, NYT will be a good first case. We should probably find one more popular dataset, so that we have two real-world datasets.
For simulated data:
Relevant issue: https://github.com/Rdatatable/data.table/issues/2634
A benchmark for reading data is on the roadmap. It should cover:
ideas for testing particular features (maybe advanced questions?)
Feedback welcome.
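
To make the discussion more concrete, here is a rough sketch of what timing a csv read on simulated data could look like. It is only an illustration, not the planned benchmark design: the column layout (`id`, `value`, `label`), the row count, and the helper names `write_simulated_csv`, `time_csv_read` and `run_benchmark` are all assumptions made up for this example, and it uses Python's standard-library `csv` reader as a stand-in for whichever readers end up being compared.

```python
import csv
import os
import random
import tempfile
import time


def write_simulated_csv(path, n_rows, seed=0):
    """Write a CSV with an integer, a float, and a string column (hypothetical layout)."""
    rng = random.Random(seed)  # fixed seed so the simulated file is reproducible
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "value", "label"])
        for i in range(n_rows):
            writer.writerow([i, rng.random(), "g%03d" % rng.randrange(100)])


def time_csv_read(path):
    """Return (number of data rows, elapsed seconds) for one full read of the file."""
    start = time.perf_counter()
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        n_rows = sum(1 for _ in reader)
    return n_rows, time.perf_counter() - start


def run_benchmark(n_rows=10_000):
    """Generate a temporary simulated file, time one read, and clean up."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        write_simulated_csv(path, n_rows)
        return time_csv_read(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    rows, seconds = run_benchmark()
    print(f"read {rows} rows in {seconds:.4f}s")
```

A real benchmark would need much larger files, repeated runs, and some control over disk caching; this sketch measures a single read only.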