BoulderCodeHub / Process-CRSS-Res

Repository for code used to process CRSS Results starting in April 2015

speed up reading and writing csv/txt files #31

Closed rabutler closed 7 years ago

rabutler commented 8 years ago

RiverSMART can now write csv files, and RWDataPlot also writes out large txt files. Reading them in with utils::read.csv() can be slow.

For a 1.625 GB csv file with 20,338,560 rows, the read times on DirtyDevil are:

- utils::read.csv(): 117.56 seconds
- data.table::fread(): 16.2 seconds
- readr::read_csv(): 36.96 seconds

We should switch to one of the two faster readers, or to readr::read_tsv() for the txt files, to speed up reading.
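A rough way to reproduce this comparison on a small file is sketched below. It assumes the data.table and readr packages are installed. The column names (`Trace`, `Variable`, `Value`) and the row count are made up for illustration, so the absolute timings will not match the numbers above.

```r
# Build a small synthetic csv. The columns are hypothetical stand-ins,
# not the actual RiverSMART/RWDataPlot layout.
set.seed(1)
n <- 1e5
df <- data.frame(
  Trace = rep(1:100, each = n / 100),
  Variable = sample(c("slotA", "slotB"), n, replace = TRUE),
  Value = runif(n)
)
tmp <- tempfile(fileext = ".csv")
utils::write.csv(df, tmp, row.names = FALSE)

# Time each reader on the same file.
t_base  <- system.time(x_base  <- utils::read.csv(tmp))
t_fread <- system.time(x_fread <- data.table::fread(tmp))
t_readr <- system.time(
  x_readr <- suppressMessages(readr::read_csv(tmp, progress = FALSE))
)

# Elapsed wall-clock seconds for each reader.
timings <- c(
  read.csv = t_base[["elapsed"]],
  fread    = t_fread[["elapsed"]],
  read_csv = t_readr[["elapsed"]]
)
print(timings)
```

On a file this small the gap between readers will be much narrower than on the 1.625 GB file, so any real decision should be based on timings against an actual results file.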

rabutler commented 8 years ago

If we use data.table::fread(), make sure that getting a data.table back instead of a plain data.frame does not change anything downstream. I don't think it should.
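One way to sidestep the question is sketched below, assuming data.table is installed. `fread()` has a `data.table` argument, and setting it to `FALSE` returns a plain data.frame. The tiny two-column file is a hypothetical example that shows one place where the two classes behave differently.

```r
library(data.table)

# Tiny hypothetical file, just to compare the two return types.
tmp <- tempfile(fileext = ".csv")
utils::write.csv(
  data.frame(a = 1:3, b = c("x", "y", "z")),
  tmp,
  row.names = FALSE
)

dt <- fread(tmp)                      # data.table (also inherits from data.frame)
df <- fread(tmp, data.table = FALSE)  # plain data.frame

print(class(dt))
print(class(df))

# Single-column subsetting is one place the two differ:
col_df <- df[, "a"]  # data.frame drops to a vector
col_dt <- dt[, "a"]  # data.table keeps a one-column data.table
```

If downstream code relies on data.frame semantics like the vector drop above, reading with `data.table = FALSE` keeps the speed of `fread()` without changing behavior.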

rabutler commented 8 years ago

See http://blog.h2o.ai/2016/04/fast-csv-writing-for-r/ for speeding up the write step.
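A minimal sketch of the write-side comparison follows, assuming the fast writer in question is `data.table::fwrite()` and that data.table is installed. The data frame is synthetic and its column names are hypothetical.

```r
library(data.table)

# Synthetic data; not the actual CRSS results.
set.seed(1)
n <- 1e5
df <- data.frame(id = seq_len(n), value = runif(n))

f_base <- tempfile(fileext = ".csv")
f_fast <- tempfile(fileext = ".csv")

# Time the base writer against fwrite on the same data.
t_base <- system.time(utils::write.csv(df, f_base, row.names = FALSE))
t_fast <- system.time(data.table::fwrite(df, f_fast))

print(c(write.csv = t_base[["elapsed"]], fwrite = t_fast[["elapsed"]]))

# Sanity check: read both files back to confirm they hold the same data.
back_base <- fread(f_base)
back_fast <- fread(f_fast)
```

The two writers can differ in details such as quoting, so it is worth checking that whatever reads these files afterward accepts the `fwrite()` output.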