bcgov / rems

An R package to access data from British Columbia's Environmental Monitoring System
Apache License 2.0

Speed up creation of db #32

Closed: ateucher closed this issue 5 years ago

ateucher commented 5 years ago

Add an autoincrementing Primary Key.

Possibly try something like:

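library(DBI)

# Assumption for illustration: an in-memory SQLite database via RSQLite
conn <- dbConnect(RSQLite::SQLite(), ":memory:")

# In SQLite, an INTEGER PRIMARY KEY column is filled in automatically on insert
dbExecute(conn,
"CREATE TABLE foo(
  id INTEGER PRIMARY KEY,
  a REAL,
  b TEXT);")

# The data frame omits id, so SQLite assigns it for each appended row
dbAppendTable(conn, "foo", data.frame(a = rnorm(5), b = letters[1:5]), row.names = NULL)

dbGetQuery(conn, "SELECT * FROM foo;")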

Also, dbWriteTable has a mechanism to write from a file - worth investigating
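
For reference, a minimal sketch of what the file-based route looks like with RSQLite, where `dbWriteTable` accepts a file path as `value`. The in-memory connection, the temporary CSV, and the table name `foo_from_file` are all made up for illustration, and the file is deliberately simple (no quoted fields, no timestamps), so this says nothing about whether it would cope with the real EMS data:

```r
library(DBI)

# Hypothetical in-memory database and throwaway CSV, for illustration only
conn <- dbConnect(RSQLite::SQLite(), ":memory:")
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(a = c(1.5, 2.5, 3.5), b = c("x", "y", "z")),
  csv_path,
  row.names = FALSE,
  quote = FALSE  # keep the file unquoted so the import sees plain fields
)

# Passing a file path (character) as `value` asks RSQLite to import the file
# itself rather than a data frame already loaded into R
dbWriteTable(conn, "foo_from_file", csv_path, header = TRUE, sep = ",")

# Count the imported rows to confirm the table was populated
n_rows <- dbGetQuery(conn, "SELECT COUNT(*) AS n FROM foo_from_file;")$n

dbDisconnect(conn)
```

The appeal is that the data never has to pass through an R data frame, which is where most of the time goes when building a large database.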

ateucher commented 5 years ago

Regarding importing directly from file... see here for why it won't work (at least for now)

ateucher commented 5 years ago

Could try depending on arkdb:

unark("ems.csv", db_con = conn, streamable_readr_csv(), tablenames = "test_arkdb", 
      col_types = col_spec(), locale = readr::locale(tz = ems_tz()))

ateucher commented 5 years ago

Another option is MonetDB-Lite, which should be a drop-in replacement for SQLite, should be faster, and should allow importing directly from CSV.

ateucher commented 5 years ago

Another option here: https://github.com/mountainMath/statcanXtabs/blob/901f6b0bba28ff19effc16b086833342782d8e0b/R/xtabs.R#L11-L29.

ateucher commented 5 years ago

Possibly use the vroom package to replace readr.

ateucher commented 5 years ago

Made mostly unnecessary by #37