fstpackage / fsttable

An interface to fast on-disk data tables stored with the fst format
GNU Affero General Public License v3.0

Stream csv file directly to a fsttable object #7

Open MarcusKlik opened 6 years ago

MarcusKlik commented 6 years ago

We will need a csv_to_fst method in the fst package (planned). Using that method under the hood, we can create a fsttable object directly from a csv file:

ft <- fst_table_from_csv("somebigfile.csv", columns = c("A", "B", "C"))
print(ft[, p_sum(A)])

This would calculate the sum of column A in a csv file that is possibly too large to read with existing methods. With this snippet, the csv file would first be streamed (in blocks) to a fst file, and a fsttable reference to that file would then be returned.

As a side effect, a new file somebigfile.fst would be created in the process.
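
To make the "streamed in blocks" idea concrete, here is a minimal base R sketch of a block-wise column sum. It is not the planned csv_to_fst or fst_table_from_csv implementation (neither exists yet), and it does not write a fst file. The helper name chunked_csv_sum and its block_rows parameter are made up for this illustration. It also assumes a simple csv layout with a header line and no quoted fields that span multiple lines.

```r
# Hypothetical helper (not part of fst or fsttable): sum one column of a
# csv file by reading it in fixed-size blocks of rows, so the whole file
# never has to be in memory at once.
chunked_csv_sum <- function(path, column, block_rows = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))

  # The first line holds the column names; keep it to parse every block.
  header <- readLines(con, n = 1L)
  total <- 0

  repeat {
    lines <- readLines(con, n = block_rows)
    if (length(lines) == 0L) break

    # Parse only this block, prepending the header so columns are named.
    block <- read.csv(text = c(header, lines), stringsAsFactors = FALSE)
    total <- total + sum(block[[column]])
  }

  total
}

# Usage with a small temporary file standing in for "somebigfile.csv".
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(A = 1:10, B = letters[1:10]), tmp, row.names = FALSE)
result <- chunked_csv_sum(tmp, "A", block_rows = 3L)
print(result)
```

In the proposal above, each parsed block would instead be written out to the fst file, and the aggregation would then run against the resulting fsttable.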