fstpackage / fst

Lightning Fast Serialization of Data Frames for R
http://www.fstpackage.org/fst/
GNU Affero General Public License v3.0

check column availability before reading #281

Open adviksh opened 5 months ago

I've really enjoyed using fst; the read/write speed and compression are wonderful. However, I sometimes run into trouble when I try to read a column that doesn't exist in a dataset. For example:

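library(fst)

# Write a data frame that has a single column, 'a'
test_df <- data.frame(a = 1)
write_fst(test_df, 'test.fst')

# Request column 'b', which is not in the file
read_fst('test.fst', columns = 'b')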

produces the error:

 *** caught bus error ***
address 0x0, cause 'invalid alignment'

Traceback:
 1: fstretrieve(file_name, columns, from, to)
 2: read_fst("test.fst", columns = "b")

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace

I'm running fst version 0.9.8 and fstcore version 0.9.18.

I'm proposing a fix that checks the requested columns against those found by metadata_fst(), and throws an error if the user requests a column not present in the file.
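
To make the proposal concrete, here is a rough sketch of what such a check could look like at the R level. The names check_columns() and safe_read_fst() are hypothetical and not part of fst; the sketch also assumes that the object returned by metadata_fst() exposes the file's column names in its columnNames element. The actual fix might fit better inside read_fst() itself or in the C++ layer.

```r
# Hypothetical helper (not part of fst): stop with a regular R error if any
# requested column name is missing from the set of available names.
check_columns <- function(requested, available) {
  missing_cols <- setdiff(requested, available)
  if (length(missing_cols) > 0) {
    stop("column(s) not found in file: ",
         paste(missing_cols, collapse = ", "),
         call. = FALSE)
  }
  invisible(requested)
}

# Hypothetical wrapper (not part of fst): validate the requested columns
# against the file's metadata before handing off to read_fst().
# Assumes metadata_fst(path)$columnNames holds the column names in the file.
safe_read_fst <- function(path, columns = NULL, ...) {
  if (!is.null(columns)) {
    check_columns(columns, fst::metadata_fst(path)$columnNames)
  }
  fst::read_fst(path, columns = columns, ...)
}
```

With something like this in place, safe_read_fst('test.fst', columns = 'b') should fail with an ordinary R error that names the missing column, rather than taking down the R session.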

Apologies if this is already in the works! I did a quick scan of open issues but didn't see something like this.