Open asfimport opened 4 years ago
Antoine Pitrou / @pitrou:
I'm not sure I understand #1, can you explain a bit more?
As for #2, by giving ConvertOptions::include_columns you can already restrict which columns you want to convert.
Neal Richardson / @nealrichardson: I didn't know about include_columns, thanks.
Here are two use cases for being able to get the column names without reading the whole table:
starts_with("something")). In order to pass those to ConvertOptions::include_columns, I need to get the column names from the CSV so that I can translate those.
Antoine Pitrou / @pitrou: cc @westonpace
Weston Pace / @westonpace: It would probably be column_names and not schema. The table reader can do late inference so it may not know the final schema until the final table is read. But column_names should be pretty straightforward to add.
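The selector use case above can be sketched with the standard library alone. This is an illustration under stated assumptions, not Arrow's implementation or API: `ReadHeaderNames` and `SelectStartsWith` are hypothetical helpers, the header is assumed to be the first line of the file, and quoted or escaped fields are not handled.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of Arrow): read only the first line of a CSV
// stream and split it on the delimiter. Simplification: quoted fields and a
// trailing empty field are not handled.
std::vector<std::string> ReadHeaderNames(std::istream& in, char delim = ',') {
  std::vector<std::string> names;
  std::string line;
  if (!std::getline(in, line)) return names;  // empty input: no columns
  if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
  std::stringstream fields(line);
  std::string field;
  while (std::getline(fields, field, delim)) names.push_back(field);
  return names;
}

// Hypothetical analogue of a starts_with("prefix") selector: keep the names
// that begin with `prefix`, preserving file order.
std::vector<std::string> SelectStartsWith(const std::vector<std::string>& names,
                                          const std::string& prefix) {
  std::vector<std::string> out;
  for (const auto& name : names) {
    if (name.compare(0, prefix.size(), prefix) == 0) out.push_back(name);
  }
  return out;
}
```

Only the header line is consumed, so no data rows are parsed. The vector returned by `SelectStartsWith` has the same type as ConvertOptions::include_columns (a `std::vector<std::string>`), so it could be assigned to that option before the table is read.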
Todd Farmer / @toddfarmer: This issue was last updated over 90 days ago, which may be an indication it is no longer being actively worked. To better reflect the current state, the issue is being unassigned. Please feel free to re-take assignment of the issue if it is being actively worked, or if you plan to start that work soon.
Some feature requests:
1. Add a column_names method, and/or schema method. This will (in most cases) require IO to get these from the file, but that's fine. There are use cases (we've seen in R) where it would help to be able to get the names from the file (e.g. when you specify column types, it's a map of column name to type, so you can't currently specify types without also specifying names).
2. Add Read(std::vector) like how feather (and parquet?) have so that you don't have to parse and allocate columns you don't want.
cc @pitrou @romainfrancois
Reporter: Neal Richardson / @nealrichardson
Note: This issue was originally created as ARROW-10219. Please see the migration documentation for further details.