Great work on the new (and more logically named) extension for reading in Excel files!
As a recovering R coder, I've used the R package janitor, which offers similar functions to sheetreader for reading in "cleaned up" Excel files. For example, its clean_names function will take the header row from an Excel worksheet, and it:
- Parses letter cases and separators to a consistent format.
  The default is snake_case, but other cases like camelCase are available.
- Handles special characters and spaces, including transliterating characters like œ to oe.
- Appends numbers to duplicated names.
- Converts "%" to "percent" and "#" to "number" to retain meaning.
- Preserves spacing (or lack thereof) around numbers.
My specific request for sheetreader-duckdb is a clean_names equivalent in sheetreader that converts horrible Excel header names into nicely formatted DuckDB column names.
The ability to clean column names on import is particularly useful; my experience has been that it is really clunky to rename a large number of DuckDB columns post-import.
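For context, this is roughly what the post-import workaround looks like today: one `ALTER TABLE ... RENAME COLUMN` statement per column. The sketch below only builds the SQL strings; `quote_ident` and `rename_statements` are hypothetical helper names of mine, and the table and column names are made up.

```python
def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def rename_statements(table, mapping):
    """Build one ALTER TABLE ... RENAME COLUMN statement per changed column."""
    return [
        f"ALTER TABLE {quote_ident(table)} "
        f"RENAME COLUMN {quote_ident(old)} TO {quote_ident(new)};"
        for old, new in mapping.items()
        if old != new  # skip columns whose name is already clean
    ]


# Illustrative mapping from original Excel headers to cleaned names.
mapping = {"% Complete": "percent_complete", "First Name": "first_name", "id": "id"}
for statement in rename_statements("sales", mapping):
    print(statement)
```

With a wide worksheet this means dozens of statements to generate and run after every import, which is the clunkiness I'd like to avoid.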