MaxGhenis closed this issue 9 years ago
I'm not convinced this is a good idea, and if implemented it would need to come with a plethora of warnings.
This doesn't seem like a good idea to me either. If you need to, you can always use column numbers instead of names.
Unless there's a strong, well-reasoned case for this, we should close it.
My use case involves looping through lots of raw datasets, and applying a single column name mapping to clean each in a consistent way. Something like
l <- list(...) # List of raw data.tables, which don't all have the same set of columns
raw.names <- c(...) # Names showing up in raw data
clean.names <- c(...) # Clean names
for (dt in l) setnames(dt, raw.names, clean.names)
Column numbers wouldn't work for me. I agree a warning for each skipped column makes sense.
You just have to add 1 more line...
for (dt in l) {
  # Position of each column of dt in raw.names; 0L (dropped on indexing) when absent
  ix = match(names(dt), raw.names, 0L)
  setnames(dt, raw.names[ix], clean.names[ix])
}
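To see why the extra line works, here is a minimal sketch on one toy table. The column names, the mapping, and the data are made up for illustration and do not come from the original question.

```r
library(data.table)

# Hypothetical mapping and table, for illustration only
raw.names   <- c("Name", "AGE", "zip_cd")
clean.names <- c("name", "age", "zip")
dt <- data.table(AGE = c(30L, 41L), id = 1:2, Name = c("a", "b"))

# For each column of dt, its position in raw.names, or 0L if it is not in the mapping
ix <- match(names(dt), raw.names, 0L)

# Indexing with 0L selects nothing, so unmapped columns ("id") and
# mapping entries missing from dt ("zip_cd") both drop out here
setnames(dt, raw.names[ix], clean.names[ix])

names(dt)
```

Because `match()` is driven by `names(dt)`, only names that actually exist in the table are ever passed to `setnames()`, so it never sees a missing column.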
Thanks, that's cleaner than my intersect / %in% approach above. Listed as an answer on the SO question. I'm fine closing if this need is unusual.
Max, glad that helped. :+1:
This is now implemented directly as setnames(..., skip_absent=TRUE).
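As a rough sketch of how the original use case might look with this argument, assuming a data.table version that includes the change; the tables and the mapping below are invented for illustration:

```r
library(data.table)

# Hypothetical raw tables that do not share the same set of columns
l <- list(
  data.table(Name = "a", AGE = 30L),
  data.table(AGE = 41L, zip_cd = "94110")
)

# Hypothetical mapping from raw names to clean names
raw.names   <- c("Name", "AGE", "zip_cd")
clean.names <- c("name", "age", "zip")

# setnames() works by reference; entries of raw.names that a given
# table lacks are skipped instead of raising an error
for (dt in l) setnames(dt, raw.names, clean.names, skip_absent = TRUE)

lapply(l, names)
```

With `skip_absent` left at its default of `FALSE`, the same loop would stop with an error on the first table that lacks one of the raw names.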
Closed in https://github.com/Rdatatable/data.table/pull/3111; the other issue was #3030.
An allow.absent.cols option for setnames would facilitate cases where the user wants to apply a column mapping without necessarily knowing whether all columns will exist. I'm currently using the following workaround, from my SO self-answer: