leeper / csvy

Import and Export CSV Data With a YAML Metadata Header

Address column mismatches #1

Closed leeper closed 8 years ago

leeper commented 8 years ago

Copied from https://github.com/leeper/rio/issues/110 (@billdenney):

I generate my .csvy files with Perl, and the order of the fields in the YAML header may not exactly match the order of the columns in the file.

It would be helpful if fields were matched to columns by comparing each column name with the field's 'name' value, rather than simply by position.

I think this would just be a change to the following code in .import.rio_csvy (lines 122-124 of import_method.R, currently):

for (i in seq_along(y$fields)) {
    attributes(out[, i]) <- y$fields[[i]]
}

becomes

already.matched <- rep(FALSE, ncol(out))
for (i in seq_along(y$fields)) {
  field.name <- y$fields[[i]]$name
  # Positions of the columns whose name equals this field's 'name' value
  idx.match <- which(names(out) %in% field.name)
  if (length(idx.match) == 0) {
    warning("Field name ", field.name, " is not found in the input file; please check your YAML header.")
  } else if (length(idx.match) > 1) {
    warning("Field name ", field.name, " is found more than once in the input file; please check your .csv header.")
  } else if (already.matched[idx.match]) {
    warning("Column ", idx.match, " already has a field name match; please check your YAML header.")
  } else {
    # Exactly one unmatched column: attach the metadata and record the match
    attributes(out[, idx.match]) <- y$fields[[i]]
    already.matched[idx.match] <- TRUE
  }
}
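
To illustrate the name-based matching proposed above, here is a minimal self-contained sketch on toy data. The helper `match_fields_by_name`, the toy data frame, and the `title` metadata key are all hypothetical and are not taken from csvy or rio. The sketch copies each metadata entry with `attr()` rather than replacing all attributes at once, which is a simplification and not necessarily how the package handles it.

```r
# Hypothetical helper (not part of csvy or rio): attach field metadata
# to data frame columns by name instead of by position.
match_fields_by_name <- function(out, fields) {
  for (field in fields) {
    idx <- which(names(out) == field$name)
    if (length(idx) != 1) {
      warning("Field name ", field$name, " matched ", length(idx),
              " columns; skipping.")
      next
    }
    # Copy every metadata entry except 'name' onto the matched column
    for (key in setdiff(names(field), "name")) {
      attr(out[[idx]], key) <- field[[key]]
    }
  }
  out
}

# Toy data: the header lists the fields in the reverse of the column order
out <- data.frame(a = 1:3, b = c("x", "y", "z"), stringsAsFactors = FALSE)
fields <- list(
  list(name = "b", title = "Label"),
  list(name = "a", title = "Count")
)

out <- match_fields_by_name(out, fields)
attr(out$a, "title")  # "Count"
attr(out$b, "title")  # "Label"
```

Because the lookup is by name, each column receives its own metadata even though the header order is reversed. A field whose name matches no column, or more than one, is skipped with a warning.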

leeper commented 8 years ago

@billdenney: I've just sent an update to GitHub for this. Can you try again using csvy directly? I'll push this into rio momentarily.

billdenney commented 8 years ago

This one is fixed!

But, the new version has some limitations on column naming. I'll open a new issue for that.

leeper commented 8 years ago

Great!