pbeshai / tidy

Tidy up your data with JavaScript, inspired by dplyr and the tidyverse
https://pbeshai.github.io/tidy
MIT License

Set column index for mutations #68

Open mhkeller opened 1 year ago

mhkeller commented 1 year ago

Does anyone else run into the problem where you want to add a new column via mutate but don't want the new key added at the very end? A common example: my first column is a set of state FIPS codes, and I add the full state name via a lookup in a mutate call. I'd like the full state name to be the second column so my spreadsheet is easier to read. I could do a select / pick call, but I'd have to write out all of my columns, which is a bit verbose.

Perhaps mutate could be supplied an index and it inserts the key at that index? Open to other workarounds people have found for this...

pbeshai commented 1 year ago

You can use selectors with select to make this less painful, e.g.:

tidy(input,
  mutate({ foo: 'foo' }),
  select(['foo', everything()])
);

This puts foo as the first key. Does that work for you?

mhkeller commented 1 year ago

That's interesting, but I don't think it covers my use case. I wanted to insert foo as the second column. This may sound like an absurdly nitpicky requirement, but it was the most logical ordering for the sheet. I was also converting something from d3-nest to tidyjs, and I wanted the outputs to match exactly so I could easily confirm the conversion was successful.

mhkeller commented 1 year ago

Instead of adding this as an option to mutate, maybe there could be another tidy function called reorder or move that lets you specify a column name and its new index, like:

reorder('foo', 1)
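
For illustration, a helper like this can be sketched in plain JavaScript, since a step in a tidy pipeline is a function that takes the array of items and returns a new array. The name `reorder` and its `(key, index)` signature are hypothetical, taken from the proposal above; this is not part of tidy's API, only a minimal sketch.

```javascript
// Hypothetical helper (not part of tidy): move `key` to position `index`
// in every row, keeping the remaining keys in their original order.
function reorder(key, index) {
  return (items) =>
    items.map((item) => {
      // Nothing to move if this row lacks the key.
      if (!(key in item)) return item;
      // Every key except the one being moved, in original order.
      const keys = Object.keys(item).filter((k) => k !== key);
      // Insert the moved key at the requested position.
      keys.splice(index, 0, key);
      // Rebuild the row so its keys follow the new order.
      return Object.fromEntries(keys.map((k) => [k, item[k]]));
    });
}

const rows = [{ fips: '06', population: 100, name: 'California' }];
const reordered = reorder('name', 1)(rows);
console.log(Object.keys(reordered[0])); // [ 'fips', 'name', 'population' ]
```

In a pipeline it should slot in after the mutate step, e.g. `tidy(input, mutate({ foo: 'foo' }), reorder('foo', 1))`. One caveat: JavaScript always lists integer-like keys (such as `'1'`) before other string keys, so a helper like this can only control the order of non-numeric column names.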

pbeshai commented 1 year ago

I see! You can use the function API to accomplish this with select. e.g.:


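// Selector: returns the first n keys of the first item
function firstNKeys(n) {
  return items => Object.keys(items[0]).slice(0, n)
}

const output = tidy(input,
  mutate({ foo: 'foo' }),
  // first key, then foo, then all remaining keys
  select([firstNKeys(1), 'foo', everything()])
);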

Or


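// Selector: returns the full key order, with foo in second position
function myCustomKeys(items) {
  // drop foo from its current position so it is not listed twice
  const keys = Object.keys(items[0]).filter(key => key !== 'foo')
  return [keys[0], 'foo', ...keys.slice(1)]
}

const output = tidy(input,
  mutate({ foo: 'foo' }),
  select([myCustomKeys])
);

mhkeller commented 1 year ago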

Thanks for these examples. In your first example, would the first key be duplicated when you call everything() or does it somehow know not to include it?

I think I would still find it easier and more readable to have it in some kind of tidy function but I understand if this is out of scope for what you want to do.

pbeshai commented 1 year ago

Once a key has been selected, it stays in that spot, so everything() won't duplicate it. We could add firstKeys and lastKeys as selectors, but they'd operate identically to the firstNKeys example I put above, so for now I'd recommend just making that a utility function in your code base. If there's sufficient interest from others we can add it, but it seems a bit niche to me, at least right now.
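
The ordering rule described here can be approximated in a few lines. This sketch is not tidy's actual implementation; `resolveKeys` and the local `everything` stand-in are hypothetical names used only for illustration. It expands each selector in order and drops later duplicates, so a key stays where it was first selected.

```javascript
// Stand-in for a selector that returns every key of the first row.
const everything = () => (items) => Object.keys(items[0]);

// Selector from the earlier comment: the first n keys of the first row.
const firstNKeys = (n) => (items) => Object.keys(items[0]).slice(0, n);

// Hypothetical resolver: expand selectors in order, keep first occurrence only.
function resolveKeys(items, selectors) {
  const expanded = selectors.flatMap((s) =>
    typeof s === 'function' ? s(items) : [s]
  );
  // A Set preserves insertion order, so the first position of each key wins.
  return [...new Set(expanded)];
}

const data = [{ fips: '06', population: 100, foo: 'foo' }];
console.log(resolveKeys(data, [firstNKeys(1), 'foo', everything()]));
// [ 'fips', 'foo', 'population' ]
```

Here `everything()` yields `fips`, `population`, and `foo` again, but `fips` and `foo` were already placed, so only `population` is appended.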

mhkeller commented 1 year ago

Sounds good