JuliaData / TableOperations.jl

Common table operations on Tables.jl interface implementations

Add rename/rename! of columns #16

Closed juliohm closed 7 months ago

juliohm commented 3 years ago

Would a function like DataFrames.rename be welcome in Tables.jl? If so, I can try submitting a PR.

quinnj commented 3 years ago

I don't think this makes sense for Tables.jl; the Tables.jl interfaces are very much centered around the producer-consumer workflow; i.e. making generic input tables consumable, allowing generic sinks to work on any input. We haven't supported any kind of mutation at all since there are a lot of generic "table" types that couldn't support it.

For this kind of data processing task, we've been putting functionality in TableOperations.jl; there we already have a generic TableOperations.map that could be used like newtbl = oldtbl |> TableOperations.map(x -> (newcol=x.oldcol,)), but that's not ideal if you have lots of columns, since every column you want to keep has to be listed by hand.
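To make the map-based workaround concrete, here is a minimal sketch. The sample table and its column names (oldcol, other) are made up for illustration; the function passed to TableOperations.map has to rebuild every column it wants to keep, which is the part that gets tedious for wide tables.

```julia
using Tables, TableOperations

# Hypothetical sample table: a NamedTuple of vectors is a valid Tables.jl column table.
oldtbl = (oldcol = [1, 2, 3], other = ["a", "b", "c"])

# Rename by rebuilding each row. Any column left out of the returned
# NamedTuple is dropped, so `other` has to be carried over explicitly.
newtbl = oldtbl |>
    TableOperations.map(row -> (newcol = row.oldcol, other = row.other)) |>
    Tables.columntable  # materialize the lazy row iterator into columns
```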

So we could potentially have a dedicated oldtbl |> TableOperations.rename(:oldcol => :newcol) |> DataFrame that lazily did the column renaming. We could probably just match the API that DataFrames provides, taking either pairs of oldcol => newcol or a whole new set of names. This should be a not-too-hard exercise for whoever wants to take a stab at it. In the meantime, I'm moving this issue to TableOperations.jl where we can discuss further.
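A rough sketch of how a lazy rename could be built on the Tables.jl column interface is below. The names RenamedColumns and lazyrename are hypothetical and not part of TableOperations.jl; the sketch only handles Symbol pairs and skips the curried form needed for piping.

```julia
using Tables

# Hypothetical wrapper: holds the source columns plus a name mapping, copies no data.
struct RenamedColumns{T}
    cols::T                       # result of Tables.columns(source)
    names::Vector{Symbol}         # column names after renaming, in source order
    lookup::Dict{Symbol,Symbol}   # new name => original name
end

# Hypothetical constructor taking old => new pairs, similar in spirit to DataFrames.rename.
function lazyrename(tbl, renames::Pair{Symbol,Symbol}...)
    cols = Tables.columns(tbl)
    old2new = Dict{Symbol,Symbol}(renames...)
    names = Symbol[get(old2new, n, n) for n in Tables.columnnames(cols)]
    lookup = Dict{Symbol,Symbol}(new => old for (old, new) in renames)
    return RenamedColumns(cols, names, lookup)
end

# Tables.jl column-access interface for the wrapper.
Tables.istable(::Type{<:RenamedColumns}) = true
Tables.columnaccess(::Type{<:RenamedColumns}) = true
Tables.columns(r::RenamedColumns) = r
Tables.columnnames(r::RenamedColumns) = r.names

# Translate the requested name back to the source name; names that were
# not renamed pass through unchanged.
Tables.getcolumn(r::RenamedColumns, nm::Symbol) =
    Tables.getcolumn(r.cols, get(r.lookup, nm, nm))
Tables.getcolumn(r::RenamedColumns, i::Int) = Tables.getcolumn(r.cols, i)

# Usage with a hypothetical sample table.
oldtbl = (oldcol = [1, 2, 3], other = ["a", "b", "c"])
renamed = lazyrename(oldtbl, :oldcol => :newcol)
materialized = Tables.columntable(renamed)
```

Because the wrapper only forwards column lookups, any sink that consumes Tables.jl columns should see the new names without the source being copied or mutated.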

juliohm commented 3 years ago

Awesome. I will give it a try. The plan is to introduce a new table with the columns renamed. I can also try to make it lazy as suggested.

juliohm commented 7 months ago

We now have TableTransforms.Rename, which works for Tables.jl and is actively maintained.
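For reference, a minimal usage sketch, assuming the pair-based Rename constructor from the TableTransforms.jl docs and a made-up sample table:

```julia
using Tables, TableTransforms

# Hypothetical sample table; any Tables.jl-compatible source should work.
oldtbl = (oldcol = [1, 2, 3], other = ["a", "b", "c"])

# Transforms are applied by piping a table into them.
newtbl = oldtbl |> Rename(:oldcol => :newcol)

# Inspect the resulting column names through the Tables.jl interface.
newnames = collect(Tables.columnnames(Tables.columns(newtbl)))
```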