JuliaData / DataFrames.jl

In-memory tabular data in Julia
https://dataframes.juliadata.org/stable/

Feature request: for metadata! to accept a dictionary as input #3469

Open alex-s-gardner opened 1 month ago

alex-s-gardner commented 1 month ago

Copying metadata from one data frame (df1) to another (df2) currently requires:

md = DataFrames.metadata(df1)
for k in keys(md)
    DataFrames.metadata!(df2, k, md[k])
end

It would be convenient if metadata! could directly accept the Dict returned by metadata, so that the above becomes:

DataFrames.metadata!(df2, DataFrames.metadata(df1))

I suspect this would also be handy for setting multiple metadata fields at once from a single Dict.
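For illustration, the requested behaviour can be sketched with two small helpers. Both `set_metadata!` and `copy_metadata!` are hypothetical names, not part of DataFrames.jl; they only wrap the existing `metadata`, `metadata!` and `metadatakeys` functions. One detail worth noting: `metadata(df1)` returns values only, so a copy made from that Dict falls back to the `:default` style, whereas reading each key with `style=true` lets the copy keep the original style.

```julia
using DataFrames

# Hypothetical helper (not part of DataFrames.jl): set several table-level
# metadata entries at once from a dictionary of key => value pairs.
function set_metadata!(df::DataFrame, md::AbstractDict; style::Symbol=:default)
    for (key, value) in pairs(md)
        metadata!(df, key, value; style=style)
    end
    return df
end

# Hypothetical helper (not part of DataFrames.jl): copy table-level metadata
# from `src` to `dst`, keeping the style of each entry.
function copy_metadata!(dst::DataFrame, src::AbstractDataFrame)
    for key in metadatakeys(src)
        value, style = metadata(src, key; style=true)
        metadata!(dst, key, value; style=style)
    end
    return dst
end

df1 = DataFrame(a=1:3)
metadata!(df1, "source", "survey 2023"; style=:note)
metadata!(df1, "version", 2)  # stored with the :default style

# Style-preserving copy: "source" stays :note-style in df2.
df2 = DataFrame(b=4:6)
copy_metadata!(df2, df1)

# Dictionary-based copy: the values match, but every entry becomes :default-style.
df3 = DataFrame(c=7:9)
set_metadata!(df3, metadata(df1))
```

The dictionary form is the shorter one to call, but it drops the metadata style unless one is passed explicitly, which may be a point to settle if such a method were added.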

bkamins commented 1 month ago

Currently, https://github.com/JuliaData/TableMetadataTools.jl is intended to support such operations, because in DataAPI.jl we wanted to keep only the minimal API that is required. Would this extra package meet your needs?

alex-s-gardner commented 1 month ago

To me it seems that simplifying the copying of metadata between data frames would be best handled natively in DataFrames.jl, as this is the entry point for most people. I would think the code change would be minimal, as it would just be a simple DataFrames.metadata! method dispatching on a dictionary. But you'll certainly have a more informed view on this than I.

pdeffebach commented 1 month ago

DataFramesMeta.jl already has TableMetadataTools.jl as a dependency, fwiw. It's really just a matter of adding TableMetadataTools to your environment.

alex-s-gardner commented 1 month ago

I guess the distinction between what belongs in which package is always a bit grey. To me it seems that copying metadata between data frames is a rudimentary operation that should be supported alongside DataFrames.metadata in DataFrames.jl. The question is whether copying metadata is a core operation when using DataFrames. If it is, the user should not be expected to load a new package to do it; if not, the operation should be kept in DataFramesMeta.jl.