larmarange / labelled

Manipulating labelled vectors in R
https://larmarange.github.io/labelled/
GNU General Public License v3.0

Variable labels with packed column / label attribute on data frame #142

Closed krlmlr closed 1 year ago

krlmlr commented 1 year ago

In the dm package, we want to attach labels to table objects (https://github.com/cynkra/dm/pull/1888). Another use case is packed columns -- SPSS might not have them, still it's conceivable that users might want to attach labels to their packed columns.

The var_label() method for data frames iterates over columns by default, and returns the column labels. Would you consider offering a mode of operation where var_label() sets or returns the data frame's label?

larmarange commented 1 year ago

I'm not sure I have properly understood what you are looking for. Do you have a concrete example?

The labelled package follows the structure introduced by the haven package, where value labels and variable labels are attached to a vector and, by extension, to a data frame column.

The Hmisc and foreign packages store the same information in a different way.

In such cases, the to_labelled() function can be used to convert the information to the format covered by labelled.

krlmlr commented 1 year ago

Thanks. I'd like to also attach labels to whole tables. Does the following example help?

library(labelled)
library(tibble)

my_data <- tibble(a = 1, b = 2)
var_label(my_data$a) <- "Column A"
var_label(my_data$b) <- "Column B"
# Is there a nicer syntax for this?
attr(my_data, "label") <- "My data"

# How can I retrieve the label for my_data?
var_label(my_data)
#> $a
#> [1] "Column A"
#> 
#> $b
#> [1] "Column B"

other_data_with_packed <- tibble(packed = my_data)
# Can the packed column have a label as well?
var_label(other_data_with_packed)
#> $packed
#> $packed$a
#> [1] "Column A"
#> 
#> $packed$b
#> [1] "Column B"

Created on 2023-06-17 with reprex v2.0.2

larmarange commented 1 year ago

Hi. I have not yet explored the issue of packed columns, and it raises several questions.

The philosophy of var_label() is to attach a label to a variable, i.e. to a column of a data.frame. Therefore, how should packed columns be considered: as one column or as a bunch of columns?

When extracting the list of labels of a data frame, what should be the default behaviour: providing the labels of the packed column or the list of labels of each sub-column?

In all cases, I can add label_attribute(), set_label_attribute() and get_label_attribute() functions that would work only at the level of the provided object.

For val_label(), an option could be proposed to be recursive or not for packed columns.

Adding a label to a group of packed columns should be easy to implement through set_variable_labels(), because there we are working at the level of the parent data.frame. However, it does not seem possible with var_label(), as we have no way to distinguish a group of packed columns from a regular data.frame. But label_attribute() could be used instead.

Could you take some time to run some tests and explore the question a little more?
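
To make the "one column or a bunch of columns" question concrete, here is a minimal sketch in base R. The helper collect_labels() and its recurse argument are hypothetical and are not part of labelled; the sketch only reads the "label" attribute directly and assumes a packed column is simply a data frame stored as a column.

```r
# Hypothetical helper (not part of labelled): collect the "label" attribute
# of each column, optionally descending into packed (data frame) columns.
collect_labels <- function(x, recurse = FALSE) {
  lapply(x, function(col) {
    if (recurse && is.data.frame(col)) {
      # Treat the packed column as a bunch of columns.
      collect_labels(col, recurse = TRUE)
    } else {
      # Treat the column, packed or not, as a single object.
      attr(col, "label", exact = TRUE)
    }
  })
}

inner <- data.frame(a = 1, b = 2)
attr(inner$a, "label") <- "Column A"
attr(inner$b, "label") <- "Column B"
attr(inner, "label") <- "My data"

# A data frame stored as a column plays the role of a packed column.
outer <- data.frame(id = 1)
outer$packed <- inner

collect_labels(outer)                  # label of the packed column itself
collect_labels(outer, recurse = TRUE)  # labels of its sub-columns
```

Without recursion, the packed column contributes its own label ("My data"); with recursion, it contributes a nested list of its sub-column labels. Which of the two should be the default is exactly the open design question.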

krlmlr commented 1 year ago

Thanks. New top-level functions would be an option. What would be the difference between label_attribute() and get_label_attribute()?

I'm not too interested in val_label(); did you mean var_label()? It sounds like a "recurse" option would work here for both setting and getting the labels?

larmarange commented 1 year ago

Sorry, I meant var_label().

get_label_attribute() would be a synonym of label_attribute(). set_label_attribute() would be equivalent to label_attribute()<-, but would return the modified object rather than changing it in place.
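
The relationship between the three functions can be sketched in base R. The my_ prefixed names below are hypothetical stand-ins written only for illustration; they assume the label lives in the "label" attribute of the object and are not the package's actual implementation.

```r
# Hypothetical stand-ins, prefixed with my_ to avoid confusion with labelled.

# Getter: read the label of the object itself, not of its columns.
my_label_attribute <- function(x) attr(x, "label", exact = TRUE)

# Synonym of the getter.
my_get_label_attribute <- my_label_attribute

# Replacement form: used as my_label_attribute(x) <- value.
`my_label_attribute<-` <- function(x, value) {
  attr(x, "label") <- value
  x
}

# Functional setter: returns a labelled copy, which suits pipelines.
my_set_label_attribute <- function(x, value) {
  attr(x, "label") <- value
  x
}

df <- data.frame(a = 1, b = 2)

labelled_copy <- my_set_label_attribute(df, "My data")
my_get_label_attribute(labelled_copy)  # label of the copy
is.null(my_label_attribute(df))        # the original df is still unlabelled

my_label_attribute(df) <- "My data"    # replacement form rebinds df
```

The two setters have identical bodies; the difference is only in how they are called. The replacement form rebinds the variable it is applied to, while the functional form leaves its input untouched and hands back a new object.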

krlmlr commented 1 year ago

Thanks, this sounds like a plan.

larmarange commented 1 year ago

You may want to have a look at #143.

I included a dedicated vignette: https://github.com/larmarange/labelled/pull/143/files#diff-f6d6e8f6a27b790f3ab3a98c80620dd7bec1b0694cfa991dd2f989c29a1a1257

krlmlr commented 1 year ago

Thank you for your help!

krlmlr commented 1 year ago

Are you planning a CRAN release anytime soon?

larmarange commented 1 year ago

If you need it, I can plan a CRAN release

larmarange commented 1 year ago

It's on CRAN now