Closed krlmlr closed 1 year ago
I'm not sure I have properly understood what you are looking for. Would you have a concrete example?
The labelled package follows the structure introduced by the haven package, where value labels and variable labels are attached to a vector and, by extension, to a data frame column.
The Hmisc and foreign packages use another way to store the same information.
The to_labelled() function can be used in such cases to convert the information to the format covered by labelled.
Thanks. I'd like to also attach labels to whole tables. Does the following example help?
library(labelled)
library(tibble)
my_data <- tibble(a = 1, b = 2)
var_label(my_data$a) <- "Column A"
var_label(my_data$b) <- "Column B"
# Is there a nicer syntax for this?
attr(my_data, "label") <- "My data"
# How can I retrieve the label for my_data?
var_label(my_data)
#> $a
#> [1] "Column A"
#>
#> $b
#> [1] "Column B"
other_data_with_packed <- tibble(packed = my_data)
# Can the packed column have a label as well?
var_label(other_data_with_packed)
#> $packed
#> $packed$a
#> [1] "Column A"
#>
#> $packed$b
#> [1] "Column B"
Created on 2023-06-17 with reprex v2.0.2
Hi. I have not yet explored the issue of packed columns, and it raises several questions.
The philosophy of var_label() is to attach a label to a variable, i.e. to a column of a data.frame. Therefore, how should packed columns be considered: as one column or as a bunch of columns?
When extracting the list of labels of a data frame, what should be the default behaviour: providing the labels of the packed column or the list of labels of each sub-column?
In all cases, I can add label_attribute(), set_label_attribute() and get_label_attribute() functions that would work only at the level of the provided object.
For val_label(), an option could be proposed to be recursive or not for packed columns.
For adding a label to a group of packed columns, it should be easy to implement through set_variable_labels() because we are working at the level of the parent data.frame. However, it does not seem possible with var_label(), as we have no way to distinguish a group of packed columns from a regular data.frame. But label_attribute() could be used instead.
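To make the "one column or a bunch of columns" question concrete, here is a minimal base R sketch of a label collector with a recursion switch. `collect_labels()` and its `recurse` argument are hypothetical names used only for illustration; they are not part of labelled, and the sketch only assumes that labels live in the `"label"` attribute, as in the example earlier in this thread.

```r
# Hypothetical helper (not part of labelled): collect the "label" attribute
# of each column, optionally descending into packed (data frame) columns.
collect_labels <- function(x, recurse = FALSE) {
  lapply(x, function(col) {
    if (recurse && is.data.frame(col)) {
      # Packed column: return the labels of its sub-columns.
      collect_labels(col, recurse = TRUE)
    } else {
      # Regular column, or packed column treated as a single unit.
      attr(col, "label", exact = TRUE)
    }
  })
}

# A packed group with labels on its sub-columns and on the group itself.
inner_df <- data.frame(a = 1, b = 2)
attr(inner_df$a, "label") <- "Column A"
attr(inner_df$b, "label") <- "Column B"
attr(inner_df, "label") <- "Packed group"

# Parent data frame holding the packed column.
parent_df <- data.frame(id = 1)
parent_df$packed <- inner_df

flat <- collect_labels(parent_df, recurse = FALSE)
deep <- collect_labels(parent_df, recurse = TRUE)
str(flat)
str(deep)
```

With `recurse = FALSE` the packed column contributes its own label; with `recurse = TRUE` it contributes a nested list of sub-column labels, which matches the nested output of var_label() shown earlier.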
Could you take some time to run some tests and explore the question a little more?
Thanks. New top-level functions would be an option. What would be the difference between label_attribute() and get_label_attribute()?
I'm not too interested in val_label(), did you mean var_label()? It sounds like a "recurse" option would work here for both setting and getting the labels?
Sorry, I meant var_label().
get_label_attribute() would be a synonym of label_attribute().
set_label_attribute() would be equivalent to label_attribute()<- but returning the object rather than changing it.
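The intended relationship between the three functions can be sketched in base R. The `sketch_` prefix marks these as hypothetical stand-ins built on `attr()`, not the package's actual implementation; they only illustrate the getter, the replacement form, and the setter that returns a modified copy.

```r
# Hypothetical stand-ins (not labelled's implementation).

# Getter: read the label of the object itself, not of its columns.
sketch_label_attribute <- function(x) attr(x, "label", exact = TRUE)

# Synonym of the getter.
sketch_get_label_attribute <- sketch_label_attribute

# Replacement form, used as: sketch_label_attribute(x) <- value
`sketch_label_attribute<-` <- function(x, value) {
  attr(x, "label") <- value
  x
}

# Setter that returns a modified copy, which makes it usable in pipelines.
sketch_set_label_attribute <- function(x, value) {
  attr(x, "label") <- value
  x
}

my_data <- data.frame(a = 1, b = 2)
sketch_label_attribute(my_data) <- "My data"  # rebinds my_data

plain <- data.frame(a = 1)
copy <- sketch_set_label_attribute(plain, "Copy")  # plain is left untouched

sketch_get_label_attribute(my_data)
sketch_label_attribute(plain)
sketch_label_attribute(copy)
```

Because the label is stored on the object that is passed in, the same calls apply to a whole table, a packed column, or a plain vector.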
Thanks, this sounds like a plan.
You may have a look at #143
I included a dedicated vignette: https://github.com/larmarange/labelled/pull/143/files#diff-f6d6e8f6a27b790f3ab3a98c80620dd7bec1b0694cfa991dd2f989c29a1a1257
Thank you for your help!
Are you planning a CRAN release anytime soon?
If you need it, I can plan a CRAN release.
It's on CRAN now
In the dm package, we want to attach labels to table objects (https://github.com/cynkra/dm/pull/1888). Another use case is packed columns -- SPSS might not have them, still it's conceivable that users might want to attach labels to their packed columns.
The var_label() method for data frames iterates over columns by default, and returns the column labels. Would you consider offering a mode of operation where var_label() sets or returns the data frame's label?