gforge closed this issue 8 years ago
Not a bad idea. But I wonder how many other non-label-preserving functions are out there. Is it good to spend time on only one of many?
Frank
On 12/22/2015 06:52 AM, Max Gordon wrote:
A minor suggestion is to add label retention with stats::relevel using S3. I often use it before my regressions (e.g. using the most frequent level as reference) while I set my original levels before the descriptive tables. Here's an example:
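library(Hmisc)  # provides label() and the "labelled" class

set.seed(1)
test_factor <- factor(sample(LETTERS[1:3], replace = TRUE, size = 20))
label(test_factor) <- "My test"

# Without a method for "labelled", relevel() returns a factor without the label
test_factor <- relevel(test_factor, ref = "B")
label(test_factor) == "My test"

# S3 method that saves the label, relevels, and then restores the label
relevel.labelled <- function(x, ...) {
  lbl <- label(x)
  x <- NextMethod()
  label(x) <- lbl
  return(x)
}

label(test_factor) <- "My test"
test_factor <- relevel(test_factor, ref = "B")
label(test_factor) == "My test"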
In addition, I sometimes use factor() to drop unused levels after subsetting. A similar function could perhaps be factor.labelled. I guess this wish list can get rather extensive, but these two would at least cover 90% of my label() use.
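One caveat with the factor.labelled idea: factor() is an ordinary function in base R rather than an S3 generic, so a method with that name would not be dispatched automatically. droplevels() is generic, though, and covers the "drop unused levels after subsetting" case. The following is only a sketch under stated assumptions: set_label(), get_label() and droplevels.labelled() are hypothetical names, and the two helpers merely mimic the way Hmisc is assumed to store labels (a "label" attribute plus the "labelled" class) so that the example runs in base R.

```r
# Stand-ins for Hmisc's label() and label<-(): a "label" attribute plus the
# "labelled" class. These helpers are hypothetical and keep the sketch base-R only.
set_label <- function(x, value) {
  attr(x, "label") <- value
  class(x) <- c("labelled", setdiff(class(x), "labelled"))
  x
}
get_label <- function(x) {
  lbl <- attr(x, "label", exact = TRUE)
  if (is.null(lbl)) "" else lbl
}

# Hypothetical method, not part of the thread: droplevels() is an S3 generic,
# so this is dispatched for objects whose first class is "labelled".
droplevels.labelled <- function(x, ...) {
  lbl <- get_label(x)
  x <- NextMethod()   # falls through to droplevels.factor()
  set_label(x, lbl)
}

# A labelled factor with an unused level "C", as might remain after subsetting
f <- set_label(factor(c("A", "B", "A"), levels = c("A", "B", "C")), "My test")
res <- droplevels(f)

levels(res)
get_label(res)
```

With Hmisc loaded, the two helpers would presumably be replaced by label() and label<-(), giving the same shape as the relevel.labelled method above.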
Frank E Harrell Jr Professor and Chairman School of Medicine
Department of *Biostatistics* *Vanderbilt University*
I don't think finding every possible function is worth the time. I would rather add functions as requested and limit the effort to the most popular packages. It does make sense, though, to look through the common functions in base & stats for cases where this would be useful. A quick look through the base & stats index suggests perhaps adding cut, gsub, iconv, sub & reorder.
I'm not sure about that strategy because those functions are not label-preserving (and even more obviously are not units-preserving).
P.S. The rms package automatically uses the most frequent level as the reference cell when fitting regression models.
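For readers working outside rms, the same choice of reference level can be made by hand with base R. This is a minimal sketch of the idea only; it does not reproduce rms's internal logic, which the thread does not show, and the variable names are made up for illustration.

```r
# Illustrative data; "B" is the most frequent level
f <- factor(c("A", "B", "B", "C", "B", "A"))

# table() counts each level; which.max() picks the first level with the
# highest count, so ties go to the level that comes first in levels(f)
most_frequent <- names(which.max(table(f)))

# Move that level to the front so it becomes the reference in model formulas
f2 <- relevel(f, ref = most_frequent)
levels(f2)
```

Note that relevel() called this way still drops any Hmisc label unless a label-preserving method such as relevel.labelled is defined.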
I agree that they are borderline necessary and may introduce misinterpretations. In my mind, relevel, reorder and factor are the functions that I would expect to be label-preserving. They are also the ones that one would use after generating table 1, so losing the label there forces the user to relabel actively.
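Until such methods exist, the "active relabeling" can be wrapped in a small helper. The sketch below is an assumption-laden illustration, not Hmisc code: keep_label() is a hypothetical name, and it relies on the label living in a "label" attribute alongside the "labelled" class, which is how Hmisc is assumed to store it. The labelled factor is built by hand so the example runs without Hmisc.

```r
# Hypothetical helper, not part of Hmisc: apply `fun` to `x`, then restore the
# "label" attribute and "labelled" class (the way Hmisc is assumed to store labels).
keep_label <- function(x, fun, ...) {
  lbl <- attr(x, "label", exact = TRUE)
  out <- fun(x, ...)
  if (!is.null(lbl)) {
    attr(out, "label") <- lbl
    class(out) <- c("labelled", setdiff(class(out), "labelled"))
  }
  out
}

# Build a labelled factor by hand so the sketch runs without Hmisc
x <- factor(c("A", "B", "C", "B"))
attr(x, "label") <- "My test"
class(x) <- c("labelled", class(x))

# relevel() alone would drop the label; the wrapper puts it back
releveled <- keep_label(x, relevel, ref = "B")
levels(releveled)
attr(releveled, "label")
```

The same wrapper could be tried with reorder or factor, though a dedicated S3 method is less intrusive where the underlying function is generic.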
The iconv function shouldn't alter the meaning of the variable, but I wouldn't expect it to be label-preserving. One would also most likely use it during the munging step, i.e. before labeling. I guess this is something that those of us outside the English-speaking countries struggle with, and you have probably not found it that useful ;-)
Didn't know that rms did that automatically. It makes sense, and I guess if I want a different comparison I could always rely on the contrast function. Since the label function is part of Hmisc and I (and probably many others) use it in non-rms contexts, it seems reasonable to provide this functionality.
I added relevel.labelled which will be in the next release to CRAN. Thanks Max.