harrelfe / Hmisc

Harrell Miscellaneous

Fix NA handling cleanup.import with charfactor #50

Closed bbbruce closed 8 years ago

bbbruce commented 8 years ago

NAs appear to be handled incorrectly by cleanup.import when it is called with charfactor = TRUE: if the underlying character vector contains NAs, the new factor gets an NA level. This causes unexpected behavior in the resulting factors. Example:

dx <- data.frame(x = c("A", NA, NA, "B", "A", "A", "B"), stringsAsFactors = FALSE)
cx <- cleanup.import(dx, charfactor = TRUE)
levels(cx$x)
table(cx$x)  # NA counts should only be displayed when useNA is requested

The new behavior adds NA to the exclude set, so NAs remain missing values instead of becoming a factor level.
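The patch itself is not shown in this thread, so the following is only a minimal base R sketch of the effect described. It assumes the character-to-factor conversion goes through `factor()` with an `exclude` argument, and that the earlier behavior corresponded to excluding only the empty string; the variable names `f_old` and `f_new` are illustrative and not taken from Hmisc.

```r
# Same values as the example above
x <- c("A", NA, NA, "B", "A", "A", "B")

# Assumed earlier behavior: only "" is excluded, so NA is kept as a level
f_old <- factor(x, exclude = "")
levels(f_old)        # includes an NA level
sum(is.na(f_old))    # the NA entries now carry a level code, not a missing value

# Behavior described in this pull request: NA is excluded as well
f_new <- factor(x, exclude = c("", NA))
levels(f_new)        # only the observed non-missing values
sum(is.na(f_new))    # the two NA entries stay missing
table(f_new)                    # NAs not displayed by default
table(f_new, useNA = "ifany")   # NAs displayed on request
```

With NA excluded, `is.na()` and the `useNA` option of `table()` behave the way they do for an ordinary factor with missing values.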

harrelfe commented 8 years ago

Nice improvement. I made the same change for upData.

Frank


Frank E Harrell Jr Professor and Chairman School of Medicine

Department of Biostatistics Vanderbilt University

On Tue, Aug 2, 2016 at 3:28 PM, Beau Bruce notifications@github.com wrote:


You can view, comment on, or merge this pull request online at:

https://github.com/harrelfe/Hmisc/pull/50

Commit Summary

  • Fix NA handling cleanup.import with charfactor
  • Merge pull request #1 from bbbruce/bbbruce-patch-1
