DeclareDesign / fabricatr

fabricatr: Imagine Your Data Before You Collect It
https://declaredesign.org/r/fabricatr

fab_ID_1 #168

Closed graemeblair closed 5 years ago

graemeblair commented 5 years ago

Times when fabricate unexpectedly creates a variable fab_ID_1:

library(fabricatr)

my_data <- data.frame(ID = 1:5, some_var = rnorm(5))

# has fab_ID_1
fabricate(data = my_data, add_var = 5)

# ID is stored as a factor here when stringsAsFactors defaults to TRUE (R < 4.0.0)
my_data <- data.frame(ID = as.character(1:5), some_var = rnorm(5))

# has fab_ID_1
fabricate(data = my_data, add_var = 5)

my_data <- data.frame(ID = as.character(1:5), some_var = rnorm(5), stringsAsFactors = FALSE)

# *does not* have fab_ID_1
fabricate(data = my_data, add_var = 5)

test_df_does <- data.frame(ID = "5", S_inclusion_prob = 0.2, stringsAsFactors = FALSE)

# has fab_ID_1
fabricate(data = test_df_does, my_var = 5)

We need to rewrite that part of the code so it's a bit smarter.

nfultz commented 5 years ago

IIRC, you can tell fabricate that your data frame already has an ID using the ID_label argument.
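A minimal sketch of that suggestion, assuming ID_label accepts the name of the existing column as a string. Whether this suppresses fab_ID_1 in every case listed above is not confirmed here, so check the returned column names rather than taking it on faith:

```r
library(fabricatr)

my_data <- data.frame(ID = 1:5, some_var = rnorm(5))

# Name the existing identifier column explicitly instead of relying on the default
out <- fabricate(data = my_data, ID_label = "ID", add_var = 5)

# Inspect which columns came back
names(out)
```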

Right now, the default code path when ID_label is not provided uses identical() to check whether to drop a duplicated ID column. A factor is not identical to a character vector, so the extra column stays.
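The factor/character mismatch can be reproduced in base R. This is only an illustration of how identical() treats the two types, not fabricatr's actual code; the variable names are made up for the example:

```r
# Same labels, stored two different ways
ids_chr <- as.character(1:5)
ids_fct <- factor(ids_chr)

# identical() compares type and attributes, not just the printed values
identical(ids_fct, ids_chr)                # FALSE

# Converting the factor back to character makes the comparison succeed
identical(as.character(ids_fct), ids_chr)  # TRUE
```

A "smarter" check would presumably normalize both sides (for example with as.character()) before comparing.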