Open jerepow opened 6 years ago
@jerepow Add "SC" at the beginning while the column is a "character", not once it is a "factor". Once you declare a column as a factor, R learns the list of unique categories in that column (called "levels") and then becomes stubborn when you try to change what it has learnt the levels to be. So declare it as a character first, do the replacement, and then declare it as a factor.
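To make the character-first advice concrete, here is a minimal base R sketch. The vector `codes` is a made-up stand-in for `train.house$MSSubClass`, not the real data, and the variable names are only for illustration.

```r
# Toy stand-in for train.house$MSSubClass (illustrative values only)
codes <- c(20, 60, 20, 120)

# Factor first: a value outside the learnt levels becomes NA
f_bad <- factor(codes)
f_bad[1] <- "SC20"   # warns about an invalid factor level and stores NA
is.na(f_bad[1])      # TRUE

# Character first: add the prefix, then declare the factor
chr      <- as.character(codes)
prefixed <- paste0("SC", chr)
f_good   <- factor(prefixed)
levels(f_good)       # the distinct prefixed codes
```

The same three steps should carry over to the data frame column by assigning each result back to `train.house$MSSubClass`.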
Thanks Varun. Still having issues unfortunately.
train.house$MSSubclass1 <- as.character(train.house$MSSubClass)
train.house$MSSubClass = NULL
replace(train.house$MSSubClass1,
        c("20", "30", "40", "45", "50", "60", "70", "75", "80", "85", "90", "120", "150", "160", "180", "190"),
        c("1New", "1Old", "1Attic", "1.5Unfin", "1.5Fin", "2New", "2Old", "2.5", "Split", "Split Foyer", "Duplex", "1UnitNew", "1.5UnitNew", "2UnitNew", "UnitMulti", "2FamConv"))
Can you see any issues I'm missing?
Interestingly, it seems to be matching them up, but maybe there's a quirk in R that the output is going somewhere else?
Let's try the following:
train.house = train.house %>% mutate(MSSubClass = as.character(MSSubClass))
train.house = train.house %>% mutate(MSSubClass = paste0("SC", MSSubClass))
train.house = train.house %>% mutate(MSSubClass = as.factor(MSSubClass))
Typing from my phone, so commands might need a little fidgeting.
Just eyeballing the commands you used already earlier, it looks like you are not actually assigning the replacement to any variable. So it just ends up printing the replacement. Try this:
train.house$MSSubClass1 = replace(train.house$MSSubClass1,
        c("20", "30", "40", "45", "50", "60", "70", "75", "80", "85", "90", "120", "150", "160", "180", "190"),
        c("1New", "1Old", "1Attic", "1.5Unfin", "1.5Fin", "2New", "2Old", "2.5", "Split", "Split Foyer", "Duplex", "1UnitNew", "1.5UnitNew", "2UnitNew", "UnitMulti", "2FamConv"))
I basically just added an assignment at the beginning, telling R where the result of the replacement must be stored. Does that make sense?
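One possible reason the `replace()` call can still misbehave after the assignment is added: base R's `replace(x, list, values)` is just `x[list] <- values`, so `list` is treated as positions or names, not as values to search for. A common alternative is a named lookup vector. The sketch below uses a made-up vector `x` and only three of the codes; `lookup`, `relabelled` and `appended` are illustrative names, not something from the thread.

```r
# Toy stand-in for the character column (illustrative values only)
x <- c("20", "60", "190", "20")

# replace() indexes by name here; x has no names, so new elements get appended
appended <- replace(x, c("20", "60"), c("1New", "2New"))
length(appended)   # 6, not 4: nothing was swapped in place

# Named lookup vector: names are the old codes, values are the new labels
lookup <- c("20" = "1New", "60" = "2New", "190" = "2FamConv")

# Indexing the lookup by x translates every element by its value
relabelled <- unname(lookup[x])
relabelled         # "1New" "2New" "2FamConv" "1New"
```

Any code missing from `lookup` would come back as `NA`, so the lookup needs an entry for every code that occurs in the column.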
Well done doing that on your phone but still having issues:
and trying the mutate method is returning symbol errors:
For the second image, you are missing a closing bracket on the last two commands: in the one classifying as character, and also in the paste0 one.
Trying to replace the integers 20, 30, etc. with factors and then have an SC at the beginning.
Fixing incorrectly classified data types and renaming data points from integer codes to make more sense:
train.house$MSSubclass <- as.factor(train.house$MSSubClass)
replace(train.house$MSSubClass,
        c(20, 30, 40, 45, 50, 60, 70, 75, 80, 85, 90, 120, 150, 160, 180, 190),
        c("SC20", "SC30", "SC40", "SC45", "SC50", "SC60", "SC70", "SC75", "SC80", "SC85", "SC90", "SC120", "SC150", "SC160", "SC180", "SC190"))
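Three things in this attempt look likely to cause trouble. The first line writes to `MSSubclass` (lowercase "c"), which creates a new column and leaves `MSSubClass` untouched. The result of `replace()` is never assigned. And the numeric values 20, 30, ... are read as row positions rather than as codes to match. If the column is already a factor, one hedged alternative is to rename its levels directly. The sketch below uses a made-up vector `codes` in place of the real column.

```r
# Toy stand-in for train.house$MSSubClass (illustrative values only)
codes <- c(20, 30, 20, 190)

# Declare the factor, then rename every level in one step
f <- factor(codes)
levels(f) <- paste0("SC", levels(f))

as.character(f)   # "SC20" "SC30" "SC20" "SC190"
```

Because `levels<-` rewrites the labels rather than the underlying values, no element turns into `NA` the way it can with element-wise assignment into a factor.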